2025-06-01 12:27:15 +00:00
|
|
|
# SmartProxy Socket Cleanup Fix Plan
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
## Problem Summary
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
The current socket cleanup implementation is too aggressive and closes long-lived connections prematurely. This affects:
|
|
|
|
- WebSocket connections in HTTPS passthrough
|
|
|
|
- Long-lived HTTP connections (SSE, streaming)
|
|
|
|
- Database connections
|
|
|
|
- Any connection that should remain open for hours
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
## Root Causes
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
### 1. **Bilateral Socket Cleanup**
|
|
|
|
When one socket closes, both sockets are immediately destroyed:
|
2025-05-28 23:33:02 +00:00
|
|
|
```typescript
|
2025-06-01 12:27:15 +00:00
|
|
|
// In createSocketCleanupHandler
|
|
|
|
cleanupSocket(clientSocket, 'client');
|
|
|
|
cleanupSocket(serverSocket, 'server'); // Both destroyed together!
|
2025-05-31 17:14:15 +00:00
|
|
|
```
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
### 2. **Aggressive Timeout Handling**
|
|
|
|
Timeout events immediately trigger connection cleanup:
|
2025-05-31 17:14:15 +00:00
|
|
|
```typescript
|
2025-06-01 12:27:15 +00:00
|
|
|
socket.on('timeout', () => {
|
|
|
|
handleClose(`${prefix}_timeout`); // Destroys both sockets!
|
|
|
|
});
|
2025-05-31 17:14:15 +00:00
|
|
|
```
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
### 3. **Parity Check Forces Closure**
|
|
|
|
If one socket closes but the other remains open for >2 minutes, connection is forcefully terminated:
|
2025-05-28 23:33:02 +00:00
|
|
|
```typescript
|
2025-06-01 12:27:15 +00:00
|
|
|
if (record.outgoingClosedTime &&
|
|
|
|
!record.incoming.destroyed &&
|
|
|
|
now - record.outgoingClosedTime > 120000) {
|
|
|
|
this.cleanupConnection(record, 'parity_check');
|
2025-05-28 23:33:02 +00:00
|
|
|
}
|
|
|
|
```
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
### 4. **No Half-Open Connection Support**
|
|
|
|
The proxy doesn't support TCP half-open connections where one side closes while the other continues sending.
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
## Fix Implementation Plan
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
### Phase 1: Fix Socket Cleanup (Prevent Premature Closure)
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
#### 1.1 Modify `cleanupSocket()` to support graceful shutdown
|
2025-05-28 23:33:02 +00:00
|
|
|
```typescript
|
2025-06-01 12:27:15 +00:00
|
|
|
export interface CleanupOptions {
|
|
|
|
immediate?: boolean; // Force immediate destruction
|
|
|
|
allowDrain?: boolean; // Allow write buffer to drain
|
|
|
|
gracePeriod?: number; // Ms to wait before force close
|
2025-05-31 17:14:15 +00:00
|
|
|
}
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
export function cleanupSocket(
|
|
|
|
socket: Socket | TLSSocket | null,
|
|
|
|
socketName?: string,
|
|
|
|
options: CleanupOptions = {}
|
|
|
|
): Promise<void> {
|
|
|
|
if (!socket || socket.destroyed) return Promise.resolve();
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
return new Promise<void>((resolve) => {
|
|
|
|
const cleanup = () => {
|
|
|
|
socket.removeAllListeners();
|
|
|
|
if (!socket.destroyed) {
|
|
|
|
socket.destroy();
|
2025-05-28 23:33:02 +00:00
|
|
|
}
|
2025-06-01 12:27:15 +00:00
|
|
|
resolve();
|
|
|
|
};
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
if (options.immediate) {
|
|
|
|
cleanup();
|
|
|
|
} else if (options.allowDrain && socket.writable) {
|
|
|
|
// Allow pending writes to complete
|
|
|
|
socket.end(() => cleanup());
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
// Force cleanup after grace period
|
|
|
|
if (options.gracePeriod) {
|
|
|
|
setTimeout(cleanup, options.gracePeriod);
|
2025-05-28 23:33:02 +00:00
|
|
|
}
|
2025-06-01 12:27:15 +00:00
|
|
|
} else {
|
|
|
|
cleanup();
|
2025-05-31 17:14:15 +00:00
|
|
|
}
|
2025-06-01 12:27:15 +00:00
|
|
|
});
|
2025-05-28 23:33:02 +00:00
|
|
|
}
|
|
|
|
```
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
#### 1.2 Implement Independent Socket Tracking
|
2025-05-28 23:33:02 +00:00
|
|
|
```typescript
|
2025-06-01 12:27:15 +00:00
|
|
|
export function createIndependentSocketHandlers(
|
|
|
|
clientSocket: Socket,
|
|
|
|
serverSocket: Socket,
|
|
|
|
onBothClosed: (reason: string) => void
|
|
|
|
): { cleanupClient: () => void, cleanupServer: () => void } {
|
|
|
|
let clientClosed = false;
|
|
|
|
let serverClosed = false;
|
|
|
|
let clientReason = '';
|
|
|
|
let serverReason = '';
|
|
|
|
|
|
|
|
const checkBothClosed = () => {
|
|
|
|
if (clientClosed && serverClosed) {
|
|
|
|
onBothClosed(`client: ${clientReason}, server: ${serverReason}`);
|
2025-05-31 17:14:15 +00:00
|
|
|
}
|
2025-06-01 12:27:15 +00:00
|
|
|
};
|
|
|
|
|
|
|
|
const cleanupClient = async (reason: string) => {
|
|
|
|
if (clientClosed) return;
|
|
|
|
clientClosed = true;
|
|
|
|
clientReason = reason;
|
|
|
|
|
|
|
|
// Allow server to continue if still active
|
|
|
|
if (!serverClosed && serverSocket.writable) {
|
|
|
|
// Half-close: stop reading from client, let server finish
|
|
|
|
clientSocket.pause();
|
|
|
|
clientSocket.unpipe(serverSocket);
|
|
|
|
await cleanupSocket(clientSocket, 'client', { allowDrain: true });
|
|
|
|
} else {
|
|
|
|
await cleanupSocket(clientSocket, 'client');
|
2025-05-31 17:14:15 +00:00
|
|
|
}
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
checkBothClosed();
|
|
|
|
};
|
|
|
|
|
|
|
|
const cleanupServer = async (reason: string) => {
|
|
|
|
if (serverClosed) return;
|
|
|
|
serverClosed = true;
|
|
|
|
serverReason = reason;
|
|
|
|
|
|
|
|
// Allow client to continue if still active
|
|
|
|
if (!clientClosed && clientSocket.writable) {
|
|
|
|
// Half-close: stop reading from server, let client finish
|
|
|
|
serverSocket.pause();
|
|
|
|
serverSocket.unpipe(clientSocket);
|
|
|
|
await cleanupSocket(serverSocket, 'server', { allowDrain: true });
|
|
|
|
} else {
|
|
|
|
await cleanupSocket(serverSocket, 'server');
|
2025-05-31 17:14:15 +00:00
|
|
|
}
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
checkBothClosed();
|
2025-05-28 23:33:02 +00:00
|
|
|
};
|
2025-06-01 12:27:15 +00:00
|
|
|
|
|
|
|
return { cleanupClient, cleanupServer };
|
2025-05-28 23:33:02 +00:00
|
|
|
}
|
2025-05-31 17:14:15 +00:00
|
|
|
```
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
### Phase 2: Fix Timeout Handling
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
#### 2.1 Separate timeout handling from connection closure
|
2025-05-31 17:14:15 +00:00
|
|
|
```typescript
|
2025-06-01 12:27:15 +00:00
|
|
|
export function setupSocketHandlers(
|
|
|
|
socket: Socket | TLSSocket,
|
|
|
|
handleClose: (reason: string) => void,
|
|
|
|
handleTimeout?: (socket: Socket) => void, // New optional handler
|
|
|
|
errorPrefix?: string
|
|
|
|
): void {
|
|
|
|
socket.on('error', (error) => {
|
|
|
|
const prefix = errorPrefix || 'Socket';
|
|
|
|
handleClose(`${prefix}_error: ${error.message}`);
|
|
|
|
});
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
socket.on('close', () => {
|
|
|
|
const prefix = errorPrefix || 'socket';
|
|
|
|
handleClose(`${prefix}_closed`);
|
|
|
|
});
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
socket.on('timeout', () => {
|
|
|
|
if (handleTimeout) {
|
|
|
|
handleTimeout(socket); // Custom timeout handling
|
|
|
|
} else {
|
|
|
|
// Default: just log, don't close
|
|
|
|
console.warn(`Socket timeout: ${errorPrefix || 'socket'}`);
|
|
|
|
}
|
|
|
|
});
|
|
|
|
}
|
2025-05-31 17:14:15 +00:00
|
|
|
```
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
#### 2.2 Update HTTPS passthrough handler
|
2025-05-31 17:14:15 +00:00
|
|
|
```typescript
|
2025-06-01 12:27:15 +00:00
|
|
|
// In https-passthrough-handler.ts
|
|
|
|
const { cleanupClient, cleanupServer } = createIndependentSocketHandlers(
|
|
|
|
clientSocket,
|
|
|
|
serverSocket,
|
|
|
|
(reason) => {
|
|
|
|
this.emit(ForwardingHandlerEvents.DISCONNECTED, {
|
|
|
|
remoteAddress,
|
|
|
|
bytesSent,
|
|
|
|
bytesReceived,
|
|
|
|
reason
|
|
|
|
});
|
2025-05-28 23:33:02 +00:00
|
|
|
}
|
2025-06-01 12:27:15 +00:00
|
|
|
);
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
// Setup handlers with custom timeout handling
|
|
|
|
setupSocketHandlers(clientSocket, cleanupClient, (socket) => {
|
|
|
|
// Just reset timeout, don't close
|
|
|
|
socket.setTimeout(timeout);
|
|
|
|
}, 'client');
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
setupSocketHandlers(serverSocket, cleanupServer, (socket) => {
|
|
|
|
// Just reset timeout, don't close
|
|
|
|
socket.setTimeout(timeout);
|
|
|
|
}, 'server');
|
2025-05-28 23:33:02 +00:00
|
|
|
```
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
### Phase 3: Fix Connection Manager
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
#### 3.1 Remove aggressive parity check
|
2025-05-28 23:33:02 +00:00
|
|
|
```typescript
|
2025-06-01 12:27:15 +00:00
|
|
|
// Remove or significantly increase the parity check timeout
|
|
|
|
// From 2 minutes to 30 minutes for long-lived connections
|
|
|
|
if (record.outgoingClosedTime &&
|
|
|
|
!record.incoming.destroyed &&
|
|
|
|
!record.connectionClosed &&
|
|
|
|
now - record.outgoingClosedTime > 1800000) { // 30 minutes
|
|
|
|
// Only close if no data activity
|
|
|
|
if (now - record.lastActivity > 600000) { // 10 minutes of inactivity
|
|
|
|
this.cleanupConnection(record, 'parity_check');
|
2025-05-28 23:33:02 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
```
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
#### 3.2 Update cleanupConnection to check socket states
|
2025-05-28 23:33:02 +00:00
|
|
|
```typescript
|
2025-06-01 12:27:15 +00:00
|
|
|
public cleanupConnection(record: IConnectionRecord, reason: string = 'normal'): void {
|
|
|
|
if (!record.connectionClosed) {
|
|
|
|
record.connectionClosed = true;
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
// Only cleanup sockets that are actually closed or inactive
|
|
|
|
if (record.incoming && (!record.incoming.writable || record.incoming.destroyed)) {
|
|
|
|
cleanupSocket(record.incoming, `${record.id}-incoming`, { immediate: true });
|
2025-05-31 17:14:15 +00:00
|
|
|
}
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
if (record.outgoing && (!record.outgoing.writable || record.outgoing.destroyed)) {
|
|
|
|
cleanupSocket(record.outgoing, `${record.id}-outgoing`, { immediate: true });
|
2025-05-31 17:14:15 +00:00
|
|
|
}
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
// If either socket is still active, don't remove the record yet
|
|
|
|
if ((record.incoming && record.incoming.writable) ||
|
|
|
|
(record.outgoing && record.outgoing.writable)) {
|
|
|
|
record.connectionClosed = false; // Reset flag
|
|
|
|
return; // Don't finish cleanup
|
2025-05-28 23:33:02 +00:00
|
|
|
}
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
// Continue with full cleanup...
|
2025-05-28 23:33:02 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
```
|
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
### Phase 4: Testing and Validation
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
#### 4.1 Test Cases to Implement
|
|
|
|
1. WebSocket connection should stay open for >1 hour
|
|
|
|
2. HTTP streaming response should continue after request closes
|
|
|
|
3. Half-open connections should work correctly
|
|
|
|
4. Verify no socket leaks with long-running connections
|
|
|
|
5. Test graceful shutdown with pending data
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
#### 4.2 Socket Leak Prevention
|
|
|
|
- Ensure all event listeners are tracked and removed
|
|
|
|
- Use WeakMap for socket metadata to prevent memory leaks
|
|
|
|
- Implement connection count monitoring
|
|
|
|
- Add periodic health checks for orphaned sockets
|
|
|
|
|
|
|
|
## Implementation Order
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
1. **Day 1**: Implement graceful `cleanupSocket()` and independent socket handlers
|
|
|
|
2. **Day 2**: Update all handlers to use new cleanup mechanism
|
|
|
|
3. **Day 3**: Fix timeout handling to not close connections
|
|
|
|
4. **Day 4**: Update connection manager parity check and cleanup logic
|
|
|
|
5. **Day 5**: Comprehensive testing and leak detection
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
## Configuration Changes
|
2025-05-31 17:14:15 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
Add new options to SmartProxyOptions:
|
2025-05-31 17:14:15 +00:00
|
|
|
```typescript
|
2025-06-01 12:27:15 +00:00
|
|
|
interface ISmartProxyOptions {
|
|
|
|
// Existing options...
|
|
|
|
|
|
|
|
// New options for long-lived connections
|
|
|
|
socketCleanupGracePeriod?: number; // Default: 5000ms
|
|
|
|
allowHalfOpenConnections?: boolean; // Default: true
|
|
|
|
parityCheckTimeout?: number; // Default: 1800000ms (30 min)
|
|
|
|
timeoutBehavior?: 'close' | 'reset' | 'ignore'; // Default: 'reset'
|
|
|
|
}
|
2025-05-31 17:14:15 +00:00
|
|
|
```
|
|
|
|
|
|
|
|
## Success Metrics
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
1. WebSocket connections remain stable for 24+ hours
|
|
|
|
2. No premature connection closures reported
|
|
|
|
3. Memory usage remains stable (no socket leaks)
|
|
|
|
4. Half-open connections work correctly
|
|
|
|
5. Graceful shutdown completes within grace period
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
## Implementation Status: COMPLETED ✅
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
### Implemented Changes
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
1. **Modified `cleanupSocket()` in `socket-utils.ts`**
|
|
|
|
- Added `CleanupOptions` interface with `immediate`, `allowDrain`, and `gracePeriod` options
|
|
|
|
- Implemented graceful shutdown support with write buffer draining
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
2. **Created `createIndependentSocketHandlers()` in `socket-utils.ts`**
|
|
|
|
- Tracks socket states independently
|
|
|
|
- Supports half-open connections where one side can close while the other remains open
|
|
|
|
- Only triggers full cleanup when both sockets are closed
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
3. **Updated `setupSocketHandlers()` in `socket-utils.ts`**
|
|
|
|
- Added optional `handleTimeout` parameter to customize timeout behavior
|
|
|
|
- Prevents automatic connection closure on timeout events
|
2025-05-28 23:33:02 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
4. **Updated HTTPS Passthrough Handler**
|
|
|
|
- Now uses `createIndependentSocketHandlers` for half-open support
|
|
|
|
- Custom timeout handling that resets timer instead of closing connection
|
|
|
|
- Manual data forwarding with backpressure handling
|
2025-05-29 00:24:57 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
5. **Updated Connection Manager**
|
|
|
|
- Extended parity check from 2 minutes to 30 minutes
|
|
|
|
- Added activity check before closing (10 minutes of inactivity required)
|
|
|
|
- Modified cleanup to check socket states before destroying
|
2025-05-29 00:24:57 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
6. **Updated Basic Forwarding in Route Connection Handler**
|
|
|
|
- Replaced simple `pipe()` with independent socket handlers
|
|
|
|
- Added manual data forwarding with backpressure support
|
|
|
|
- Removed bilateral close handlers to prevent premature cleanup
|
2025-05-29 00:24:57 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
### Test Results
|
2025-05-29 00:24:57 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
All tests passing:
|
|
|
|
- ✅ Long-lived connection test: Connection stayed open for 61+ seconds with periodic keep-alive
|
|
|
|
- ✅ Half-open connection test: One side closed while the other continued to send data
|
|
|
|
- ✅ No socket leaks or premature closures
|
2025-05-29 00:24:57 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
### Notes
|
2025-05-29 00:24:57 +00:00
|
|
|
|
2025-06-01 12:27:15 +00:00
|
|
|
- The fix maintains backward compatibility
|
|
|
|
- No configuration changes required for existing deployments
|
|
|
|
- Long-lived connections now work correctly in both HTTPS passthrough and basic forwarding modes
|