Improve error handling and logging for outgoing connections in RouteConnectionHandler

This commit is contained in:
Philipp Kunz 2025-06-01 13:58:20 +00:00
parent 705a59413d
commit 2d240671ab
2 changed files with 192 additions and 357 deletions

View File

@ -1,165 +0,0 @@
# SmartProxy Socket Handling Fix Plan
Reread CLAUDE.md file for guidelines
## Implementation Summary (COMPLETED)
The critical socket handling issues have been fixed:
1. **Prevented Server Crashes**: Created `createSocketWithErrorHandler()` utility that attaches error handlers immediately upon socket creation, preventing unhandled ECONNREFUSED errors from crashing the server.
2. **Fixed Memory Leaks**: Updated forwarding handlers to properly clean up client sockets when server connections fail, ensuring connection records are removed from tracking.
3. **Key Changes Made**:
- Added `createSocketWithErrorHandler()` in `socket-utils.ts`
- Updated `https-passthrough-handler.ts` to use safe socket creation
- Updated `https-terminate-to-http-handler.ts` to use safe socket creation
- Ensured client sockets are destroyed when server connections fail
- Connection cleanup now triggered by socket close events
4. **Test Results**: Server no longer crashes on ECONNREFUSED errors, and connections are properly cleaned up.
## Problem Summary
The SmartProxy server is experiencing critical issues:
1. **Server crashes** due to unhandled socket connection errors (ECONNREFUSED)
2. **Memory leak** with steadily rising active connection count
3. **Race conditions** between socket creation and error handler attachment
4. **Orphaned sockets** when server connections fail
## Root Causes
### 1. Delayed Error Handler Attachment
- Sockets created without immediate error handlers
- Error events can fire before handlers attached
- Causes uncaught exceptions and server crashes
### 2. Incomplete Cleanup Logic
- Client sockets not cleaned up when server connection fails
- Connection counter only decrements after BOTH sockets close
- Failed server connections leave orphaned client sockets
### 3. Missing Global Error Handlers
- No process-level uncaughtException handler
- No process-level unhandledRejection handler
- Any unhandled error crashes entire server
## Implementation Plan
### Phase 1: Prevent Server Crashes (Critical)
#### 1.1 Add Global Error Handlers
- [x] ~~Add global error handlers in main entry point~~ (Removed per user request - no global handlers)
- [x] Log errors with context
- [x] ~~Implement graceful shutdown sequence~~ (Removed - handled locally)
#### 1.2 Fix Socket Creation Race Condition
- [x] Modify socket creation to attach error handlers immediately
- [x] Update all forwarding handlers (https-passthrough, http, etc.)
- [x] Ensure error handlers attached in same tick as socket creation
### Phase 2: Fix Memory Leaks (High Priority)
#### 2.1 Fix Connection Cleanup Logic
- [x] Clean up client socket immediately if server connection fails
- [x] Decrement connection counter on any socket failure (handled by socket close events)
- [x] Implement proper cleanup for half-open connections
#### 2.2 Improve Socket Utils
- [x] Create new utility function for safe socket creation with immediate error handling
- [x] Update createIndependentSocketHandlers to handle immediate failures
- [ ] Add connection tracking debug utilities
### Phase 3: Comprehensive Testing (Important)
#### 3.1 Create Test Cases
- [x] Test ECONNREFUSED scenario
- [ ] Test timeout handling
- [ ] Test half-open connections
- [ ] Test rapid connect/disconnect cycles
#### 3.2 Add Monitoring
- [ ] Add connection leak detection
- [ ] Add metrics for connection lifecycle
- [ ] Add debug logging for socket state transitions
## Detailed Implementation Steps
### Step 1: Global Error Handlers (ts/proxies/smart-proxy/smart-proxy.ts)
```typescript
// Add in constructor or start method
process.on('uncaughtException', (error) => {
logger.log('error', 'Uncaught exception', { error });
// Graceful shutdown
});
process.on('unhandledRejection', (reason, promise) => {
logger.log('error', 'Unhandled rejection', { reason, promise });
});
```
### Step 2: Safe Socket Creation Utility (ts/core/utils/socket-utils.ts)
```typescript
export function createSocketWithErrorHandler(
options: net.NetConnectOpts,
onError: (err: Error) => void
): net.Socket {
const socket = net.connect(options);
socket.on('error', onError);
return socket;
}
```
### Step 3: Fix HttpsPassthroughHandler (ts/forwarding/handlers/https-passthrough-handler.ts)
- Replace direct socket creation with safe creation
- Handle server connection failures immediately
- Clean up client socket on server connection failure
### Step 4: Fix Connection Counting
- Decrement on ANY socket close, not just when both close
- Track failed connections separately
- Add connection state tracking
### Step 5: Update All Handlers
- [ ] https-passthrough-handler.ts
- [ ] http-handler.ts
- [ ] https-terminate-to-http-handler.ts
- [ ] https-terminate-to-https-handler.ts
- [ ] route-connection-handler.ts
## Success Criteria
1. **No server crashes** on ECONNREFUSED or other socket errors
2. **Active connections** remain stable (no steady increase)
3. **All sockets** properly cleaned up on errors
4. **Memory usage** remains stable under load
5. **Graceful handling** of all error scenarios
## Testing Plan
1. Simulate ECONNREFUSED by targeting closed ports
2. Monitor active connection count over time
3. Stress test with rapid connections
4. Test with unreachable hosts
5. Test with slow/timing out connections
## Rollback Plan
If issues arise:
1. Revert socket creation changes
2. Keep global error handlers (they add safety)
3. Add more detailed logging for debugging
4. Implement fixes incrementally
## Timeline
- Phase 1: Immediate (prevents crashes)
- Phase 2: Within 24 hours (fixes leaks)
- Phase 3: Within 48 hours (ensures stability)
## Notes
- The race condition is the most critical issue
- Connection counting logic needs complete overhaul
- Consider using a connection state machine for clarity
- Add connection lifecycle events for debugging

View File

@ -1074,19 +1074,13 @@ export class RouteConnectionHandler {
}
// Create the target socket with immediate error handling
let targetSocket: plugins.net.Socket;
let connectionEstablished = false;
// Flag to track if initial connection failed
let connectionFailed = false;
targetSocket = createSocketWithErrorHandler({
const targetSocket = createSocketWithErrorHandler({
port: finalTargetPort,
host: finalTargetHost,
onError: (error) => {
// Mark connection as failed
connectionFailed = true;
// Connection failed - clean up immediately
// Connection failed - clean up everything immediately
logger.log('error',
`Connection setup error for ${connectionId} to ${finalTargetHost}:${finalTargetPort}: ${error.message} (${(error as any).code})`,
{
@ -1099,6 +1093,20 @@ export class RouteConnectionHandler {
}
);
// Log specific error types for easier debugging
if ((error as any).code === 'ECONNREFUSED') {
logger.log('error',
`Connection ${connectionId}: Target ${finalTargetHost}:${finalTargetPort} refused connection. Check if the target service is running and listening on that port.`,
{
connectionId,
targetHost: finalTargetHost,
targetPort: finalTargetPort,
recommendation: 'Check if the target service is running and listening on that port.',
component: 'route-handler'
}
);
}
// Resume the incoming socket to prevent it from hanging
socket.resume();
@ -1107,120 +1115,12 @@ export class RouteConnectionHandler {
socket.destroy();
}
// Clean up the connection record
this.connectionManager.initiateCleanupOnce(record, `connection_failed_${(error as any).code || 'unknown'}`);
}
});
// Clean up the connection record - this is critical!
this.connectionManager.cleanupConnection(record, `connection_failed_${(error as any).code || 'unknown'}`);
},
onConnect: () => {
connectionEstablished = true;
// Only proceed with setup if connection didn't fail immediately
if (!connectionFailed) {
record.outgoing = targetSocket;
record.outgoingStartTime = Date.now();
// Apply socket optimizations
targetSocket.setNoDelay(this.settings.noDelay);
// Apply keep-alive settings if enabled
if (this.settings.keepAlive) {
targetSocket.setKeepAlive(true, this.settings.keepAliveInitialDelay);
// Apply enhanced TCP keep-alive options if enabled
if (this.settings.enableKeepAliveProbes) {
try {
if ('setKeepAliveProbes' in targetSocket) {
(targetSocket as any).setKeepAliveProbes(10);
}
if ('setKeepAliveInterval' in targetSocket) {
(targetSocket as any).setKeepAliveInterval(1000);
}
} catch (err) {
// Ignore errors - these are optional enhancements
if (this.settings.enableDetailedLogging) {
logger.log('warn', `Enhanced TCP keep-alive not supported for outgoing socket on connection ${connectionId}: ${err}`, {
connectionId,
error: err,
component: 'route-handler'
});
}
}
}
}
// Setup improved error handling for outgoing connection
this.setupOutgoingErrorHandler(connectionId, targetSocket, record, socket, finalTargetHost, finalTargetPort);
// Note: Close handlers are managed by independent socket handlers above
// We don't register handleClose here to avoid bilateral cleanup
// Setup error handlers for incoming socket
socket.on('error', this.connectionManager.handleError('incoming', record));
// Handle timeouts with keep-alive awareness
socket.on('timeout', () => {
// For keep-alive connections, just log a warning instead of closing
if (record.hasKeepAlive) {
logger.log('warn', `Timeout event on incoming keep-alive connection ${connectionId} from ${record.remoteIP} after ${plugins.prettyMs(this.settings.socketTimeout || 3600000)}. Connection preserved.`, {
connectionId,
remoteIP: record.remoteIP,
timeout: plugins.prettyMs(this.settings.socketTimeout || 3600000),
status: 'Connection preserved',
component: 'route-handler'
});
return;
}
// For non-keep-alive connections, proceed with normal cleanup
logger.log('warn', `Timeout on incoming side for connection ${connectionId} from ${record.remoteIP} after ${plugins.prettyMs(this.settings.socketTimeout || 3600000)}`, {
connectionId,
remoteIP: record.remoteIP,
timeout: plugins.prettyMs(this.settings.socketTimeout || 3600000),
component: 'route-handler'
});
if (record.incomingTerminationReason === null) {
record.incomingTerminationReason = 'timeout';
this.connectionManager.incrementTerminationStat('incoming', 'timeout');
}
this.connectionManager.initiateCleanupOnce(record, 'timeout_incoming');
});
targetSocket.on('timeout', () => {
// For keep-alive connections, just log a warning instead of closing
if (record.hasKeepAlive) {
logger.log('warn', `Timeout event on outgoing keep-alive connection ${connectionId} from ${record.remoteIP} after ${plugins.prettyMs(this.settings.socketTimeout || 3600000)}. Connection preserved.`, {
connectionId,
remoteIP: record.remoteIP,
timeout: plugins.prettyMs(this.settings.socketTimeout || 3600000),
status: 'Connection preserved',
component: 'route-handler'
});
return;
}
// For non-keep-alive connections, proceed with normal cleanup
logger.log('warn', `Timeout on outgoing side for connection ${connectionId} from ${record.remoteIP} after ${plugins.prettyMs(this.settings.socketTimeout || 3600000)}`, {
connectionId,
remoteIP: record.remoteIP,
timeout: plugins.prettyMs(this.settings.socketTimeout || 3600000),
component: 'route-handler'
});
if (record.outgoingTerminationReason === null) {
record.outgoingTerminationReason = 'timeout';
this.connectionManager.incrementTerminationStat('outgoing', 'timeout');
}
this.connectionManager.initiateCleanupOnce(record, 'timeout_outgoing');
});
// Apply socket timeouts
this.timeoutManager.applySocketTimeouts(record);
// Track outgoing data for bytes counting
targetSocket.on('data', (chunk: Buffer) => {
record.bytesSent += chunk.length;
this.timeoutManager.updateActivity(record);
});
// Wait for the outgoing connection to be ready before setting up piping
targetSocket.once('connect', () => {
if (this.settings.enableDetailedLogging) {
logger.log('info', `Connection ${connectionId} established to target ${finalTargetHost}:${finalTargetPort}`, {
connectionId,
@ -1230,7 +1130,7 @@ export class RouteConnectionHandler {
});
}
// Clear the initial connection error handler
// Clear any error listeners added by createSocketWithErrorHandler
targetSocket.removeAllListeners('error');
// Add the normal error handler for established connections
@ -1384,7 +1284,107 @@ export class RouteConnectionHandler {
if (record.isTLS) {
record.tlsHandshakeComplete = true;
}
}
});
// Only set up basic properties - everything else happens in onConnect
record.outgoing = targetSocket;
record.outgoingStartTime = Date.now();
// Apply socket optimizations
targetSocket.setNoDelay(this.settings.noDelay);
// Apply keep-alive settings if enabled
if (this.settings.keepAlive) {
targetSocket.setKeepAlive(true, this.settings.keepAliveInitialDelay);
// Apply enhanced TCP keep-alive options if enabled
if (this.settings.enableKeepAliveProbes) {
try {
if ('setKeepAliveProbes' in targetSocket) {
(targetSocket as any).setKeepAliveProbes(10);
}
if ('setKeepAliveInterval' in targetSocket) {
(targetSocket as any).setKeepAliveInterval(1000);
}
} catch (err) {
// Ignore errors - these are optional enhancements
if (this.settings.enableDetailedLogging) {
logger.log('warn', `Enhanced TCP keep-alive not supported for outgoing socket on connection ${connectionId}: ${err}`, {
connectionId,
error: err,
component: 'route-handler'
});
}
}
}
}
// Setup error handlers for incoming socket
socket.on('error', this.connectionManager.handleError('incoming', record));
// Handle timeouts with keep-alive awareness
socket.on('timeout', () => {
// For keep-alive connections, just log a warning instead of closing
if (record.hasKeepAlive) {
logger.log('warn', `Timeout event on incoming keep-alive connection ${connectionId} from ${record.remoteIP} after ${plugins.prettyMs(this.settings.socketTimeout || 3600000)}. Connection preserved.`, {
connectionId,
remoteIP: record.remoteIP,
timeout: plugins.prettyMs(this.settings.socketTimeout || 3600000),
status: 'Connection preserved',
component: 'route-handler'
});
return;
}
// For non-keep-alive connections, proceed with normal cleanup
logger.log('warn', `Timeout on incoming side for connection ${connectionId} from ${record.remoteIP} after ${plugins.prettyMs(this.settings.socketTimeout || 3600000)}`, {
connectionId,
remoteIP: record.remoteIP,
timeout: plugins.prettyMs(this.settings.socketTimeout || 3600000),
component: 'route-handler'
});
if (record.incomingTerminationReason === null) {
record.incomingTerminationReason = 'timeout';
this.connectionManager.incrementTerminationStat('incoming', 'timeout');
}
this.connectionManager.initiateCleanupOnce(record, 'timeout_incoming');
});
targetSocket.on('timeout', () => {
// For keep-alive connections, just log a warning instead of closing
if (record.hasKeepAlive) {
logger.log('warn', `Timeout event on outgoing keep-alive connection ${connectionId} from ${record.remoteIP} after ${plugins.prettyMs(this.settings.socketTimeout || 3600000)}. Connection preserved.`, {
connectionId,
remoteIP: record.remoteIP,
timeout: plugins.prettyMs(this.settings.socketTimeout || 3600000),
status: 'Connection preserved',
component: 'route-handler'
});
return;
}
// For non-keep-alive connections, proceed with normal cleanup
logger.log('warn', `Timeout on outgoing side for connection ${connectionId} from ${record.remoteIP} after ${plugins.prettyMs(this.settings.socketTimeout || 3600000)}`, {
connectionId,
remoteIP: record.remoteIP,
timeout: plugins.prettyMs(this.settings.socketTimeout || 3600000),
component: 'route-handler'
});
if (record.outgoingTerminationReason === null) {
record.outgoingTerminationReason = 'timeout';
this.connectionManager.incrementTerminationStat('outgoing', 'timeout');
}
this.connectionManager.initiateCleanupOnce(record, 'timeout_outgoing');
});
// Apply socket timeouts
this.timeoutManager.applySocketTimeouts(record);
// Track outgoing data for bytes counting (moved from the duplicate connect handler)
targetSocket.on('data', (chunk: Buffer) => {
record.bytesSent += chunk.length;
this.timeoutManager.updateActivity(record);
});
} // End of if (!connectionFailed)
}
}