Connection Management in SmartProxy
This document describes connection handling, cleanup mechanisms, and known issues in SmartProxy, particularly focusing on proxy chain configurations.
Connection Accumulation Investigation (January 2025)
Problem Statement
Connections may accumulate on the outer proxy in proxy chain configurations, despite implemented fixes.
Historical Context
- v19.5.12-v19.5.15: Major connection cleanup improvements
- v19.5.19+: PROXY protocol support with WrappedSocket implementation
- v19.5.20: Fixed race condition in immediate routing cleanup
Current Architecture
Connection Flow in Proxy Chains
Client → Outer Proxy (8001) → Inner Proxy (8002) → Backend (httpbin.org:443)
- Outer Proxy:
  - Accepts client connection
  - Sends PROXY protocol header to inner proxy
  - Tracks connection in ConnectionManager
  - Immediate routing for non-TLS ports
- Inner Proxy:
  - Parses PROXY protocol to get real client IP
  - Establishes connection to backend
  - Tracks its own connections separately
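To make the header exchange between the two proxies concrete, here is a minimal sketch of building and parsing a PROXY protocol header. It assumes the human-readable v1 text format; this document does not state which version SmartProxy sends, and the names `ProxyV1Info`, `buildProxyV1Header`, and `parseProxyV1Header` are hypothetical, not SmartProxy's API.

```typescript
// Hypothetical helpers for illustration; not SmartProxy's actual API.
interface ProxyV1Info {
  family: 'TCP4' | 'TCP6';
  sourceIp: string;
  destIp: string;
  sourcePort: number;
  destPort: number;
}

// Outer proxy side: the header is written to the inner proxy before any client bytes.
function buildProxyV1Header(info: ProxyV1Info): string {
  return `PROXY ${info.family} ${info.sourceIp} ${info.destIp} ${info.sourcePort} ${info.destPort}\r\n`;
}

// Inner proxy side: returns the parsed header plus the remaining payload,
// or null if the header is incomplete or malformed.
function parseProxyV1Header(data: string): { info: ProxyV1Info; rest: string } | null {
  const end = data.indexOf('\r\n');
  if (end === -1) return null; // header line not complete yet
  const parts = data.slice(0, end).split(' ');
  if (parts.length !== 6 || parts[0] !== 'PROXY') return null;
  const family = parts[1];
  if (family !== 'TCP4' && family !== 'TCP6') return null;
  const sourcePort = Number(parts[4]);
  const destPort = Number(parts[5]);
  const validPort = (p: number): boolean => Number.isInteger(p) && p >= 0 && p <= 65535;
  if (!validPort(sourcePort) || !validPort(destPort)) return null;
  return {
    info: { family, sourceIp: parts[2], destIp: parts[3], sourcePort, destPort },
    rest: data.slice(end + 2), // bytes after the header belong to the client stream
  };
}
```

A `null` result covers both an incomplete header and a malformed one. A real parser would likely distinguish the two, so that it can wait for more data in the first case and reject the connection in the second.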
Potential Causes of Connection Accumulation
1. Race Condition in Immediate Routing
When a connection is immediately routed (non-TLS ports), there's a timing window:
// route-connection-handler.ts, line ~231
this.routeConnection(socket, record, '', undefined);
// Connection is routed before all setup is complete
Issue: If client disconnects during backend connection setup, cleanup may not trigger properly.
2. Outgoing Socket Assignment Timing
Despite the fix in v19.5.20:
// Line 1362 in setupDirectConnection
record.outgoing = targetSocket;
There's still a window between socket creation and the connect event where cleanup might miss the outgoing socket.
3. Batch Cleanup Delays
ConnectionManager uses queued cleanup:
- Batch size: 100 connections
- Batch interval: 100ms
- Under rapid connection/disconnection, queue might lag
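The lag under load can be illustrated with a small sketch of a batched cleanup queue. This is not ConnectionManager's actual code: `BatchedCleanupQueue` and `ticksToDrain` are hypothetical names, and the timer that would call `processBatch()` every 100ms is left out so the batching logic is easy to follow.

```typescript
// Hypothetical sketch of a batched cleanup queue; names are illustrative only.
class BatchedCleanupQueue {
  private pending = new Set<string>();

  constructor(
    private readonly cleanup: (connectionId: string) => void,
    private readonly batchSize = 100,
  ) {}

  enqueue(connectionId: string): void {
    this.pending.add(connectionId); // the Set deduplicates repeated requests
  }

  get size(): number {
    return this.pending.size;
  }

  // Processes at most batchSize entries and returns how many were cleaned up.
  // In a real manager this would be driven by a timer (e.g. every 100ms).
  processBatch(): number {
    const batch = Array.from(this.pending).slice(0, this.batchSize);
    for (const id of batch) {
      this.pending.delete(id);
      this.cleanup(id);
    }
    return batch.length;
  }
}

// Number of timer ticks needed to drain a backlog, ignoring new arrivals.
function ticksToDrain(backlog: number, batchSize = 100): number {
  return Math.ceil(backlog / batchSize);
}
```

With the values listed above, a burst of 1,000 queued cleanups needs 10 ticks, so roughly one second passes before the last record is removed. Any connections that close during that second lengthen the backlog further.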
4. Different Cleanup Paths
Multiple cleanup triggers exist:
- Socket 'close' event
- Socket 'error' event
- Inactivity timeout
- Connection timeout
- Manual cleanup
Not all paths may properly handle proxy chain scenarios.
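One way to reason about several triggers racing each other is to funnel them all into a single idempotent routine. The sketch below assumes a record with a `connectionClosed` flag (the field name appears in the debug logs later in this document); `SketchRecord`, `CleanupReason`, and `cleanupOnce` are hypothetical and do not reflect ConnectionManager's real signatures.

```typescript
// Hypothetical sketch; the record shape and function name are assumptions.
interface SketchRecord {
  id: string;
  connectionClosed: boolean;
  closeReason?: string;
}

type CleanupReason = 'close' | 'error' | 'inactivity' | 'connection_timeout' | 'manual';

// Returns true only for the call that actually performed the cleanup.
function cleanupOnce(
  records: Map<string, SketchRecord>,
  record: SketchRecord,
  reason: CleanupReason,
): boolean {
  if (record.connectionClosed) return false; // another path already cleaned up
  record.connectionClosed = true;
  record.closeReason = reason; // keep the first reason for later diagnosis
  records.delete(record.id);   // always remove from tracking, whichever path fired
  return true;
}
```

Because every trigger ends in the same function, the first reason to fire is recorded and later triggers become no-ops. This makes it easier to spot a path that never reaches cleanup at all.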
5. Keep-Alive Connection Handling
Keep-alive connections have special treatment:
- Extended inactivity timeout (6x normal)
- Warning before closure
- May accumulate if backend is unresponsive
Observed Symptoms
- Outer proxy connection count grows over time
- Inner proxy maintains zero or low connection count
- Connections show as closed in logs but remain in tracking
- Memory usage gradually increases
Debug Strategies
1. Enhanced Logging
Add connection state logging at key points:
// When outgoing socket is created
logger.log('debug', `Outgoing socket created for ${connectionId}`, {
  hasOutgoing: !!record.outgoing,
  outgoingState: record.outgoing?.readyState
});
2. Connection State Inspection
Periodically log detailed connection state:
for (const [id, record] of connectionManager.getConnections()) {
  console.log({
    id,
    age: Date.now() - record.incomingStartTime,
    incomingDestroyed: record.incoming.destroyed,
    outgoingDestroyed: record.outgoing?.destroyed,
    hasCleanupTimer: !!record.cleanupTimer
  });
}
3. Cleanup Verification
Track cleanup completion:
// In cleanupConnection
logger.log('debug', `Cleanup completed for ${record.id}`, {
  recordsRemaining: this.connectionRecords.size
});
Recommendations
- Immediate Cleanup for Proxy Chains
  - Skip batch queue for proxy chain connections
  - Use synchronous cleanup when PROXY protocol is detected
- Socket State Validation
  - Check both destroyed and readyState before cleanup decisions
  - Handle 'opening' state sockets explicitly
- Timeout Adjustments
  - Shorter timeouts for proxy chain connections
  - More aggressive cleanup for connections without data transfer
- Connection Limits
  - Per-route connection limits
  - Backpressure when approaching limits
- Monitoring
  - Export connection metrics
  - Alert on connection count thresholds
  - Track connection age distribution
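The socket state validation recommendation could be sketched as a small helper that looks at both `destroyed` and `readyState`. The `readyState` strings are those exposed by Node's `net.Socket`; the phase names and both functions (`classifySocket`, `mustDestroyOutgoing`) are hypothetical and only illustrate one possible policy.

```typescript
// Structural type so the helper accepts a net.Socket or a test double.
type SocketLike = { destroyed: boolean; readyState: string };
type SocketPhase = 'missing' | 'destroyed' | 'connecting' | 'open' | 'half-open' | 'closed';

function classifySocket(socket?: SocketLike | null): SocketPhase {
  if (!socket) return 'missing';
  if (socket.destroyed) return 'destroyed';
  switch (socket.readyState) {
    case 'opening':
      return 'connecting'; // connect() called, 'connect' event not yet fired
    case 'open':
      return 'open';
    case 'readOnly':
    case 'writeOnly':
      return 'half-open';
    default:
      return 'closed';
  }
}

// Assumed policy: once the incoming side is gone, an outgoing socket that is
// still connecting or open must be destroyed explicitly rather than left alone.
function mustDestroyOutgoing(
  incoming?: SocketLike | null,
  outgoing?: SocketLike | null,
): boolean {
  const inPhase = classifySocket(incoming);
  const outPhase = classifySocket(outgoing);
  const incomingGone = inPhase === 'missing' || inPhase === 'destroyed' || inPhase === 'closed';
  const outgoingAlive = outPhase === 'connecting' || outPhase === 'open' || outPhase === 'half-open';
  return incomingGone && outgoingAlive;
}
```

Treating `'opening'` as its own phase is the point of the sketch. A check on `destroyed` alone reports such a socket as healthy even though it may never connect.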
Test Scenarios to Reproduce
- Rapid Connect/Disconnect
    # Create many short-lived connections
    for i in {1..1000}; do
      (echo -n | nc localhost 8001) &
    done
- Slow Backend
  - Configure inner proxy to connect to unresponsive backend
  - Monitor outer proxy connection count
- Mixed Traffic
  - Combine TLS and non-TLS connections
  - Add keep-alive connections
  - Observe accumulation patterns
Future Improvements
- Connection Pool Isolation
  - Separate pools for proxy chain vs direct connections
  - Different cleanup strategies per pool
- Circuit Breaker
  - Detect accumulation and trigger aggressive cleanup
  - Temporarily refuse new connections when near limit
- Connection State Machine
  - Explicit states: CONNECTING, ESTABLISHED, CLOSING, CLOSED
  - State transition validation
  - Timeout per state
- Metrics Collection
  - Connection lifecycle events
  - Cleanup success/failure rates
  - Time spent in each state
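The connection state machine idea could look roughly like the sketch below. The four state names come from the list above; the transition table, the per-state timeout values, and the class itself are assumptions made for illustration, not an existing SmartProxy component.

```typescript
type ConnState = 'CONNECTING' | 'ESTABLISHED' | 'CLOSING' | 'CLOSED';

// Assumed transition table: a connection may only move forward.
const allowedTransitions: Record<ConnState, ConnState[]> = {
  CONNECTING: ['ESTABLISHED', 'CLOSING', 'CLOSED'],
  ESTABLISHED: ['CLOSING', 'CLOSED'],
  CLOSING: ['CLOSED'],
  CLOSED: [],
};

// Assumed per-state timeouts in ms; 0 means "no timeout in this state".
const stateTimeoutMs: Record<ConnState, number> = {
  CONNECTING: 30000,
  ESTABLISHED: 0,
  CLOSING: 5000,
  CLOSED: 0,
};

class ConnectionStateMachine {
  private current: ConnState = 'CONNECTING';
  private enteredAt: number;

  // The clock is injectable so the timeout logic can be tested deterministically.
  constructor(private readonly now: () => number = Date.now) {
    this.enteredAt = now();
  }

  get state(): ConnState {
    return this.current;
  }

  transition(next: ConnState): void {
    if (allowedTransitions[this.current].indexOf(next) === -1) {
      throw new Error(`Invalid transition ${this.current} -> ${next}`);
    }
    this.current = next;
    this.enteredAt = this.now(); // restart the per-state timer
  }

  // True when the connection has stayed in its current state past the limit.
  isTimedOut(): boolean {
    const limit = stateTimeoutMs[this.current];
    return limit > 0 && this.now() - this.enteredAt >= limit;
  }
}
```

A periodic sweep could call `isTimedOut()` on each tracked connection and force cleanup for those stuck in CONNECTING or CLOSING. Inactivity handling for ESTABLISHED connections would stay with the existing timeout logic.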
Root Cause Identified (January 2025)
The primary issue is on the inner proxy when backends are unreachable:
When the backend is unreachable (e.g., non-routable IP like 10.255.255.1):
- The outgoing socket gets stuck in "opening" state indefinitely
- The createSocketWithErrorHandler in socket-utils.ts doesn't implement connection timeout
- socket.setTimeout() only handles inactivity AFTER connection, not during connect phase
- Connections accumulate because they never transition to error state
- Socket timeout warnings fire but connections are preserved as keep-alive
Code Issue:
// socket-utils.ts line 275
if (timeout) {
  socket.setTimeout(timeout); // This only handles inactivity, not connection!
}
Required Fix:
- Add connectionTimeout to ISmartProxyOptions interface:
// In interfaces.ts
connectionTimeout?: number; // Timeout for establishing connection (ms), default: 30000 (30s)
- Update createSocketWithErrorHandler in socket-utils.ts:
export function createSocketWithErrorHandler(options: SafeSocketOptions): plugins.net.Socket {
  const { port, host, onError, onConnect, timeout } = options;
  const socket = new plugins.net.Socket();
  let connected = false;
  let connectionTimeout: NodeJS.Timeout | null = null;

  socket.on('error', (error) => {
    if (connectionTimeout) {
      clearTimeout(connectionTimeout);
      connectionTimeout = null;
    }
    if (onError) onError(error);
  });

  socket.on('connect', () => {
    connected = true;
    if (connectionTimeout) {
      clearTimeout(connectionTimeout);
      connectionTimeout = null;
    }
    if (timeout) socket.setTimeout(timeout); // Set inactivity timeout
    if (onConnect) onConnect();
  });

  // Implement connection establishment timeout
  if (timeout) {
    connectionTimeout = setTimeout(() => {
      if (!connected && !socket.destroyed) {
        const error = new Error(`Connection timeout after ${timeout}ms to ${host}:${port}`);
        (error as any).code = 'ETIMEDOUT';
        socket.destroy();
        if (onError) onError(error);
      }
    }, timeout);
  }

  socket.connect(port, host);
  return socket;
}
- Pass connectionTimeout in route-connection-handler.ts:
const targetSocket = createSocketWithErrorHandler({
  port: finalTargetPort,
  host: finalTargetHost,
  timeout: this.settings.connectionTimeout || 30000, // Connection timeout
  onError: (error) => { /* existing */ },
  onConnect: async () => { /* existing */ }
});
Investigation Results (January 2025)
Based on extensive testing with debug scripts:
- Normal Operation: In controlled tests, connections are properly cleaned up:
  - Immediate routing cleanup handler properly destroys outgoing connections
  - Both outer and inner proxies maintain 0 connections after clients disconnect
  - Keep-alive connections are tracked and cleaned up correctly
- Potential Edge Cases Not Covered by Tests:
  - HTTP/2 Connections: May have different lifecycle than HTTP/1.1
  - WebSocket Connections: Long-lived upgrade connections might persist
  - Partial TLS Handshakes: Connections that start TLS but don't complete
  - PROXY Protocol Parse Failures: Malformed headers from untrusted sources
  - Connection Pool Reuse: HttpProxy component may maintain its own pools
- Timing-Sensitive Scenarios:
  - Client disconnects exactly when record.outgoing is being assigned
  - Backend connects but immediately RSTs
  - Proxy chain where middle proxy restarts
  - Multiple rapid reconnects with same source IP/port
- Configuration-Specific Issues:
  - Mixed sendProxyProtocol settings in chain
  - Different keepAlive settings between proxies
  - Mismatched timeout values
  - Routes with forwardingEngine: 'nftables'
Additional Debug Points
Add these debug logs to identify the specific scenario:
// In route-connection-handler.ts setupDirectConnection
logger.log('debug', `Setting outgoing socket for ${connectionId}`, {
  timestamp: Date.now(),
  hasOutgoing: !!record.outgoing,
  socketState: targetSocket.readyState
});

// In connection-manager.ts cleanupConnection
logger.log('debug', `Cleanup attempt for ${record.id}`, {
  alreadyClosed: record.connectionClosed,
  hasIncoming: !!record.incoming,
  hasOutgoing: !!record.outgoing,
  incomingDestroyed: record.incoming?.destroyed,
  outgoingDestroyed: record.outgoing?.destroyed
});
Workarounds
Until the connection establishment timeout fix is in place:
- Periodic Force Cleanup:
    setInterval(() => {
      const connections = connectionManager.getConnections();
      for (const [id, record] of connections) {
        if (record.incoming?.destroyed && !record.connectionClosed) {
          connectionManager.cleanupConnection(record, 'force_cleanup');
        }
      }
    }, 60000); // Every minute
- Connection Age Limit:
    // Add max connection age check
    const maxAge = 3600000; // 1 hour
    if (Date.now() - record.incomingStartTime > maxAge) {
      connectionManager.cleanupConnection(record, 'max_age');
    }
- Aggressive Timeout Settings:
    {
      socketTimeout: 60000, // 1 minute
      inactivityTimeout: 300000, // 5 minutes
      connectionCleanupInterval: 30000 // 30 seconds
    }
Related Files
- /ts/proxies/smart-proxy/route-connection-handler.ts - Main connection handling
- /ts/proxies/smart-proxy/connection-manager.ts - Connection tracking and cleanup
- /ts/core/utils/socket-utils.ts - Socket cleanup utilities
- /test/test.proxy-chain-cleanup.node.ts - Test for connection cleanup
- /test/test.proxy-chaining-accumulation.node.ts - Test for accumulation prevention
- /.nogit/debug/connection-accumulation-debug.ts - Debug script for connection states
- /.nogit/debug/connection-accumulation-keepalive.ts - Keep-alive specific tests
- /.nogit/debug/connection-accumulation-http.ts - HTTP traffic through proxy chains
Summary
Issue Identified: Connection accumulation occurs on the inner proxy (not outer) when backends are unreachable.
Root Cause: The createSocketWithErrorHandler function in socket-utils.ts doesn't implement connection establishment timeout. It only sets socket.setTimeout(), which handles inactivity AFTER connection is established, not during the connect phase.
Impact: When connecting to unreachable IPs (e.g., 10.255.255.1), outgoing sockets remain in "opening" state indefinitely, causing connections to accumulate.
Fix Required:
- Add connectionTimeout setting to ISmartProxyOptions
- Implement proper connection timeout in createSocketWithErrorHandler
- Pass the timeout value from route-connection-handler
Workaround Until Fixed: Configure shorter socket timeouts and use the periodic force cleanup suggested above.
The connection cleanup mechanisms have been significantly improved in v19.5.20:
- Race condition fixed by setting record.outgoing before connecting
- Immediate routing cleanup handler always destroys outgoing connections
- Tests confirm no accumulation in standard scenarios with reachable backends
However, the missing connection establishment timeout causes accumulation when backends are unreachable or very slow to connect.