Compare commits

6 Commits

Author SHA1 Message Date
19590ef107 19.5.24
Some checks failed
Default (tags) / security (push) Successful in 32s
Default (tags) / test (push) Failing after 24m57s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-06-07 10:56:08 +00:00
47735adbf2 Implement zombie connection detection and cleanup in ConnectionManager; enhance tests for edge cases 2025-06-07 10:55:59 +00:00
9094b76b1b 19.5.23
Some checks failed
Default (tags) / security (push) Successful in 34s
Default (tags) / test (push) Failing after 24m25s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-06-06 23:36:19 +00:00
9aebcd488d Implement connection timeout handling and improve connection cleanup in SmartProxy 2025-06-06 23:34:50 +00:00
311691c2cc 19.5.22
Some checks failed
Default (tags) / security (push) Successful in 36s
Default (tags) / test (push) Failing after 19m29s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-06-06 15:54:40 +00:00
578d1ba2f7 update 2025-06-06 15:00:46 +00:00
9 changed files with 1187 additions and 23 deletions

@@ -1,6 +1,6 @@
{
"name": "@push.rocks/smartproxy",
-"version": "19.5.21",
+"version": "19.5.24",
"private": false,
"description": "A powerful proxy package with unified route-based configuration for high traffic management. Features include SSL/TLS support, flexible routing patterns, WebSocket handling, advanced security options, and automatic ACME certificate management.",
"main": "dist_ts/index.js",

readme.connections.md (new file, 551 lines)

@@ -0,0 +1,551 @@
# Connection Management in SmartProxy
This document describes connection handling, cleanup mechanisms, and known issues in SmartProxy, particularly focusing on proxy chain configurations.
## Connection Accumulation Investigation (January 2025)
### Problem Statement
Connections may accumulate on the outer proxy in proxy chain configurations, despite implemented fixes.
### Historical Context
- **v19.5.12-v19.5.15**: Major connection cleanup improvements
- **v19.5.19+**: PROXY protocol support with WrappedSocket implementation
- **v19.5.20**: Fixed race condition in immediate routing cleanup
### Current Architecture
#### Connection Flow in Proxy Chains
```
Client → Outer Proxy (8001) → Inner Proxy (8002) → Backend (httpbin.org:443)
```
1. **Outer Proxy**:
- Accepts client connection
- Sends PROXY protocol header to inner proxy
- Tracks connection in ConnectionManager
- Immediate routing for non-TLS ports
2. **Inner Proxy**:
- Parses PROXY protocol to get real client IP
- Establishes connection to backend
- Tracks its own connections separately
### Potential Causes of Connection Accumulation
#### 1. Race Condition in Immediate Routing
When a connection is immediately routed (non-TLS ports), there's a timing window:
```typescript
// route-connection-handler.ts, line ~231
this.routeConnection(socket, record, '', undefined);
// Connection is routed before all setup is complete
```
**Issue**: If the client disconnects while the backend connection is still being set up, cleanup may not be triggered.
#### 2. Outgoing Socket Assignment Timing
Despite the fix in v19.5.20:
```typescript
// Line 1362 in setupDirectConnection
record.outgoing = targetSocket;
```
There's still a window between socket creation and the `connect` event where cleanup might miss the outgoing socket.
#### 3. Batch Cleanup Delays
ConnectionManager uses queued cleanup:
- Batch size: 100 connections
- Batch interval: 100ms
- Under rapid connection/disconnection, the queue may lag behind the rate at which connections close
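
To see when the queue can fall behind, a rough back-of-the-envelope sketch helps. It assumes the batch size and interval listed above and that exactly one batch is processed per interval; `estimateDrainTimeMs` and `backlogGrowthPerSecond` are hypothetical helpers for illustration, not part of SmartProxy.

```typescript
// Hypothetical helpers (not SmartProxy APIs): estimate cleanup queue behaviour
// under the assumption that one batch is processed per interval.

// Time needed to drain `queued` connections from the cleanup queue.
function estimateDrainTimeMs(
  queued: number,
  batchSize: number = 100,
  batchIntervalMs: number = 100
): number {
  if (queued <= 0) return 0;
  const batches = Math.ceil(queued / batchSize);
  return batches * batchIntervalMs;
}

// Net queue growth per second when connections close faster than batches drain.
function backlogGrowthPerSecond(
  closesPerSecond: number,
  batchSize: number = 100,
  batchIntervalMs: number = 100
): number {
  const drainPerSecond = batchSize * (1000 / batchIntervalMs);
  return Math.max(0, closesPerSecond - drainPerSecond);
}
```

Under these assumptions, 1,000 queued connections need 10 batches (about one second) to drain, and the backlog only keeps growing when closes are sustained above roughly 1,000 per second. Lag from the batch queue alone is therefore more likely to delay cleanup than to prevent it.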
#### 4. Different Cleanup Paths
Multiple cleanup triggers exist:
- Socket 'close' event
- Socket 'error' event
- Inactivity timeout
- Connection timeout
- Manual cleanup
Some of these paths may not handle proxy chain scenarios correctly.
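
Because several triggers can fire for the same connection, cleanup needs to be idempotent. The following is a minimal sketch of a "cleanup once" guard keyed on the `connectionClosed` flag mentioned elsewhere in this document; the `CleanupGuard` class, `TrackedRecord` interface, and reason names are assumptions for illustration and do not reproduce ConnectionManager's actual implementation.

```typescript
// Illustrative only: a simplified stand-in for ConnectionManager bookkeeping.
interface TrackedRecord {
  id: string;
  connectionClosed: boolean;
}

type CleanupReason = 'close' | 'error' | 'inactivity' | 'connection_timeout' | 'manual';

class CleanupGuard {
  private readonly records = new Map<string, TrackedRecord>();
  // Records which trigger actually performed the cleanup, useful when debugging.
  public readonly log: Array<{ id: string; reason: CleanupReason }> = [];

  track(record: TrackedRecord): void {
    this.records.set(record.id, record);
  }

  // Returns true only for the first trigger; later triggers are no-ops.
  cleanupOnce(record: TrackedRecord, reason: CleanupReason): boolean {
    if (record.connectionClosed) return false;
    record.connectionClosed = true;
    this.records.delete(record.id);
    this.log.push({ id: record.id, reason });
    return true;
  }

  get size(): number {
    return this.records.size;
  }
}
```

Logging the winning reason per connection makes it easier to see which cleanup path handles proxy chain disconnects in practice and which paths never fire.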
#### 5. Keep-Alive Connection Handling
Keep-alive connections have special treatment:
- Extended inactivity timeout (6x normal)
- Warning before closure
- May accumulate if backend is unresponsive
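
As a worked example of the extended timeout, the sketch below applies the 6x multiplier to a base inactivity timeout. The function name and the `keepAliveMultiplier` parameter are assumptions for illustration; the actual option names in SmartProxy may differ.

```typescript
// Hypothetical helper: effective inactivity timeout for a connection.
function effectiveInactivityTimeoutMs(
  baseTimeoutMs: number,
  hasKeepAlive: boolean,
  keepAliveMultiplier: number = 6
): number {
  return hasKeepAlive ? baseTimeoutMs * keepAliveMultiplier : baseTimeoutMs;
}

// True when the connection has been idle longer than its effective timeout.
function isInactive(
  lastActivity: number,
  now: number,
  baseTimeoutMs: number,
  hasKeepAlive: boolean
): boolean {
  return now - lastActivity > effectiveInactivityTimeoutMs(baseTimeoutMs, hasKeepAlive);
}
```

With a base timeout of 300000 ms (5 minutes), a keep-alive connection is only considered inactive after 1800000 ms (30 minutes), so connections to an unresponsive backend can stay tracked for a long time before this path closes them.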
### Observed Symptoms
1. **Outer proxy connection count grows over time**
2. **Inner proxy maintains zero or low connection count**
3. **Connections show as closed in logs but remain in tracking**
4. **Memory usage gradually increases**
### Debug Strategies
#### 1. Enhanced Logging
Add connection state logging at key points:
```typescript
// When outgoing socket is created
logger.log('debug', `Outgoing socket created for ${connectionId}`, {
hasOutgoing: !!record.outgoing,
outgoingState: record.outgoing?.readyState
});
```
#### 2. Connection State Inspection
Periodically log detailed connection state:
```typescript
for (const [id, record] of connectionManager.getConnections()) {
  console.log({
    id,
    age: Date.now() - record.incomingStartTime,
    incomingDestroyed: record.incoming.destroyed,
    outgoingDestroyed: record.outgoing?.destroyed,
    hasCleanupTimer: !!record.cleanupTimer
  });
}
```
#### 3. Cleanup Verification
Track cleanup completion:
```typescript
// In cleanupConnection
logger.log('debug', `Cleanup completed for ${record.id}`, {
recordsRemaining: this.connectionRecords.size
});
```
### Recommendations
1. **Immediate Cleanup for Proxy Chains**
- Skip batch queue for proxy chain connections
- Use synchronous cleanup when PROXY protocol is detected
2. **Socket State Validation**
- Check both `destroyed` and `readyState` before cleanup decisions
- Handle 'opening' state sockets explicitly
3. **Timeout Adjustments**
- Shorter timeouts for proxy chain connections
- More aggressive cleanup for connections without data transfer
4. **Connection Limits**
- Per-route connection limits
- Backpressure when approaching limits
5. **Monitoring**
- Export connection metrics
- Alert on connection count thresholds
- Track connection age distribution
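
The socket state validation in recommendation 2 could look roughly like the sketch below. It relies only on the `destroyed` flag and the `readyState` values that Node's `net.Socket` exposes (`opening`, `open`, `readOnly`, `writeOnly`, `closed`); the `SocketLike` interface, the `SocketHealth` labels, and `classifySocket` itself are hypothetical names, not existing SmartProxy code.

```typescript
// Structural subset of net.Socket, so the helper can be tested without real sockets.
interface SocketLike {
  destroyed: boolean;
  readyState: string;
}

type SocketHealth = 'missing' | 'dead' | 'connecting' | 'half-open' | 'usable';

// Classifies a socket using both `destroyed` and `readyState`.
function classifySocket(socket: SocketLike | null | undefined): SocketHealth {
  if (!socket) return 'missing';
  if (socket.destroyed || socket.readyState === 'closed') return 'dead';
  switch (socket.readyState) {
    case 'opening':
      // Still connecting: a candidate for the connection establishment timeout.
      return 'connecting';
    case 'readOnly':
    case 'writeOnly':
      // One direction has already ended.
      return 'half-open';
    default:
      return 'usable';
  }
}
```

A cleanup decision could then treat `connecting` sockets explicitly, for example by checking how long they have been in that state, rather than assuming every non-destroyed socket is healthy.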
### Test Scenarios to Reproduce
1. **Rapid Connect/Disconnect**
```bash
# Create many short-lived connections
for i in {1..1000}; do
(echo -n | nc localhost 8001) &
done
```
2. **Slow Backend**
- Configure inner proxy to connect to unresponsive backend
- Monitor outer proxy connection count
3. **Mixed Traffic**
- Combine TLS and non-TLS connections
- Add keep-alive connections
- Observe accumulation patterns
### Future Improvements
1. **Connection Pool Isolation**
- Separate pools for proxy chain vs direct connections
- Different cleanup strategies per pool
2. **Circuit Breaker**
- Detect accumulation and trigger aggressive cleanup
- Temporarily refuse new connections when near limit
3. **Connection State Machine**
- Explicit states: CONNECTING, ESTABLISHED, CLOSING, CLOSED
- State transition validation
- Timeout per state
4. **Metrics Collection**
- Connection lifecycle events
- Cleanup success/failure rates
- Time spent in each state
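
The connection state machine from item 3 might be sketched as follows. The four states come from the list above; the transition table, the per-state timeout values (30 seconds for `CONNECTING`, matching the connection timeout default proposed later in this document, and an arbitrary 5 seconds for `CLOSING`), and the class name are assumptions for illustration only.

```typescript
type ConnState = 'CONNECTING' | 'ESTABLISHED' | 'CLOSING' | 'CLOSED';

// Allowed transitions; anything not listed is rejected.
const allowedTransitions: Record<ConnState, readonly ConnState[]> = {
  CONNECTING: ['ESTABLISHED', 'CLOSING', 'CLOSED'],
  ESTABLISHED: ['CLOSING', 'CLOSED'],
  CLOSING: ['CLOSED'],
  CLOSED: [],
};

// Illustrative per-state timeouts in ms; null means no state-level timeout.
const stateTimeoutMs: Record<ConnState, number | null> = {
  CONNECTING: 30000,
  ESTABLISHED: null,
  CLOSING: 5000,
  CLOSED: null,
};

class ConnectionStateMachine {
  private current: ConnState = 'CONNECTING';
  private enteredAt: number;

  constructor(now: number) {
    this.enteredAt = now;
  }

  get state(): ConnState {
    return this.current;
  }

  // Validates the transition and records when the new state was entered.
  transition(next: ConnState, now: number): void {
    if (!allowedTransitions[this.current].includes(next)) {
      throw new Error(`Invalid transition ${this.current} -> ${next}`);
    }
    this.current = next;
    this.enteredAt = now;
  }

  // True when the connection has stayed in its current state past the limit.
  isTimedOut(now: number): boolean {
    const limit = stateTimeoutMs[this.current];
    return limit !== null && now - this.enteredAt > limit;
  }
}
```

Passing `now` explicitly keeps the sketch deterministic in tests; a real integration would more likely read `Date.now()` and drive `isTimedOut` from the existing periodic checks.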
### Root Cause Identified (January 2025)
**The primary issue is on the inner proxy when backends are unreachable:**
When the backend is unreachable (e.g., non-routable IP like 10.255.255.1):
1. The outgoing socket gets stuck in "opening" state indefinitely
2. The `createSocketWithErrorHandler` in socket-utils.ts doesn't implement connection timeout
3. `socket.setTimeout()` only handles inactivity AFTER connection, not during connect phase
4. Connections accumulate because they never transition to error state
5. Socket timeout warnings fire but connections are preserved as keep-alive
**Code Issue:**
```typescript
// socket-utils.ts line 275
if (timeout) {
socket.setTimeout(timeout); // This only handles inactivity, not connection!
}
```
**Required Fix:**
1. Add `connectionTimeout` to ISmartProxyOptions interface:
```typescript
// In interfaces.ts
connectionTimeout?: number; // Timeout for establishing connection (ms), default: 30000 (30s)
```
2. Update `createSocketWithErrorHandler` in socket-utils.ts:
```typescript
export function createSocketWithErrorHandler(options: SafeSocketOptions): plugins.net.Socket {
  const { port, host, onError, onConnect, timeout } = options;
  const socket = new plugins.net.Socket();
  let connected = false;
  let connectionTimeout: NodeJS.Timeout | null = null;
  socket.on('error', (error) => {
    if (connectionTimeout) {
      clearTimeout(connectionTimeout);
      connectionTimeout = null;
    }
    if (onError) onError(error);
  });
  socket.on('connect', () => {
    connected = true;
    if (connectionTimeout) {
      clearTimeout(connectionTimeout);
      connectionTimeout = null;
    }
    if (timeout) socket.setTimeout(timeout); // Set inactivity timeout
    if (onConnect) onConnect();
  });
  // Implement connection establishment timeout
  if (timeout) {
    connectionTimeout = setTimeout(() => {
      if (!connected && !socket.destroyed) {
        const error = new Error(`Connection timeout after ${timeout}ms to ${host}:${port}`);
        (error as any).code = 'ETIMEDOUT';
        socket.destroy();
        if (onError) onError(error);
      }
    }, timeout);
  }
  socket.connect(port, host);
  return socket;
}
```
3. Pass connectionTimeout in route-connection-handler.ts:
```typescript
const targetSocket = createSocketWithErrorHandler({
port: finalTargetPort,
host: finalTargetHost,
timeout: this.settings.connectionTimeout || 30000, // Connection timeout
onError: (error) => { /* existing */ },
onConnect: async () => { /* existing */ }
});
```
### Investigation Results (January 2025)
Based on extensive testing with debug scripts:
1. **Normal Operation**: In controlled tests, connections are properly cleaned up:
- Immediate routing cleanup handler properly destroys outgoing connections
- Both outer and inner proxies maintain 0 connections after clients disconnect
- Keep-alive connections are tracked and cleaned up correctly
2. **Potential Edge Cases Not Covered by Tests**:
- **HTTP/2 Connections**: May have different lifecycle than HTTP/1.1
- **WebSocket Connections**: Long-lived upgrade connections might persist
- **Partial TLS Handshakes**: Connections that start TLS but don't complete
- **PROXY Protocol Parse Failures**: Malformed headers from untrusted sources
- **Connection Pool Reuse**: HttpProxy component may maintain its own pools
3. **Timing-Sensitive Scenarios**:
- Client disconnects exactly when `record.outgoing` is being assigned
- Backend connects but immediately RSTs
- Proxy chain where middle proxy restarts
- Multiple rapid reconnects with same source IP/port
4. **Configuration-Specific Issues**:
- Mixed `sendProxyProtocol` settings in chain
- Different `keepAlive` settings between proxies
- Mismatched timeout values
- Routes with `forwardingEngine: 'nftables'`
### Additional Debug Points
Add these debug logs to identify the specific scenario:
```typescript
// In route-connection-handler.ts setupDirectConnection
logger.log('debug', `Setting outgoing socket for ${connectionId}`, {
timestamp: Date.now(),
hasOutgoing: !!record.outgoing,
socketState: targetSocket.readyState
});
// In connection-manager.ts cleanupConnection
logger.log('debug', `Cleanup attempt for ${record.id}`, {
alreadyClosed: record.connectionClosed,
hasIncoming: !!record.incoming,
hasOutgoing: !!record.outgoing,
incomingDestroyed: record.incoming?.destroyed,
outgoingDestroyed: record.outgoing?.destroyed
});
```
### Workarounds
These workarounds can limit accumulation until the fixes described in this document are deployed:
1. **Periodic Force Cleanup**:
```typescript
setInterval(() => {
  const connections = connectionManager.getConnections();
  for (const [id, record] of connections) {
    if (record.incoming?.destroyed && !record.connectionClosed) {
      connectionManager.cleanupConnection(record, 'force_cleanup');
    }
  }
}, 60000); // Every minute
```
2. **Connection Age Limit**:
```typescript
// Add max connection age check
const maxAge = 3600000; // 1 hour
if (Date.now() - record.incomingStartTime > maxAge) {
connectionManager.cleanupConnection(record, 'max_age');
}
```
3. **Aggressive Timeout Settings**:
```typescript
{
socketTimeout: 60000, // 1 minute
inactivityTimeout: 300000, // 5 minutes
connectionCleanupInterval: 30000 // 30 seconds
}
```
### Related Files
- `/ts/proxies/smart-proxy/route-connection-handler.ts` - Main connection handling
- `/ts/proxies/smart-proxy/connection-manager.ts` - Connection tracking and cleanup
- `/ts/core/utils/socket-utils.ts` - Socket cleanup utilities
- `/test/test.proxy-chain-cleanup.node.ts` - Test for connection cleanup
- `/test/test.proxy-chaining-accumulation.node.ts` - Test for accumulation prevention
- `/.nogit/debug/connection-accumulation-debug.ts` - Debug script for connection states
- `/.nogit/debug/connection-accumulation-keepalive.ts` - Keep-alive specific tests
- `/.nogit/debug/connection-accumulation-http.ts` - HTTP traffic through proxy chains
### Summary
**Issue Identified**: Connection accumulation occurs on the **inner proxy** (not outer) when backends are unreachable.
**Root Cause**: The `createSocketWithErrorHandler` function in socket-utils.ts doesn't implement connection establishment timeout. It only sets `socket.setTimeout()` which handles inactivity AFTER connection is established, not during the connect phase.
**Impact**: When connecting to unreachable IPs (e.g., 10.255.255.1), outgoing sockets remain in "opening" state indefinitely, causing connections to accumulate.
**Fix Required**:
1. Add `connectionTimeout` setting to ISmartProxyOptions
2. Implement proper connection timeout in `createSocketWithErrorHandler`
3. Pass the timeout value from route-connection-handler
**Workaround Until Fixed**: Configure shorter socket timeouts and use the periodic force cleanup suggested above.
The connection cleanup mechanisms have been significantly improved in v19.5.20:
1. Race condition fixed by setting `record.outgoing` before connecting
2. Immediate routing cleanup handler always destroys outgoing connections
3. Tests confirm no accumulation in standard scenarios with reachable backends
However, the missing connection establishment timeout causes accumulation when backends are unreachable or very slow to connect.
### Outer Proxy Sudden Accumulation After Hours
**User Report**: "The counter goes up suddenly after some hours on the outer proxy"
**Investigation Findings**:
1. **Cleanup Queue Mechanism**:
- Connections are cleaned up in batches of 100 via a queue
- If the cleanup timer gets stuck or cleared without restart, connections accumulate
- The timer is set with `setTimeout` and could be affected by event loop blocking
2. **Potential Causes for Sudden Spikes**:
a) **Cleanup Timer Failure**:
```typescript
// In ConnectionManager, if this timer gets cleared but not restarted:
this.cleanupTimer = this.setTimeout(() => {
this.processCleanupQueue();
}, 100);
```
b) **Memory Pressure**:
- After hours of operation, memory fragmentation or pressure could cause delays
- Garbage collection pauses might interfere with timer execution
c) **Event Listener Accumulation**:
- Socket event listeners might accumulate over time
- Server 'connection' event handlers are particularly important
d) **Keep-Alive Connection Cascades**:
- When many keep-alive connections time out simultaneously
- Outer proxy has a different timeout than the inner proxy
- Mass disconnection events can overwhelm cleanup queue
e) **HttpProxy Component Issues**:
- If using `useHttpProxy`, the HttpProxy bridge might maintain connection pools
- These pools might not be properly cleaned after hours
3. **Why "Sudden" After Hours**:
- Not a gradual leak but triggered by specific conditions
- Likely related to periodic events or thresholds:
- Inactivity check runs every 30 seconds
- Keep-alive connections have extended timeouts (6x normal)
- Parity check has 30-minute timeout for half-closed connections
4. **Reproduction Scenarios**:
- Mass client disconnection/reconnection (network blip)
- Keep-alive timeout cascade when inner proxy times out first
- Cleanup timer getting stuck during high load
- Memory pressure causing event loop delays
### Additional Monitoring Recommendations
1. **Add Cleanup Queue Monitoring**:
```typescript
setInterval(() => {
  const cm = proxy.connectionManager;
  if (cm.cleanupQueue.size > 100 && !cm.cleanupTimer) {
    logger.error('Cleanup queue stuck!', {
      queueSize: cm.cleanupQueue.size,
      hasTimer: !!cm.cleanupTimer
    });
  }
}, 60000);
```
2. **Track Timer Health**:
- Monitor if cleanup timer is running
- Check for event loop blocking
- Log when batch processing takes too long
3. **Memory Monitoring**:
- Track heap usage over time
- Monitor for memory leaks in long-running processes
- Force periodic garbage collection if needed
### Immediate Mitigations
1. **Restart Cleanup Timer**:
```typescript
// Emergency cleanup timer restart
if (!cm.cleanupTimer && cm.cleanupQueue.size > 0) {
cm.cleanupTimer = setTimeout(() => {
cm.processCleanupQueue();
}, 100);
}
```
2. **Force Periodic Cleanup**:
```typescript
setInterval(() => {
  const cm = connectionManager;
  // `threshold` is a deployment-specific connection count and must be defined before this runs
  if (cm.getConnectionCount() > threshold) {
    cm.performOptimizedInactivityCheck();
    // Force process cleanup queue
    cm.processCleanupQueue();
  }
}, 300000); // Every 5 minutes
```
3. **Connection Age Limits**:
- Set maximum connection lifetime
- Force close connections older than threshold
- More aggressive cleanup for proxy chains
## ✅ FIXED: Zombie Connection Detection (January 2025)
### Root Cause Identified
"Zombie connections" occur when sockets are destroyed without triggering their close/error event handlers. This causes connections to remain tracked with both sockets destroyed but `connectionClosed=false`. This is particularly problematic in proxy chains where the inner proxy might close connections in ways that don't trigger proper events on the outer proxy.
### Fix Implemented
Added zombie detection to the periodic inactivity check in ConnectionManager:
```typescript
// In performOptimizedInactivityCheck()
// Check ALL connections for zombie state
for (const [connectionId, record] of this.connectionRecords) {
  if (!record.connectionClosed) {
    const incomingDestroyed = record.incoming?.destroyed || false;
    const outgoingDestroyed = record.outgoing?.destroyed || false;
    // Check for zombie connections: both sockets destroyed but not cleaned up
    if (incomingDestroyed && outgoingDestroyed) {
      logger.log('warn', `Zombie connection detected: ${connectionId} - both sockets destroyed but not cleaned up`, {
        connectionId,
        remoteIP: record.remoteIP,
        age: plugins.prettyMs(now - record.incomingStartTime),
        component: 'connection-manager'
      });
      // Clean up immediately
      this.cleanupConnection(record, 'zombie_cleanup');
      continue;
    }
    // Check for half-zombie: one socket destroyed
    if (incomingDestroyed || outgoingDestroyed) {
      const age = now - record.incomingStartTime;
      // Give it 30 seconds grace period for normal cleanup
      if (age > 30000) {
        logger.log('warn', `Half-zombie connection detected: ${connectionId} - ${incomingDestroyed ? 'incoming' : 'outgoing'} destroyed`, {
          connectionId,
          remoteIP: record.remoteIP,
          age: plugins.prettyMs(age),
          incomingDestroyed,
          outgoingDestroyed,
          component: 'connection-manager'
        });
        // Clean up
        this.cleanupConnection(record, 'half_zombie_cleanup');
      }
    }
  }
}
```
### How It Works
1. **Full Zombie Detection**: Detects when both incoming and outgoing sockets are destroyed but the connection hasn't been cleaned up
2. **Half-Zombie Detection**: Detects when only one socket is destroyed and cleans the connection up once it is more than 30 seconds old, which leaves a grace period for normal cleanup to occur
3. **Automatic Cleanup**: Immediately cleans up zombie connections when detected
4. **Runs Periodically**: Integrated into the existing inactivity check that runs every 30 seconds
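
The decision logic above can be isolated into a pure function, which makes it easy to unit test without real sockets. This is a sketch that mirrors the rules of the fix shown earlier; `classifyRecord`, `RecordLike`, and the status labels are hypothetical names, and only the fields `connectionClosed`, `incomingStartTime`, `incoming`, and `outgoing` are taken from the code in this document.

```typescript
// Minimal structural types so the logic can run against plain objects.
interface SocketState {
  destroyed: boolean;
}

interface RecordLike {
  connectionClosed: boolean;
  incomingStartTime: number;
  incoming?: SocketState | null;
  outgoing?: SocketState | null;
}

type ZombieStatus = 'closed' | 'healthy' | 'zombie' | 'half-zombie-grace' | 'half-zombie';

// Mirrors the zombie rules: both sockets destroyed means immediate cleanup,
// one socket destroyed means cleanup once the connection is older than graceMs.
function classifyRecord(record: RecordLike, now: number, graceMs: number = 30000): ZombieStatus {
  if (record.connectionClosed) return 'closed';
  // A missing socket counts as "not destroyed", as in the fix above.
  const incomingDestroyed = record.incoming?.destroyed ?? false;
  const outgoingDestroyed = record.outgoing?.destroyed ?? false;
  if (incomingDestroyed && outgoingDestroyed) return 'zombie';
  if (incomingDestroyed || outgoingDestroyed) {
    return now - record.incomingStartTime > graceMs ? 'half-zombie' : 'half-zombie-grace';
  }
  return 'healthy';
}
```

Note that under these rules a record whose incoming socket is destroyed and whose outgoing socket was never assigned is treated as a half-zombie rather than a full zombie, so it is only cleaned up after the grace period.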
### Why This Fixes the Outer Proxy Accumulation
- When inner proxy closes connections abruptly (e.g., due to backend failure), the outer proxy's outgoing socket might be destroyed without firing close/error events
- These become zombie connections that previously accumulated indefinitely
- Full zombies are now detected and cleaned up by the next inactivity check, which runs every 30 seconds
### Test Results
Debug scripts confirmed:
- Zombie connections can be created when sockets are destroyed directly without events
- The zombie detection successfully identifies and cleans up these connections
- Both full zombies (both sockets destroyed) and half-zombies (one socket destroyed) are handled
This fix addresses the user's requirement that connections closed on the inner proxy are always closed on the outer proxy as well.

@@ -856,4 +856,42 @@ The WrappedSocket class has been implemented as the foundation for PROXY protoco
For detailed information about proxy protocol implementation and proxy chaining:
- **[Proxy Protocol Guide](./readme.proxy-protocol.md)** - Complete implementation details and configuration
- **[Proxy Protocol Examples](./readme.proxy-protocol-example.md)** - Code examples and conceptual implementation
- **[Proxy Chain Summary](./readme.proxy-chain-summary.md)** - Quick reference for proxy chaining setup
## Connection Cleanup Edge Cases Investigation (v19.5.20+)
### Issue Discovered
"Zombie connections" can occur when both sockets are destroyed but the connection record hasn't been cleaned up. This happens when sockets are destroyed without triggering their close/error event handlers.
### Root Cause
1. **Event Handler Bypass**: In edge cases (network failures, proxy chain failures, forced socket destruction), sockets can be destroyed without their event handlers being called
2. **Cleanup Queue Delay**: The `initiateCleanupOnce` method adds connections to a cleanup queue (batches of 100 every 100ms), which may not drain fast enough under load
3. **Inactivity Check Limitation**: The periodic inactivity check only examines `lastActivity` timestamps, not actual socket states
### Test Results
Debug script (`connection-manager-direct-test.ts`) revealed:
- **Normal cleanup works**: When socket events fire normally, cleanup is reliable
- **Zombies ARE created**: Direct socket destruction creates zombies (destroyed sockets, connectionClosed=false)
- **Manual cleanup works**: Calling `initiateCleanupOnce` on a zombie does clean it up
- **Inactivity check misses zombies**: The check doesn't detect connections with destroyed sockets
### Potential Solutions
1. **Periodic Zombie Detection**: Add zombie detection to the inactivity check:
```typescript
// In performOptimizedInactivityCheck
if (record.incoming?.destroyed && record.outgoing?.destroyed && !record.connectionClosed) {
this.cleanupConnection(record, 'zombie_detected');
}
```
2. **Socket State Monitoring**: Check socket states during connection operations
3. **Defensive Socket Handling**: Always attach cleanup handlers before any operation that might destroy sockets
4. **Immediate Cleanup Option**: For critical paths, use `cleanupConnection` instead of `initiateCleanupOnce`
### Impact
- Memory leaks in edge cases (network failures, proxy chain issues)
- Connection count inaccuracy
- Potential resource exhaustion over time
### Test Files
- `.nogit/debug/connection-manager-direct-test.ts` - Direct ConnectionManager testing showing zombie creation

@@ -0,0 +1,182 @@
import { expect, tap } from '@git.zone/tstest/tapbundle';
import * as plugins from '../ts/plugins.js';
import { SmartProxy } from '../ts/index.js';
let outerProxy: SmartProxy;
let innerProxy: SmartProxy;
tap.test('setup two smartproxies in a chain configuration', async () => {
// Setup inner proxy (backend proxy)
innerProxy = new SmartProxy({
routes: [
{
match: {
ports: 8002
},
action: {
type: 'forward',
target: {
host: 'httpbin.org',
port: 443
}
}
}
],
defaults: {
target: {
host: 'httpbin.org',
port: 443
}
},
acceptProxyProtocol: true,
sendProxyProtocol: false,
enableDetailedLogging: true,
connectionCleanupInterval: 5000, // More frequent cleanup for testing
inactivityTimeout: 10000 // Shorter timeout for testing
});
await innerProxy.start();
// Setup outer proxy (frontend proxy)
outerProxy = new SmartProxy({
routes: [
{
match: {
ports: 8001
},
action: {
type: 'forward',
target: {
host: 'localhost',
port: 8002
},
sendProxyProtocol: true
}
}
],
defaults: {
target: {
host: 'localhost',
port: 8002
}
},
sendProxyProtocol: true,
enableDetailedLogging: true,
connectionCleanupInterval: 5000, // More frequent cleanup for testing
inactivityTimeout: 10000 // Shorter timeout for testing
});
await outerProxy.start();
});
tap.test('should properly cleanup connections in proxy chain', async (tools) => {
const testDuration = 30000; // 30 seconds
const connectionInterval = 500; // Create new connection every 500ms
const connectionDuration = 2000; // Each connection lasts 2 seconds
let connectionsCreated = 0;
let connectionsCompleted = 0;
// Function to create a test connection
const createTestConnection = async () => {
connectionsCreated++;
const connectionId = connectionsCreated;
try {
const socket = plugins.net.connect({
port: 8001,
host: 'localhost'
});
await new Promise<void>((resolve, reject) => {
socket.on('connect', () => {
console.log(`Connection ${connectionId} established`);
// Send TLS Client Hello for httpbin.org
const clientHello = Buffer.from([
0x16, 0x03, 0x01, 0x00, 0xc8, // TLS handshake header
0x01, 0x00, 0x00, 0xc4, // Client Hello
0x03, 0x03, // TLS 1.2
...Array(32).fill(0), // Random bytes
0x00, // Session ID length
0x00, 0x02, 0x13, 0x01, // Cipher suites
0x01, 0x00, // Compression methods
0x00, 0x97, // Extensions length
0x00, 0x00, 0x00, 0x0f, 0x00, 0x0d, // SNI extension
0x00, 0x00, 0x0a, 0x68, 0x74, 0x74, 0x70, 0x62, 0x69, 0x6e, 0x2e, 0x6f, 0x72, 0x67 // "httpbin.org"
]);
socket.write(clientHello);
// Keep connection alive for specified duration
setTimeout(() => {
socket.destroy();
connectionsCompleted++;
console.log(`Connection ${connectionId} closed (completed: ${connectionsCompleted}/${connectionsCreated})`);
resolve();
}, connectionDuration);
});
socket.on('error', (err) => {
console.log(`Connection ${connectionId} error: ${err.message}`);
connectionsCompleted++;
reject(err);
});
});
} catch (err) {
console.log(`Failed to create connection ${connectionId}: ${err.message}`);
connectionsCompleted++;
}
};
// Start creating connections
const startTime = Date.now();
const connectionTimer = setInterval(() => {
if (Date.now() - startTime < testDuration) {
createTestConnection().catch(() => {});
} else {
clearInterval(connectionTimer);
}
}, connectionInterval);
// Monitor connection counts
const monitorInterval = setInterval(() => {
const outerConnections = (outerProxy as any).connectionManager.getConnectionCount();
const innerConnections = (innerProxy as any).connectionManager.getConnectionCount();
console.log(`Active connections - Outer: ${outerConnections}, Inner: ${innerConnections}, Created: ${connectionsCreated}, Completed: ${connectionsCompleted}`);
}, 2000);
// Wait for test duration + cleanup time
await tools.delayFor(testDuration + 10000);
clearInterval(connectionTimer);
clearInterval(monitorInterval);
// Wait for all connections to complete
while (connectionsCompleted < connectionsCreated) {
await tools.delayFor(100);
}
// Give some time for cleanup
await tools.delayFor(5000);
// Check final connection counts
const finalOuterConnections = (outerProxy as any).connectionManager.getConnectionCount();
const finalInnerConnections = (innerProxy as any).connectionManager.getConnectionCount();
console.log(`\nFinal connection counts:`);
console.log(`Outer proxy: ${finalOuterConnections}`);
console.log(`Inner proxy: ${finalInnerConnections}`);
console.log(`Total created: ${connectionsCreated}`);
console.log(`Total completed: ${connectionsCompleted}`);
// Both proxies should have cleaned up all connections
expect(finalOuterConnections).toEqual(0);
expect(finalInnerConnections).toEqual(0);
});
tap.test('cleanup proxies', async () => {
await outerProxy.stop();
await innerProxy.stop();
});
export default tap.start();

@@ -0,0 +1,306 @@
import { tap, expect } from '@git.zone/tstest/tapbundle';
import * as net from 'net';
import * as plugins from '../ts/plugins.js';
// Import SmartProxy
import { SmartProxy } from '../ts/index.js';
// Import types through type-only imports
import type { ConnectionManager } from '../ts/proxies/smart-proxy/connection-manager.js';
import type { IConnectionRecord } from '../ts/proxies/smart-proxy/models/interfaces.js';
tap.test('zombie connection cleanup - verify inactivity check detects and cleans destroyed sockets', async () => {
console.log('\n=== Zombie Connection Cleanup Test ===');
console.log('Purpose: Verify that connections with destroyed sockets are detected and cleaned up');
console.log('Setup: Client → OuterProxy (8590) → InnerProxy (8591) → Backend (9998)');
// Create backend server that can be controlled
let acceptConnections = true;
let destroyImmediately = false;
const backendConnections: net.Socket[] = [];
const backend = net.createServer((socket) => {
console.log('Backend: Connection received');
backendConnections.push(socket);
if (destroyImmediately) {
console.log('Backend: Destroying connection immediately');
socket.destroy();
} else {
socket.on('data', (data) => {
console.log('Backend: Received data, echoing back');
socket.write(data);
});
}
});
await new Promise<void>((resolve) => {
backend.listen(9998, () => {
console.log('✓ Backend server started on port 9998');
resolve();
});
});
// Create InnerProxy with faster inactivity check for testing
const innerProxy = new SmartProxy({
ports: [8591],
enableDetailedLogging: true,
inactivityTimeout: 5000, // 5 seconds for faster testing
inactivityCheckInterval: 1000, // Check every second
routes: [{
name: 'to-backend',
match: { ports: 8591 },
action: {
type: 'forward',
target: {
host: 'localhost',
port: 9998
}
}
}]
});
// Create OuterProxy with faster inactivity check
const outerProxy = new SmartProxy({
ports: [8590],
enableDetailedLogging: true,
inactivityTimeout: 5000, // 5 seconds for faster testing
inactivityCheckInterval: 1000, // Check every second
routes: [{
name: 'to-inner',
match: { ports: 8590 },
action: {
type: 'forward',
target: {
host: 'localhost',
port: 8591
}
}
}]
});
await innerProxy.start();
console.log('✓ InnerProxy started on port 8591');
await outerProxy.start();
console.log('✓ OuterProxy started on port 8590');
// Helper to get connection details
const getConnectionDetails = () => {
const outerConnMgr = (outerProxy as any).connectionManager as ConnectionManager;
const innerConnMgr = (innerProxy as any).connectionManager as ConnectionManager;
const outerRecords = Array.from((outerConnMgr as any).connectionRecords.values()) as IConnectionRecord[];
const innerRecords = Array.from((innerConnMgr as any).connectionRecords.values()) as IConnectionRecord[];
return {
outer: {
count: outerConnMgr.getConnectionCount(),
records: outerRecords,
zombies: outerRecords.filter(r =>
!r.connectionClosed &&
r.incoming?.destroyed &&
(r.outgoing?.destroyed ?? true)
),
halfZombies: outerRecords.filter(r =>
!r.connectionClosed &&
(r.incoming?.destroyed || r.outgoing?.destroyed) &&
!(r.incoming?.destroyed && (r.outgoing?.destroyed ?? true))
)
},
inner: {
count: innerConnMgr.getConnectionCount(),
records: innerRecords,
zombies: innerRecords.filter(r =>
!r.connectionClosed &&
r.incoming?.destroyed &&
(r.outgoing?.destroyed ?? true)
),
halfZombies: innerRecords.filter(r =>
!r.connectionClosed &&
(r.incoming?.destroyed || r.outgoing?.destroyed) &&
!(r.incoming?.destroyed && (r.outgoing?.destroyed ?? true))
)
}
};
};
console.log('\n--- Test 1: Create zombie by destroying sockets without events ---');
// Create a connection and forcefully destroy sockets to create zombies
const client1 = new net.Socket();
await new Promise<void>((resolve) => {
client1.connect(8590, 'localhost', () => {
console.log('Client1 connected to OuterProxy');
client1.write('GET / HTTP/1.1\r\nHost: test.com\r\n\r\n');
// Wait for connection to be established through the chain
setTimeout(() => {
console.log('Forcefully destroying backend connections to create zombies');
// Get connection details before destruction
const beforeDetails = getConnectionDetails();
console.log(`Before destruction: Outer=${beforeDetails.outer.count}, Inner=${beforeDetails.inner.count}`);
// Destroy all backend connections without proper close events
backendConnections.forEach(conn => {
if (!conn.destroyed) {
// Remove all listeners to prevent proper cleanup
conn.removeAllListeners();
conn.destroy();
}
});
// Also destroy the client socket abruptly
client1.removeAllListeners();
client1.destroy();
resolve();
}, 500);
});
});
// Check immediately after destruction
await new Promise(resolve => setTimeout(resolve, 100));
let details = getConnectionDetails();
console.log(`\nAfter destruction:`);
console.log(` Outer: ${details.outer.count} connections, ${details.outer.zombies.length} zombies, ${details.outer.halfZombies.length} half-zombies`);
console.log(` Inner: ${details.inner.count} connections, ${details.inner.zombies.length} zombies, ${details.inner.halfZombies.length} half-zombies`);
// Wait for inactivity check to run (should detect zombies)
console.log('\nWaiting for inactivity check to detect zombies...');
await new Promise(resolve => setTimeout(resolve, 2000));
details = getConnectionDetails();
console.log(`\nAfter first inactivity check:`);
console.log(` Outer: ${details.outer.count} connections, ${details.outer.zombies.length} zombies, ${details.outer.halfZombies.length} half-zombies`);
console.log(` Inner: ${details.inner.count} connections, ${details.inner.zombies.length} zombies, ${details.inner.halfZombies.length} half-zombies`);
console.log('\n--- Test 2: Create half-zombie by destroying only one socket ---');
// Clear backend connections array
backendConnections.length = 0;
const client2 = new net.Socket();
await new Promise<void>((resolve) => {
client2.connect(8590, 'localhost', () => {
console.log('Client2 connected to OuterProxy');
client2.write('GET / HTTP/1.1\r\nHost: test.com\r\n\r\n');
setTimeout(() => {
console.log('Creating half-zombie by destroying only outgoing socket on outer proxy');
// Access the connection records directly
const outerConnMgr = (outerProxy as any).connectionManager as ConnectionManager;
const outerRecords = Array.from((outerConnMgr as any).connectionRecords.values()) as IConnectionRecord[];
// Find the active connection and destroy only its outgoing socket
const activeRecord = outerRecords.find(r => !r.connectionClosed && r.outgoing && !r.outgoing.destroyed);
if (activeRecord && activeRecord.outgoing) {
console.log('Found active connection, destroying outgoing socket');
activeRecord.outgoing.removeAllListeners();
activeRecord.outgoing.destroy();
}
resolve();
}, 500);
});
});
// Check half-zombie state
await new Promise(resolve => setTimeout(resolve, 100));
details = getConnectionDetails();
console.log(`\nAfter creating half-zombie:`);
console.log(` Outer: ${details.outer.count} connections, ${details.outer.zombies.length} zombies, ${details.outer.halfZombies.length} half-zombies`);
console.log(` Inner: ${details.inner.count} connections, ${details.inner.zombies.length} zombies, ${details.inner.halfZombies.length} half-zombies`);
// Simulate the 30-second half-zombie grace period by back-dating the connection start time
console.log('\nWaiting for half-zombie grace period (30 seconds simulated)...');
// Manually age the connection to trigger half-zombie cleanup
const outerConnMgr = (outerProxy as any).connectionManager as ConnectionManager;
const records = Array.from((outerConnMgr as any).connectionRecords.values()) as IConnectionRecord[];
records.forEach(record => {
if (!record.connectionClosed) {
// Age the connection by 35 seconds
record.incomingStartTime -= 35000;
}
});
// Trigger inactivity check
await new Promise(resolve => setTimeout(resolve, 2000));
details = getConnectionDetails();
console.log(`\nAfter half-zombie cleanup:`);
console.log(` Outer: ${details.outer.count} connections, ${details.outer.zombies.length} zombies, ${details.outer.halfZombies.length} half-zombies`);
console.log(` Inner: ${details.inner.count} connections, ${details.inner.zombies.length} zombies, ${details.inner.halfZombies.length} half-zombies`);
// Clean up client2 properly
if (!client2.destroyed) {
client2.destroy();
}
console.log('\n--- Test 3: Rapid zombie creation under load ---');
// Create multiple connections rapidly and destroy them
const rapidClients: net.Socket[] = [];
for (let i = 0; i < 5; i++) {
const client = new net.Socket();
rapidClients.push(client);
client.connect(8590, 'localhost', () => {
console.log(`Rapid client ${i} connected`);
client.write('GET / HTTP/1.1\r\nHost: test.com\r\n\r\n');
// Destroy after random delay
setTimeout(() => {
client.removeAllListeners();
client.destroy();
}, Math.random() * 500);
});
// Small delay between connections
await new Promise(resolve => setTimeout(resolve, 50));
}
// Wait a bit
await new Promise(resolve => setTimeout(resolve, 1000));
details = getConnectionDetails();
console.log(`\nAfter rapid connections:`);
console.log(` Outer: ${details.outer.count} connections, ${details.outer.zombies.length} zombies, ${details.outer.halfZombies.length} half-zombies`);
console.log(` Inner: ${details.inner.count} connections, ${details.inner.zombies.length} zombies, ${details.inner.halfZombies.length} half-zombies`);
// Wait for cleanup
console.log('\nWaiting for final cleanup...');
await new Promise(resolve => setTimeout(resolve, 3000));
details = getConnectionDetails();
console.log(`\nFinal state:`);
console.log(` Outer: ${details.outer.count} connections, ${details.outer.zombies.length} zombies, ${details.outer.halfZombies.length} half-zombies`);
console.log(` Inner: ${details.inner.count} connections, ${details.inner.zombies.length} zombies, ${details.inner.halfZombies.length} half-zombies`);
// Cleanup
await outerProxy.stop();
await innerProxy.stop();
backend.close();
// Verify all connections were cleaned up (uses the snapshot taken before the proxies were stopped)
console.log('\n--- Verification ---');
if (details.outer.count === 0 && details.inner.count === 0) {
console.log('✅ PASS: All zombie connections were cleaned up');
} else {
console.log('❌ FAIL: Some connections remain');
}
expect(details.outer.count).toEqual(0);
expect(details.inner.count).toEqual(0);
expect(details.outer.zombies.length).toEqual(0);
expect(details.inner.zombies.length).toEqual(0);
expect(details.outer.halfZombies.length).toEqual(0);
expect(details.inner.halfZombies.length).toEqual(0);
});
tap.start();


@ -258,22 +258,61 @@ export function createSocketWithErrorHandler(options: SafeSocketOptions): plugin
// Create socket with immediate error handler attachment
const socket = new plugins.net.Socket();
// Track if connected
let connected = false;
let connectionTimeout: NodeJS.Timeout | null = null;
// Attach error handler BEFORE connecting to catch immediate errors
socket.on('error', (error) => {
console.error(`Socket connection error to ${host}:${port}: ${error.message}`);
// Clear the connection timeout if it exists
if (connectionTimeout) {
clearTimeout(connectionTimeout);
connectionTimeout = null;
}
if (onError) {
onError(error);
}
});
// Attach connect handler if provided
if (onConnect) {
socket.on('connect', onConnect);
}
// Attach connect handler
const handleConnect = () => {
connected = true;
// Clear the connection timeout
if (connectionTimeout) {
clearTimeout(connectionTimeout);
connectionTimeout = null;
}
// Set inactivity timeout if provided (after connection is established)
if (timeout) {
socket.setTimeout(timeout);
}
if (onConnect) {
onConnect();
}
};
// Set timeout if provided
socket.on('connect', handleConnect);
// Implement connection establishment timeout
if (timeout) {
socket.setTimeout(timeout);
connectionTimeout = setTimeout(() => {
if (!connected && !socket.destroyed) {
// Connection timed out - destroy the socket
const error = new Error(`Connection timeout after ${timeout}ms to ${host}:${port}`);
(error as any).code = 'ETIMEDOUT';
console.error(`Socket connection timeout to ${host}:${port} after ${timeout}ms`);
// Destroy the socket
socket.destroy();
// Call error handler
if (onError) {
onError(error);
}
}
}, timeout);
}
// Now attempt to connect - any immediate errors will be caught
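
The hunk above separates two uses of the same `timeout` value: a deadline for establishing the connection, and an inactivity timeout applied only once `connect` has fired. The following is a minimal sketch of that pattern, not the SmartProxy implementation. `ConnectableSocket`, `ConnectDeadline`, and `armConnectDeadline` are hypothetical names introduced for illustration, and the interface is a structural stand-in for the few `net.Socket` members the sketch touches.

```typescript
// Hypothetical structural stand-in for the parts of net.Socket used here.
interface ConnectableSocket {
  destroyed: boolean;
  destroy(): void;
  once(event: 'connect', listener: () => void): unknown;
  setTimeout(ms: number): unknown;
}

// Hypothetical handle returned to the caller.
interface ConnectDeadline {
  isConnected(): boolean;
  cancel(): void; // intended to be called from the socket's 'error' handler
}

function armConnectDeadline(
  socket: ConnectableSocket,
  timeoutMs: number,
  onTimeout: (error: Error & { code?: string }) => void,
): ConnectDeadline {
  let connected = false;

  // Deadline for connection establishment only.
  let timer: ReturnType<typeof setTimeout> | null = setTimeout(() => {
    timer = null;
    if (connected || socket.destroyed) return;
    const error: Error & { code?: string } = new Error(
      `Connection timeout after ${timeoutMs}ms`,
    );
    error.code = 'ETIMEDOUT';
    socket.destroy();
    onTimeout(error);
  }, timeoutMs);

  const cancel = (): void => {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
  };

  socket.once('connect', () => {
    connected = true;
    cancel();
    // From here on the same value acts as an inactivity timeout.
    socket.setTimeout(timeoutMs);
  });

  return { isConnected: () => connected, cancel };
}
```

Clearing the timer in both the connect path and the error path matters: a timer left armed after an early error would call the error callback a second time when it fires.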


@ -456,6 +456,48 @@ export class ConnectionManager extends LifecycleComponent {
}
}
// Also check ALL connections for zombie state (destroyed sockets but not cleaned up)
// This is critical for proxy chains where sockets can be destroyed without events
for (const [connectionId, record] of this.connectionRecords) {
if (!record.connectionClosed) {
const incomingDestroyed = record.incoming?.destroyed || false;
const outgoingDestroyed = record.outgoing?.destroyed || false;
// Check for zombie connections: both sockets destroyed but connection not cleaned up
if (incomingDestroyed && outgoingDestroyed) {
logger.log('warn', `Zombie connection detected: ${connectionId} - both sockets destroyed but not cleaned up`, {
connectionId,
remoteIP: record.remoteIP,
age: plugins.prettyMs(now - record.incomingStartTime),
component: 'connection-manager'
});
// Clean up immediately
this.cleanupConnection(record, 'zombie_cleanup');
continue;
}
// Check for half-zombie: one socket destroyed
if (incomingDestroyed || outgoingDestroyed) {
const age = now - record.incomingStartTime;
// Give it 30 seconds grace period for normal cleanup
if (age > 30000) {
logger.log('warn', `Half-zombie connection detected: ${connectionId} - ${incomingDestroyed ? 'incoming' : 'outgoing'} destroyed`, {
connectionId,
remoteIP: record.remoteIP,
age: plugins.prettyMs(age),
incomingDestroyed,
outgoingDestroyed,
component: 'connection-manager'
});
// Clean up
this.cleanupConnection(record, 'half_zombie_cleanup');
}
}
}
}
// Process only connections that need checking
for (const connectionId of connectionsToCheck) {
const record = this.connectionRecords.get(connectionId);
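
The zombie check in this hunk reduces to a small decision rule over each record's two sockets and its age. The sketch below restates that rule as a pure function so it can be reasoned about in isolation. It is an illustration under stated assumptions, not the `ConnectionManager` code: `SocketLike`, `RecordLike`, `ZombieState`, and `classifyRecord` are hypothetical names, and only the fields used by the hunk are modelled.

```typescript
// Hypothetical minimal shapes; the real IConnectionRecord has more fields.
interface SocketLike {
  destroyed: boolean;
}

interface RecordLike {
  connectionClosed: boolean;
  incomingStartTime: number; // ms since epoch
  incoming?: SocketLike;
  outgoing?: SocketLike;
}

type ZombieState =
  | 'closed'              // record already cleaned up
  | 'healthy'             // neither socket destroyed
  | 'zombie'              // both sockets destroyed: clean up immediately
  | 'half-zombie-pending' // one socket destroyed, still inside the grace period
  | 'half-zombie-expired'; // one socket destroyed, grace period exceeded

const HALF_ZOMBIE_GRACE_MS = 30_000; // 30 seconds, as in the hunk above

function classifyRecord(
  record: RecordLike,
  now: number,
  graceMs: number = HALF_ZOMBIE_GRACE_MS,
): ZombieState {
  if (record.connectionClosed) return 'closed';

  // A missing socket counts as "not destroyed", mirroring `|| false` above.
  const incomingDestroyed = record.incoming?.destroyed || false;
  const outgoingDestroyed = record.outgoing?.destroyed || false;

  if (incomingDestroyed && outgoingDestroyed) return 'zombie';

  if (incomingDestroyed || outgoingDestroyed) {
    const age = now - record.incomingStartTime;
    return age > graceMs ? 'half-zombie-expired' : 'half-zombie-pending';
  }

  return 'healthy';
}
```

The test helper earlier in this diff treats a missing outgoing socket as destroyed (`?? true`), whereas the manager hunk treats it as not destroyed (`|| false`). A record whose incoming socket is destroyed and which has no outgoing socket is therefore a zombie under the test's definition but a half-zombie under this rule.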


@ -69,6 +69,7 @@ export interface ISmartProxyOptions {
maxVersion?: string;
// Timeout settings
connectionTimeout?: number; // Timeout for establishing connection to backend (ms), default: 30000 (30s)
initialDataTimeout?: number; // Timeout for initial data/SNI (ms), default: 60000 (60s)
socketTimeout?: number; // Socket inactivity timeout (ms), default: 3600000 (1h)
inactivityCheckInterval?: number; // How often to check for inactive connections (ms), default: 60000 (60s)


@ -199,25 +199,29 @@ export class RouteConnectionHandler {
setupSocketHandlers(
underlyingSocket,
(reason) => {
// Only cleanup if connection hasn't been fully established
// Check if outgoing connection exists and is connected
if (!record.outgoing || record.outgoing.readyState !== 'open') {
logger.log('debug', `Connection ${connectionId} closed during immediate routing: ${reason}`, {
// Always cleanup when incoming socket closes
// This prevents connection accumulation in proxy chains
logger.log('debug', `Connection ${connectionId} closed during immediate routing: ${reason}`, {
connectionId,
remoteIP: record.remoteIP,
reason,
hasOutgoing: !!record.outgoing,
outgoingState: record.outgoing?.readyState,
component: 'route-handler'
});
// If there's a pending or established outgoing connection, destroy it
if (record.outgoing && !record.outgoing.destroyed) {
logger.log('debug', `Destroying outgoing connection for ${connectionId}`, {
connectionId,
remoteIP: record.remoteIP,
reason,
hasOutgoing: !!record.outgoing,
outgoingState: record.outgoing?.readyState,
outgoingState: record.outgoing.readyState,
component: 'route-handler'
});
// If there's a pending outgoing connection, destroy it
if (record.outgoing && !record.outgoing.destroyed) {
record.outgoing.destroy();
}
this.connectionManager.cleanupConnection(record, reason);
record.outgoing.destroy();
}
// Always cleanup the connection record
this.connectionManager.cleanupConnection(record, reason);
},
undefined, // Use default timeout handler
'immediate-route-client'
@ -1121,6 +1125,7 @@ export class RouteConnectionHandler {
const targetSocket = createSocketWithErrorHandler({
port: finalTargetPort,
host: finalTargetHost,
timeout: this.settings.connectionTimeout || 30000, // Connection timeout (default: 30s)
onError: (error) => {
// Connection failed - clean up everything immediately
// Check if connection record is still valid (client might have disconnected)