Enhance connection cleanup and error handling in RouteConnectionHandler
- Implement immediate cleanup for connection failures to prevent leaks - Add NFTables cleanup on socket close to manage memory usage - Fix connection limit bypass by checking record after creation - Introduce tests for rapid connection retries and routing failures
This commit is contained in:
@ -464,4 +464,63 @@ The fix was applied in two places:
|
||||
1. **ForwardingHandler classes** (`https-passthrough-handler.ts`, etc.) - These are standalone forwarding utilities
|
||||
2. **SmartProxy route-connection-handler** (`route-connection-handler.ts`) - This is where the actual SmartProxy connection handling happens
|
||||
|
||||
The critical fix for SmartProxy was in `setupDirectConnection()` method in route-connection-handler.ts, which now uses `createSocketWithErrorHandler()` to properly handle connection failures and clean up connection records.
|
||||
The critical fix for SmartProxy was in `setupDirectConnection()` method in route-connection-handler.ts, which now uses `createSocketWithErrorHandler()` to properly handle connection failures and clean up connection records.
|
||||
|
||||
## Connection Cleanup Improvements (v19.5.12+)
|
||||
|
||||
### Issue
|
||||
Connections were still counting up during rapid retry scenarios, especially when routing failed or backend connections were refused. This was due to:
|
||||
1. **Delayed Cleanup**: Using `initiateCleanupOnce` queued cleanup operations (batch of 100 every 100ms) instead of immediate cleanup
|
||||
2. **NFTables Memory Leak**: NFTables connections were never cleaned up, staying in memory forever
|
||||
3. **Connection Limit Bypass**: When max connections reached, connection record check happened after creation
|
||||
|
||||
### Root Cause Analysis
|
||||
1. **Queued vs Immediate Cleanup**:
|
||||
- `initiateCleanupOnce()`: Adds to cleanup queue, processes up to 100 connections every 100ms
|
||||
- `cleanupConnection()`: Immediate synchronous cleanup
|
||||
- Under rapid retries, connections were created faster than the queue could process them
|
||||
|
||||
2. **NFTables Connections**:
|
||||
- Marked with `usingNetworkProxy = true` but never cleaned up
|
||||
- Connection records stayed in memory indefinitely
|
||||
|
||||
3. **Error Path Cleanup**:
|
||||
- Many error paths used `socket.end()` (async) followed by cleanup
|
||||
- Created timing windows where connections weren't fully cleaned
|
||||
|
||||
### Solution
|
||||
1. **Immediate Cleanup**: Changed all error paths from `initiateCleanupOnce()` to `cleanupConnection()` for immediate cleanup
|
||||
2. **NFTables Cleanup**: Added socket close listener to clean up connection records when NFTables connections close
|
||||
3. **Connection Limit Fix**: Added null check after `createConnection()` to handle rejection properly
|
||||
|
||||
### Changes Made in route-connection-handler.ts
|
||||
```typescript
|
||||
// 1. NFTables cleanup (line 551-553)
|
||||
socket.once('close', () => {
|
||||
this.connectionManager.cleanupConnection(record, 'nftables_closed');
|
||||
});
|
||||
|
||||
// 2. Connection limit check (line 93-96)
|
||||
const record = this.connectionManager.createConnection(socket);
|
||||
if (!record) {
|
||||
// Connection was rejected due to limit - socket already destroyed
|
||||
return;
|
||||
}
|
||||
|
||||
// 3. Changed all error paths to use immediate cleanup
|
||||
// Before: this.connectionManager.initiateCleanupOnce(record, reason)
|
||||
// After: this.connectionManager.cleanupConnection(record, reason)
|
||||
```
|
||||
|
||||
### Test Coverage
|
||||
- `test/test.rapid-retry-cleanup.node.ts` - Verifies connection cleanup under rapid retry scenarios
|
||||
- Test shows connection count stays at 0 even with 20 rapid retries with 50ms intervals
|
||||
- Confirms both ECONNREFUSED and routing failure scenarios are handled correctly
|
||||
|
||||
### Performance Impact
|
||||
- **Positive**: No more connection accumulation under load
|
||||
- **Positive**: Immediate cleanup reduces memory usage
|
||||
- **Consideration**: More frequent cleanup operations, but prevents queue backlog
|
||||
|
||||
### Migration Notes
|
||||
No configuration changes needed. The improvements are automatic and backward compatible.
|
Reference in New Issue
Block a user