585 lines
23 KiB
Markdown
585 lines
23 KiB
Markdown
# SmartProxy Project Hints
|
||
|
||
## Project Overview
|
||
- Package: `@push.rocks/smartproxy` – high-performance proxy supporting HTTP(S), TCP, WebSocket, and ACME integration.
|
||
- Written in TypeScript, compiled output in `dist_ts/`, uses ESM with NodeNext resolution.
|
||
|
||
## Important: ACME Configuration in v19.0.0
|
||
- **Breaking Change**: ACME configuration must be placed within individual route TLS settings, not at the top level
|
||
- Route-level ACME config is the ONLY way to enable SmartAcme initialization
|
||
- SmartCertManager requires email in route config for certificate acquisition
|
||
- Top-level ACME configuration is ignored in v19.0.0
|
||
|
||
## Repository Structure
|
||
- `ts/` – TypeScript source files:
|
||
- `index.ts` exports main modules.
|
||
- `plugins.ts` centralizes native and third-party imports.
|
||
- Subdirectories: `networkproxy/`, `nftablesproxy/`, `port80handler/`, `redirect/`, `smartproxy/`.
|
||
- Key classes: `ProxyRouter` (`classes.router.ts`), `SmartProxy` (`classes.smartproxy.ts`), plus handlers/managers.
|
||
- `dist_ts/` – transpiled `.js` and `.d.ts` files mirroring `ts/` structure.
|
||
- `test/` – test suites in TypeScript:
|
||
- `test.router.ts` – routing logic (hostname matching, wildcards, path parameters, config management).
|
||
- `test.smartproxy.ts` – proxy behavior tests (TCP forwarding, SNI handling, concurrency, chaining, timeouts).
|
||
- `test/helpers/` – utilities (e.g., certificates).
|
||
- `assets/certs/` – placeholder certificates for ACME and TLS.
|
||
|
||
## Development Setup
|
||
- Requires `pnpm` (v10+).
|
||
- Install dependencies: `pnpm install`.
|
||
- Build: `pnpm build` (runs `tsbuild --web --allowimplicitany`).
|
||
- Test: `pnpm test` (runs `tstest test/`).
|
||
- Format: `pnpm format` (runs `gitzone format`).
|
||
|
||
## How to Test
|
||
|
||
### Test Structure
|
||
Tests use tapbundle from `@git.zone/tstest`. The correct pattern is:
|
||
|
||
```typescript
|
||
import { tap, expect } from '@git.zone/tstest/tapbundle';
|
||
|
||
tap.test('test description', async () => {
|
||
// Test logic here
|
||
expect(someValue).toEqual(expectedValue);
|
||
});
|
||
|
||
// IMPORTANT: Must end with tap.start()
|
||
tap.start();
|
||
```
|
||
|
||
### Expect Syntax (from @push.rocks/smartexpect)
|
||
```typescript
|
||
// Type assertions
|
||
expect('hello').toBeTypeofString();
|
||
expect(42).toBeTypeofNumber();
|
||
|
||
// Equality
|
||
expect('hithere').toEqual('hithere');
|
||
|
||
// Negated assertions
|
||
expect(1).not.toBeTypeofString();
|
||
|
||
// Regular expressions
|
||
expect('hithere').toMatch(/hi/);
|
||
|
||
// Numeric comparisons
|
||
expect(5).toBeGreaterThan(3);
|
||
expect(0.1 + 0.2).toBeCloseTo(0.3, 10);
|
||
|
||
// Arrays
|
||
expect([1, 2, 3]).toContain(2);
|
||
expect([1, 2, 3]).toHaveLength(3);
|
||
|
||
// Async assertions
|
||
await expect(asyncFunction()).resolves.toEqual('expected');
|
||
await expect(asyncFunction()).resolves.withTimeout(5000).toBeTypeofString();
|
||
|
||
// Complex object navigation
|
||
expect(complexObject)
|
||
.property('users')
|
||
.arrayItem(0)
|
||
.property('name')
|
||
.toEqual('Alice');
|
||
```
|
||
|
||
### Test Modifiers
|
||
- `tap.only.test()` - Run only this test
|
||
- `tap.skip.test()` - Skip a test
|
||
- `tap.timeout()` - Set test-specific timeout
|
||
|
||
### Running Tests
|
||
- All tests: `pnpm test`
|
||
- Specific test: `tsx test/test.router.ts`
|
||
- With options: `tstest test/**/*.ts --verbose --timeout 60`
|
||
|
||
### Test File Requirements
|
||
- Must start with `test.` prefix
|
||
- Must use `.ts` extension
|
||
- Must call `tap.start()` at the end
|
||
|
||
## Coding Conventions
|
||
- Import modules via `plugins.ts`:
|
||
```ts
|
||
import * as plugins from './plugins.ts';
|
||
const server = new plugins.http.Server();
|
||
```
|
||
- Reference plugins with full path: `plugins.acme`, `plugins.smartdelay`, `plugins.minimatch`, etc.
|
||
- Path patterns support globs (`*`) and parameters (`:param`) in `ProxyRouter`.
|
||
- Wildcard hostname matching leverages `minimatch` patterns.
|
||
|
||
## Key Components
|
||
- **ProxyRouter**
|
||
- Methods: `routeReq`, `routeReqWithDetails`.
|
||
- Hostname matching: case-insensitive, strips port, supports exact, wildcard, TLD, complex patterns.
|
||
- Path routing: exact, wildcard, parameter extraction (`pathParams`), returns `pathMatch` and `pathRemainder`.
|
||
- Config API: `setNewProxyConfigs`, `addProxyConfig`, `removeProxyConfig`, `getHostnames`, `getProxyConfigs`.
|
||
- **SmartProxy**
|
||
- Manages one or more `net.Server` instances to forward TCP streams.
|
||
- Options: `preserveSourceIP`, `defaultAllowedIPs`, `globalPortRanges`, `sniEnabled`.
|
||
- DomainConfigManager: round-robin selection for multiple target IPs.
|
||
- Graceful shutdown in `stop()`, ensures no lingering servers or sockets.
|
||
|
||
## Notable Points
|
||
- **TSConfig**: `module: NodeNext`, `verbatimModuleSyntax`, allows `.js` extension imports in TS.
|
||
- Mermaid diagrams and architecture flows in `readme.md` illustrate component interactions and protocol flows.
|
||
- CLI entrypoint (`cli.js`) supports command-line usage (ACME, proxy controls).
|
||
- ACME and certificate handling via `Port80Handler` and `helpers.certificates.ts`.
|
||
|
||
## ACME/Certificate Configuration Example (v19.0.0)
|
||
```typescript
|
||
const proxy = new SmartProxy({
|
||
routes: [{
|
||
name: 'example.com',
|
||
match: { domains: 'example.com', ports: 443 },
|
||
action: {
|
||
type: 'forward',
|
||
target: { host: 'localhost', port: 8080 },
|
||
tls: {
|
||
mode: 'terminate',
|
||
certificate: 'auto',
|
||
acme: { // ACME config MUST be here, not at top level
|
||
email: 'ssl@example.com',
|
||
useProduction: false,
|
||
challengePort: 80
|
||
}
|
||
}
|
||
}
|
||
}]
|
||
});
|
||
```
|
||
|
||
## TODOs / Considerations
|
||
- Ensure import extensions in source match build outputs (`.ts` vs `.js`).
|
||
- Update `plugins.ts` when adding new dependencies.
|
||
- Maintain test coverage for new routing or proxy features.
|
||
- Keep `ts/` and `dist_ts/` in sync after refactors.
|
||
- Consider implementing top-level ACME config support for backward compatibility
|
||
|
||
## HTTP-01 ACME Challenge Fix (v19.3.8)
|
||
|
||
### Issue
|
||
Non-TLS connections on ports configured in `useHttpProxy` were not being forwarded to HttpProxy. This caused ACME HTTP-01 challenges to fail when the ACME port (usually 80) was included in `useHttpProxy`.
|
||
|
||
### Root Cause
|
||
In the `RouteConnectionHandler.handleForwardAction` method, only connections with TLS settings (mode: 'terminate' or 'terminate-and-reencrypt') were being forwarded to HttpProxy. Non-TLS connections were always handled as direct connections, even when the port was configured for HttpProxy.
|
||
|
||
### Solution
|
||
Added a check for non-TLS connections on ports listed in `useHttpProxy`:
|
||
```typescript
|
||
// No TLS settings - check if this port should use HttpProxy
|
||
const isHttpProxyPort = this.settings.useHttpProxy?.includes(record.localPort);
|
||
|
||
if (isHttpProxyPort && this.httpProxyBridge.getHttpProxy()) {
|
||
// Forward non-TLS connections to HttpProxy if configured
|
||
this.httpProxyBridge.forwardToHttpProxy(/*...*/);
|
||
return;
|
||
}
|
||
```
|
||
|
||
### Test Coverage
|
||
- `test/test.http-fix-unit.ts` - Unit tests verifying the fix
|
||
- Tests confirm that non-TLS connections on HttpProxy ports are properly forwarded
|
||
- Tests verify that non-HttpProxy ports still use direct connections
|
||
|
||
### Configuration Example
|
||
```typescript
|
||
const proxy = new SmartProxy({
|
||
useHttpProxy: [80], // Enable HttpProxy for port 80
|
||
httpProxyPort: 8443,
|
||
acme: {
|
||
email: 'ssl@example.com',
|
||
port: 80
|
||
},
|
||
routes: [
|
||
// Your routes here
|
||
]
|
||
});
|
||
```
|
||
|
||
## ACME Certificate Provisioning Timing Fix (v19.3.9)
|
||
|
||
### Issue
|
||
Certificate provisioning would start before ports were listening, causing ACME HTTP-01 challenges to fail with connection refused errors.
|
||
|
||
### Root Cause
|
||
SmartProxy initialization sequence:
|
||
1. Certificate manager initialized → immediately starts provisioning
|
||
2. Ports start listening (too late for ACME challenges)
|
||
|
||
### Solution
|
||
Deferred certificate provisioning until after ports are ready:
|
||
```typescript
|
||
// SmartCertManager.initialize() now skips automatic provisioning
|
||
// SmartProxy.start() calls provisionAllCertificates() directly after ports are listening
|
||
```
|
||
|
||
### Test Coverage
|
||
- `test/test.acme-timing-simple.ts` - Verifies proper timing sequence
|
||
|
||
### Migration
|
||
Update to v19.3.9+, no configuration changes needed.
|
||
|
||
## Socket Handler Race Condition Fix (v19.5.0)
|
||
|
||
### Issue
|
||
Initial data chunks were being emitted before async socket handlers had completed setup, causing data loss when handlers performed async operations before setting up data listeners.
|
||
|
||
### Root Cause
|
||
The `handleSocketHandlerAction` method was using `process.nextTick` to emit initial chunks regardless of whether the handler was sync or async. This created a race condition where async handlers might not have their listeners ready when the initial data was emitted.
|
||
|
||
### Solution
|
||
Differentiated between sync and async handlers:
|
||
```typescript
|
||
const result = route.action.socketHandler(socket);
|
||
|
||
if (result instanceof Promise) {
|
||
// Async handler - wait for completion before emitting initial data
|
||
result.then(() => {
|
||
if (initialChunk && initialChunk.length > 0) {
|
||
socket.emit('data', initialChunk);
|
||
}
|
||
}).catch(/*...*/);
|
||
} else {
|
||
// Sync handler - use process.nextTick as before
|
||
if (initialChunk && initialChunk.length > 0) {
|
||
process.nextTick(() => {
|
||
socket.emit('data', initialChunk);
|
||
});
|
||
}
|
||
}
|
||
```
|
||
|
||
### Test Coverage
|
||
- `test/test.socket-handler-race.ts` - Specifically tests async handlers with delayed listener setup
|
||
- Verifies that initial data is received even when handler sets up listeners after async work
|
||
|
||
### Usage Note
|
||
Socket handlers require initial data from the client to trigger routing (not just a TLS handshake). Clients must send at least one byte of data for the handler to be invoked.
|
||
|
||
## Route-Specific Security Implementation (v19.5.3)
|
||
|
||
### Issue
|
||
Route-specific security configurations (ipAllowList, ipBlockList, authentication) were defined in the route types but not enforced at runtime.
|
||
|
||
### Root Cause
|
||
The RouteConnectionHandler only checked global IP validation but didn't enforce route-specific security rules after matching a route.
|
||
|
||
### Solution
|
||
Added security checks after route matching:
|
||
```typescript
|
||
// Apply route-specific security checks
|
||
const routeSecurity = route.action.security || route.security;
|
||
if (routeSecurity) {
|
||
// Check IP allow/block lists
|
||
if (routeSecurity.ipAllowList || routeSecurity.ipBlockList) {
|
||
const isIPAllowed = this.securityManager.isIPAuthorized(
|
||
remoteIP,
|
||
routeSecurity.ipAllowList || [],
|
||
routeSecurity.ipBlockList || []
|
||
);
|
||
|
||
if (!isIPAllowed) {
|
||
socket.end();
|
||
this.connectionManager.cleanupConnection(record, 'route_ip_blocked');
|
||
return;
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
### Test Coverage
|
||
- `test/test.route-security-unit.ts` - Unit tests verifying SecurityManager.isIPAuthorized logic
|
||
- Tests confirm IP allow/block lists work correctly with glob patterns
|
||
|
||
### Configuration Example
|
||
```typescript
|
||
const routes: IRouteConfig[] = [{
|
||
name: 'secure-api',
|
||
match: { ports: 8443, domains: 'api.example.com' },
|
||
action: {
|
||
type: 'forward',
|
||
target: { host: 'localhost', port: 3000 },
|
||
security: {
|
||
ipAllowList: ['192.168.1.*', '10.0.0.0/8'], // Allow internal IPs
|
||
ipBlockList: ['192.168.1.100'], // But block specific IP
|
||
maxConnections: 100, // Per-route limit (TODO)
|
||
authentication: { // HTTP-only, requires TLS termination
|
||
type: 'basic',
|
||
credentials: [{ username: 'api', password: 'secret' }]
|
||
}
|
||
}
|
||
}
|
||
}];
|
||
```
|
||
|
||
### Notes
|
||
- IP lists support glob patterns (via minimatch): `192.168.*`, `10.?.?.1`
|
||
- Block lists take precedence over allow lists
|
||
- Authentication requires TLS termination (cannot be enforced on passthrough/direct connections)
|
||
- Per-route connection limits are not yet implemented
|
||
- Security is defined at the route level (route.security), not in the action
|
||
- Route matching is based solely on match criteria; security is enforced after matching
|
||
|
||
## Performance Issues Investigation (v19.5.3+)
|
||
|
||
### Critical Blocking Operations Found
|
||
1. **Busy Wait Loop** in `ts/proxies/nftables-proxy/nftables-proxy.ts:235-238`
|
||
- Blocks entire event loop with `while (Date.now() < waitUntil) {}`
|
||
- Should use `await new Promise(resolve => setTimeout(resolve, delay))`
|
||
|
||
2. **Synchronous Filesystem Operations**
|
||
- Certificate management uses `fs.existsSync()`, `fs.mkdirSync()`, `fs.readFileSync()`
|
||
- NFTables proxy uses `execSync()` for system commands
|
||
- Certificate store uses `ensureDirSync()`, `fileExistsSync()`, `removeManySync()`
|
||
|
||
3. **Memory Leak Risks**
|
||
- Several `setInterval()` calls without storing references for cleanup
|
||
- Event listeners added without proper cleanup in error paths
|
||
- Missing `removeAllListeners()` calls in some connection cleanup scenarios
|
||
|
||
### Performance Recommendations
|
||
- Replace all sync filesystem operations with async alternatives
|
||
- Fix the busy wait loop immediately (critical event loop blocker)
|
||
- Add proper cleanup for all timers and event listeners
|
||
- Consider worker threads for CPU-intensive operations
|
||
- See `readme.problems.md` for detailed analysis and recommendations
|
||
|
||
## Performance Optimizations Implemented (Phase 1 - v19.6.0)
|
||
|
||
### 1. Async Utilities Created (`ts/core/utils/async-utils.ts`)
|
||
- **delay()**: Non-blocking alternative to busy wait loops
|
||
- **retryWithBackoff()**: Retry operations with exponential backoff
|
||
- **withTimeout()**: Execute operations with timeout protection
|
||
- **parallelLimit()**: Run async operations with concurrency control
|
||
- **debounceAsync()**: Debounce async functions
|
||
- **AsyncMutex**: Ensure exclusive access to resources
|
||
- **CircuitBreaker**: Protect against cascading failures
|
||
|
||
### 2. Filesystem Utilities Created (`ts/core/utils/fs-utils.ts`)
|
||
- **AsyncFileSystem**: Complete async filesystem operations
|
||
- exists(), ensureDir(), readFile(), writeFile()
|
||
- readJSON(), writeJSON() with proper error handling
|
||
- copyFile(), moveFile(), removeDir()
|
||
- Stream creation and file listing utilities
|
||
|
||
### 3. Critical Fixes Applied
|
||
|
||
#### Busy Wait Loop Fixed
|
||
- **Location**: `ts/proxies/nftables-proxy/nftables-proxy.ts:235-238`
|
||
- **Fix**: Replaced `while (Date.now() < waitUntil) {}` with `await delay(ms)`
|
||
- **Impact**: Unblocks event loop, massive performance improvement
|
||
|
||
#### Certificate Manager Migration
|
||
- **File**: `ts/proxies/http-proxy/certificate-manager.ts`
|
||
- Added async initialization method
|
||
- Kept sync methods for backward compatibility with deprecation warnings
|
||
- Added `loadDefaultCertificatesAsync()` method
|
||
|
||
#### Certificate Store Migration
|
||
- **File**: `ts/proxies/smart-proxy/cert-store.ts`
|
||
- Replaced all `fileExistsSync`, `ensureDirSync`, `removeManySync`
|
||
- Used parallel operations with `Promise.all()` for better performance
|
||
- Improved error handling and async JSON operations
|
||
|
||
#### NFTables Proxy Improvements
|
||
- Added deprecation warnings to sync methods
|
||
- Created `executeWithTempFile()` helper for common pattern
|
||
- Started migration of sync filesystem operations to async
|
||
- Added import for delay and AsyncFileSystem utilities
|
||
|
||
### 4. Backward Compatibility Maintained
|
||
- All sync methods retained with deprecation warnings
|
||
- Existing APIs unchanged, new async methods added alongside
|
||
- Feature flags prepared for gradual rollout
|
||
|
||
### 5. Phase 1 Completion Status
|
||
✅ **Phase 1 COMPLETE** - All critical performance fixes have been implemented:
|
||
- ✅ Fixed busy wait loop in nftables-proxy.ts
|
||
- ✅ Created async utilities (delay, retry, timeout, parallelLimit, mutex, circuit breaker)
|
||
- ✅ Created filesystem utilities (AsyncFileSystem with full async operations)
|
||
- ✅ Migrated all certificate management to async operations
|
||
- ✅ Migrated nftables-proxy filesystem operations to async (except stopSync for exit handlers)
|
||
- ✅ All tests passing for new utilities
|
||
|
||
### 6. Phase 2 Progress Status
|
||
🔨 **Phase 2 IN PROGRESS** - Resource Lifecycle Management:
|
||
- ✅ Created LifecycleComponent base class for automatic resource cleanup
|
||
- ✅ Created BinaryHeap data structure for priority queue operations
|
||
- ✅ Created EnhancedConnectionPool with backpressure and health checks
|
||
- ✅ Cleaned up legacy code (removed ts/common/, event-utils.ts, event-system.ts)
|
||
- 📋 TODO: Migrate existing components to extend LifecycleComponent
|
||
- 📋 TODO: Add integration tests for resource management
|
||
|
||
### 7. Next Steps (Remaining Work)
|
||
- **Phase 2 (cont)**: Migrate components to use LifecycleComponent
|
||
- **Phase 3**: Add worker threads for CPU-intensive operations
|
||
- **Phase 4**: Performance monitoring dashboard
|
||
|
||
## Socket Error Handling Fix (v19.5.11+)
|
||
|
||
### Issue
|
||
Server crashed with unhandled 'error' event when backend connections failed (ECONNREFUSED). Also caused memory leak with rising active connection count as failed connections weren't cleaned up properly.
|
||
|
||
### Root Cause
|
||
1. **Race Condition**: In forwarding handlers, sockets were created with `net.connect()` but error handlers were attached later, creating a window where errors could crash the server
|
||
2. **Incomplete Cleanup**: When server connections failed, client sockets weren't properly cleaned up, leaving connection records in memory
|
||
|
||
### Solution
|
||
Created `createSocketWithErrorHandler()` utility that attaches error handlers immediately:
|
||
```typescript
|
||
// Before (race condition):
|
||
const socket = net.connect(port, host);
|
||
// ... other code ...
|
||
socket.on('error', handler); // Too late!
|
||
|
||
// After (safe):
|
||
const socket = createSocketWithErrorHandler({
|
||
port, host,
|
||
onError: (error) => {
|
||
// Handle error immediately
|
||
clientSocket.destroy();
|
||
},
|
||
onConnect: () => {
|
||
// Set up forwarding
|
||
}
|
||
});
|
||
```
|
||
|
||
### Changes Made
|
||
1. **New Utility**: `ts/core/utils/socket-utils.ts` - Added `createSocketWithErrorHandler()`
|
||
2. **Updated Handlers**:
|
||
- `https-passthrough-handler.ts` - Uses safe socket creation
|
||
- `https-terminate-to-http-handler.ts` - Uses safe socket creation
|
||
3. **Connection Cleanup**: Client sockets destroyed immediately on server connection failure
|
||
|
||
### Test Coverage
|
||
- `test/test.socket-error-handling.node.ts` - Verifies server doesn't crash on ECONNREFUSED
|
||
- `test/test.forwarding-error-fix.node.ts` - Tests forwarding handlers handle errors gracefully
|
||
|
||
### Configuration
|
||
No configuration changes needed. The fix is transparent to users.
|
||
|
||
### Important Note
|
||
The fix was applied in two places:
|
||
1. **ForwardingHandler classes** (`https-passthrough-handler.ts`, etc.) - These are standalone forwarding utilities
|
||
2. **SmartProxy route-connection-handler** (`route-connection-handler.ts`) - This is where the actual SmartProxy connection handling happens
|
||
|
||
The critical fix for SmartProxy was in `setupDirectConnection()` method in route-connection-handler.ts, which now uses `createSocketWithErrorHandler()` to properly handle connection failures and clean up connection records.
|
||
|
||
## Connection Cleanup Improvements (v19.5.12+)
|
||
|
||
### Issue
|
||
Connections were still counting up during rapid retry scenarios, especially when routing failed or backend connections were refused. This was due to:
|
||
1. **Delayed Cleanup**: Using `initiateCleanupOnce` queued cleanup operations (batch of 100 every 100ms) instead of immediate cleanup
|
||
2. **NFTables Memory Leak**: NFTables connections were never cleaned up, staying in memory forever
|
||
3. **Connection Limit Bypass**: When max connections reached, connection record check happened after creation
|
||
|
||
### Root Cause Analysis
|
||
1. **Queued vs Immediate Cleanup**:
|
||
- `initiateCleanupOnce()`: Adds to cleanup queue, processes up to 100 connections every 100ms
|
||
- `cleanupConnection()`: Immediate synchronous cleanup
|
||
- Under rapid retries, connections were created faster than the queue could process them
|
||
|
||
2. **NFTables Connections**:
|
||
- Marked with `usingNetworkProxy = true` but never cleaned up
|
||
- Connection records stayed in memory indefinitely
|
||
|
||
3. **Error Path Cleanup**:
|
||
- Many error paths used `socket.end()` (async) followed by cleanup
|
||
- Created timing windows where connections weren't fully cleaned
|
||
|
||
### Solution
|
||
1. **Immediate Cleanup**: Changed all error paths from `initiateCleanupOnce()` to `cleanupConnection()` for immediate cleanup
|
||
2. **NFTables Cleanup**: Added socket close listener to clean up connection records when NFTables connections close
|
||
3. **Connection Limit Fix**: Added null check after `createConnection()` to handle rejection properly
|
||
|
||
### Changes Made in route-connection-handler.ts
|
||
```typescript
|
||
// 1. NFTables cleanup (line 551-553)
|
||
socket.once('close', () => {
|
||
this.connectionManager.cleanupConnection(record, 'nftables_closed');
|
||
});
|
||
|
||
// 2. Connection limit check (line 93-96)
|
||
const record = this.connectionManager.createConnection(socket);
|
||
if (!record) {
|
||
// Connection was rejected due to limit - socket already destroyed
|
||
return;
|
||
}
|
||
|
||
// 3. Changed all error paths to use immediate cleanup
|
||
// Before: this.connectionManager.initiateCleanupOnce(record, reason)
|
||
// After: this.connectionManager.cleanupConnection(record, reason)
|
||
```
|
||
|
||
### Test Coverage
|
||
- `test/test.rapid-retry-cleanup.node.ts` - Verifies connection cleanup under rapid retry scenarios
|
||
- Test shows connection count stays at 0 even with 20 rapid retries with 50ms intervals
|
||
- Confirms both ECONNREFUSED and routing failure scenarios are handled correctly
|
||
|
||
### Performance Impact
|
||
- **Positive**: No more connection accumulation under load
|
||
- **Positive**: Immediate cleanup reduces memory usage
|
||
- **Consideration**: More frequent cleanup operations, but prevents queue backlog
|
||
|
||
### Migration Notes
|
||
No configuration changes needed. The improvements are automatic and backward compatible.
|
||
|
||
## Early Client Disconnect Handling (v19.5.13+)
|
||
|
||
### Issue
|
||
Connections were accumulating when clients connected but disconnected before sending data or during routing. This occurred in two scenarios:
|
||
1. **TLS Path**: Clients connecting and disconnecting before sending initial TLS handshake data
|
||
2. **Non-TLS Immediate Routing**: Clients disconnecting while backend connection was being established
|
||
|
||
### Root Cause
|
||
1. **Missing Cleanup Handlers**: During initial data wait and immediate routing, no close/end handlers were attached to catch early disconnections
|
||
2. **Race Condition**: Backend connection attempts continued even after client disconnected, causing unhandled errors
|
||
3. **Timing Window**: Between accepting connection and establishing full bidirectional flow, disconnections weren't properly handled
|
||
|
||
### Solution
|
||
1. **TLS Path Fix**: Added close/end handlers during initial data wait (lines 224-253 in route-connection-handler.ts)
|
||
2. **Immediate Routing Fix**: Used `setupSocketHandlers` for proper handler attachment (lines 180-205)
|
||
3. **Backend Error Handling**: Check if connection already closed before handling backend errors (line 1144)
|
||
|
||
### Changes Made
|
||
```typescript
|
||
// 1. TLS path - handle disconnect before initial data
|
||
socket.once('close', () => {
|
||
if (!initialDataReceived) {
|
||
this.connectionManager.cleanupConnection(record, 'closed_before_data');
|
||
}
|
||
});
|
||
|
||
// 2. Immediate routing path - proper handler setup
|
||
setupSocketHandlers(socket, (reason) => {
|
||
if (!record.outgoing || record.outgoing.readyState !== 'open') {
|
||
if (record.outgoing && !record.outgoing.destroyed) {
|
||
record.outgoing.destroy(); // Abort pending backend connection
|
||
}
|
||
this.connectionManager.cleanupConnection(record, reason);
|
||
}
|
||
}, undefined, 'immediate-route-client');
|
||
|
||
// 3. Backend connection error handling
|
||
onError: (error) => {
|
||
if (record.connectionClosed) {
|
||
logger.log('debug', 'Backend connection failed but client already disconnected');
|
||
return; // Client already gone, nothing to clean up
|
||
}
|
||
// ... normal error handling
|
||
}
|
||
```
|
||
|
||
### Test Coverage
|
||
- `test/test.connect-disconnect-cleanup.node.ts` - Comprehensive test for early disconnect scenarios
|
||
- Tests verify connection count stays at 0 even with rapid connect/disconnect patterns
|
||
- Covers immediate disconnect, delayed disconnect, and mixed patterns
|
||
|
||
### Performance Impact
|
||
- **Positive**: No more connection accumulation from early disconnects
|
||
- **Positive**: Immediate cleanup reduces memory usage
|
||
- **Positive**: Prevents resource exhaustion from rapid reconnection attempts
|
||
|
||
### Migration Notes
|
||
No configuration changes needed. The fix is automatic and backward compatible. |