smartproxy/readme.websocket-keepalive-fix.md
Juergen Kunz 8347e0fec7
19.6.2
2025-06-09 22:13:56 +00:00


WebSocket Keep-Alive Fix for SNI Passthrough

Problem

WebSocket connections in SNI passthrough mode are being disconnected every 30 seconds due to:

  1. WebSocket Heartbeat: The HTTP proxy's WebSocket handler runs a heartbeat check every 30 seconds using ping/pong frames. In SNI passthrough mode the proxy forwards the TLS stream without decrypting it, so it can neither inject these frames nor observe the replies. The connection is therefore marked as inactive and terminated.

  2. Half-Zombie Detection: The connection manager's aggressive cleanup allows only a 30-second grace period for connections where one of the two sockets has been destroyed.
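
The first failure mode can be reproduced with a small model of a ping/pong heartbeat sweep. This is only a sketch: the `TrackedConnection` type, the `sawPong` flag, and `heartbeatSweep` are hypothetical names chosen for illustration and are not taken from SmartProxy's code.

```typescript
// Hypothetical model of a ping/pong heartbeat sweep; not SmartProxy's actual code.
interface TrackedConnection {
  id: string;
  sawPong: boolean; // true if a pong was observed since the previous sweep
  closed: boolean;  // true once the heartbeat has given up on the connection
}

// One sweep: close connections that never answered the previous ping,
// then clear the flag so the next sweep expects a fresh pong.
function heartbeatSweep(connections: TrackedConnection[]): void {
  for (const conn of connections) {
    if (conn.closed) continue;
    if (!conn.sawPong) {
      conn.closed = true;
      continue;
    }
    conn.sawPong = false; // a real implementation would send a ping here
  }
}

// A connection whose TLS is terminated at the proxy can answer pings.
const decrypted: TrackedConnection = { id: 'tls-terminated', sawPong: true, closed: false };
// A passthrough connection cannot: the proxy only sees opaque TLS records.
const passthrough: TrackedConnection = { id: 'sni-passthrough', sawPong: true, closed: false };

heartbeatSweep([decrypted, passthrough]); // first sweep: both are "pinged"
decrypted.sawPong = true;                 // a pong is only visible on the decrypted connection
heartbeatSweep([decrypted, passthrough]); // second sweep: the passthrough connection is closed

console.log(decrypted.closed, passthrough.closed);
```

In this model the passthrough connection is closed on the second sweep even though the peers may be exchanging data the whole time, which matches the 30-second disconnects described above.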

Solution

For SNI passthrough connections:

  1. Disable WebSocket-specific heartbeat checking (these connections are handled as raw TCP)
  2. Rely on TCP keepalive settings instead
  3. Increase grace period for encrypted connections
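
As a rough illustration of step 2, TCP keepalive can be enabled on both sockets of a proxied connection. The helper name `enableTcpKeepAlive`, the local `KeepAliveSocket` interface, and the 60-second initial delay are assumptions made for this sketch; only the `setKeepAlive(enable, initialDelay)` method itself corresponds to Node's `net.Socket` API.

```typescript
// Minimal shape that Node's net.Socket satisfies; declared locally so the
// sketch needs no imports. The helper and its default delay are hypothetical.
interface KeepAliveSocket {
  setKeepAlive(enable?: boolean, initialDelay?: number): unknown;
}

// Enable TCP keepalive on both legs of a proxied connection so the OS probes
// idle peers, instead of the proxy relying on WebSocket ping/pong frames.
function enableTcpKeepAlive(
  client: KeepAliveSocket,
  backend: KeepAliveSocket,
  initialDelayMs: number = 60_000, // assumed value, tune per deployment
): void {
  for (const socket of [client, backend]) {
    socket.setKeepAlive(true, initialDelayMs);
  }
}
```

Because `net.Socket` has a compatible `setKeepAlive` method, the incoming client socket and the outgoing backend socket could be passed to this helper directly.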

Current Settings

  • Default inactivity timeout: 4 hours (14400000 ms)
  • Keep-alive multiplier for extended mode: 6x (24 hours)
  • WebSocket heartbeat interval: 30 seconds (problem!)
  • Half-zombie grace period: 30 seconds (too aggressive)
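
Example configuration that raises these limits:

import { SmartProxy } from '@push.rocks/smartproxy'; // package name assumed

const proxy = new SmartProxy({
  // Longer inactivity limits for connection cleanup
  inactivityTimeout: 14400000, // 4 hours (the default)
  keepAliveTreatment: 'extended', // or 'immortal' for no timeout
  keepAliveInactivityMultiplier: 10, // 4 hours x 10 = 40 hours for keep-alive connections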
  
  // For routes with WebSocket over SNI passthrough
  routes: [
    {
      name: 'websocket-passthrough',
      match: { ports: 443, domains: 'ws.example.com' },
      action: {
        type: 'forward',
        target: { host: 'backend', port: 443 },
        tls: { mode: 'passthrough' },
        // No WebSocket-specific config needed for passthrough
      }
    }
  ]
});
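
The timeout figures quoted above follow from multiplying the base inactivity timeout by the keep-alive multiplier. The sketch below only reproduces that arithmetic; the function name is hypothetical, the `'standard'` treatment is an assumed third mode that does not appear in this document, and SmartProxy's internal calculation may differ.

```typescript
// 'extended' and 'immortal' appear in the configuration above;
// 'standard' is an assumed name for "no special keep-alive handling".
type KeepAliveTreatment = 'standard' | 'extended' | 'immortal';

const HOUR_MS = 3_600_000;

// Hypothetical helper: effective inactivity timeout for a keep-alive connection.
function effectiveInactivityTimeoutMs(
  inactivityTimeoutMs: number,
  treatment: KeepAliveTreatment,
  multiplier: number,
): number {
  switch (treatment) {
    case 'immortal':
      return Infinity; // never cleaned up on inactivity
    case 'extended':
      return inactivityTimeoutMs * multiplier;
    default:
      return inactivityTimeoutMs;
  }
}

// Default multiplier of 6 with the 4-hour default timeout
console.log(effectiveInactivityTimeoutMs(14_400_000, 'extended', 6) / HOUR_MS); // 24
// Multiplier of 10 as in the configuration above
console.log(effectiveInactivityTimeoutMs(14_400_000, 'extended', 10) / HOUR_MS); // 40
```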

Temporary Workaround

Until a fix is implemented, you can:

  1. Use keepAliveTreatment: 'immortal' to disable timeout-based cleanup
  2. Increase the half-zombie grace period
  3. Use TCP keepalive at the OS level

Proper Fix Implementation

  1. Detect when a connection is SNI passthrough
  2. Skip WebSocket heartbeat for passthrough connections
  3. Increase grace period for encrypted connections
  4. Rely on TCP keepalive instead of application-level ping/pong
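
The steps above could be expressed as a few small decision helpers. Everything here is a sketch under stated assumptions: the `ConnectionRecord` shape, the helper names, and the 5-minute grace period for passthrough connections are placeholders for illustration, not values or APIs taken from SmartProxy.

```typescript
// Hypothetical connection metadata; SmartProxy's real records may look different.
type TlsMode = 'passthrough' | 'terminate';

interface ConnectionRecord {
  tlsMode: TlsMode;
}

const DEFAULT_GRACE_MS = 30_000;         // current half-zombie grace period
const PASSTHROUGH_GRACE_MS = 5 * 60_000; // assumed longer value for encrypted streams

// Step 1: detect SNI passthrough.
function isSniPassthrough(conn: ConnectionRecord): boolean {
  return conn.tlsMode === 'passthrough';
}

// Steps 2 and 4: skip ping/pong for passthrough and leave liveness to TCP keepalive.
function shouldRunWebSocketHeartbeat(conn: ConnectionRecord): boolean {
  return !isSniPassthrough(conn);
}

// Step 3: give encrypted passthrough connections a longer grace period.
function halfZombieGraceMs(conn: ConnectionRecord): number {
  return isSniPassthrough(conn) ? PASSTHROUGH_GRACE_MS : DEFAULT_GRACE_MS;
}

const wsPassthrough: ConnectionRecord = { tlsMode: 'passthrough' };
console.log(shouldRunWebSocketHeartbeat(wsPassthrough)); // false
console.log(halfZombieGraceMs(wsPassthrough));           // 300000
```

Keeping the decision in pure functions like these would make the behavior easy to unit test separately from socket handling.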