
SmartProxy Performance Issues Report

Executive Summary

This report identifies performance issues and blocking operations in the SmartProxy codebase that could impact scalability and responsiveness under high load.

Critical Issues

1. Synchronous Filesystem Operations

These operations block the event loop and should be replaced with async alternatives:

Certificate Management

  • ts/proxies/http-proxy/certificate-manager.ts:29: fs.existsSync()
  • ts/proxies/http-proxy/certificate-manager.ts:30: fs.mkdirSync()
  • ts/proxies/http-proxy/certificate-manager.ts:49-50: fs.readFileSync() for loading certificates

NFTables Proxy

  • ts/proxies/nftables-proxy/nftables-proxy.ts: Multiple uses of execSync() for system commands
  • ts/proxies/nftables-proxy/nftables-proxy.ts: Multiple fs.writeFileSync() and fs.unlinkSync() operations
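
The execSync() calls can be replaced with the promisified execFile() from node:child_process, which keeps the event loop free while nft runs. A minimal sketch (the helper name, command arguments, and temp-file path are illustrative, not taken from the codebase):

  import { execFile } from 'node:child_process';
  import { promisify } from 'node:util';
  import { writeFile, unlink } from 'node:fs/promises';

  const execFileAsync = promisify(execFile);

  // Hypothetical helper: apply an nftables ruleset from a temp file without
  // blocking the event loop (replaces execSync + writeFileSync/unlinkSync).
  async function applyRuleset(rulesetText: string, tmpPath = '/tmp/smartproxy.nft'): Promise<void> {
    await writeFile(tmpPath, rulesetText, 'utf8');
    try {
      await execFileAsync('nft', ['-f', tmpPath]);
    } finally {
      await unlink(tmpPath).catch(() => {}); // best-effort temp-file cleanup
    }
  }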

Certificate Store

  • ts/proxies/smart-proxy/cert-store.ts:8: ensureDirSync()
  • ts/proxies/smart-proxy/cert-store.ts:15,31,76: fileExistsSync()
  • ts/proxies/smart-proxy/cert-store.ts:77: removeManySync()
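
Each of these helpers has a straightforward non-blocking equivalent in node:fs/promises. A sketch under the assumption that the helpers behave as their names suggest (the real utilities used by cert-store.ts may differ):

  import { mkdir, access, rm } from 'node:fs/promises';

  // Async stand-ins (names illustrative) for the synchronous helpers above.
  const ensureDir = (dir: string) => mkdir(dir, { recursive: true });

  const fileExists = async (path: string): Promise<boolean> => {
    try { await access(path); return true; } catch { return false; }
  };

  const removeMany = (paths: string[]) =>
    Promise.all(paths.map((p) => rm(p, { force: true })));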

2. Event Loop Blocking Operations

Busy Wait Loop

  • ts/proxies/nftables-proxy/nftables-proxy.ts:235-238:
    const waitUntil = Date.now() + retryDelayMs;
    while (Date.now() < waitUntil) {
      // busy wait - blocks event loop completely
    }
    
    This is extremely problematic: for the entire retry delay, the Node.js event loop is blocked and no other connections can be serviced.

3. Potential Memory Leaks

Timer Management Issues

Several timers are created without proper cleanup:

  • ts/proxies/http-proxy/function-cache.ts: setInterval() created without storing a reference, so it can never be cleared
  • ts/proxies/http-proxy/request-handler.ts: setInterval() for rate-limit cleanup is itself never cleared
  • ts/core/utils/shared-security-manager.ts: cleanupInterval is stored, but no cleanup method ever clears it

Event Listener Accumulation

  • Event listeners are added in multiple places without corresponding removal
  • Connection handlers attach listeners that are not always removed when the connection closes
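
The usual fix is to route every terminal event through one cleanup function that detaches the handlers. A minimal sketch of the pattern (names are illustrative, not taken from the codebase):

  import type { Socket } from 'node:net';

  // Attach handlers once and guarantee they are removed exactly once,
  // no matter which terminal event fires first.
  function trackConnection(socket: Socket, onData: (chunk: Buffer) => void): void {
    const onError = (err: Error) => {
      // log and fall through; 'close' always follows and runs cleanup
      console.error('connection error:', err.message);
    };
    const cleanup = () => {
      socket.removeListener('data', onData);
      socket.removeListener('error', onError);
    };

    socket.on('data', onData);
    socket.on('error', onError);
    socket.once('close', cleanup);
  }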

4. Connection Pool Management

ConnectionPool (ts/proxies/http-proxy/connection-pool.ts)

Good practices observed:

  • Proper connection lifecycle management
  • Periodic cleanup of idle connections
  • Connection limits enforcement

Potential issues:

  • No backpressure mechanism when pool is full
  • Synchronous sorting operation in cleanupConnectionPool() could be slow with many connections
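
For the missing backpressure, one possible direction (a hypothetical interface, not the existing ConnectionPool API) is to park acquire() callers in a FIFO queue when the pool is exhausted and hand them connections as they are released:

  import type { Socket } from 'node:net';

  // Sketch of backpressure for a full pool: acquire() waits in a queue
  // instead of failing or creating sockets beyond the limit.
  class BoundedPool {
    private idle: Socket[] = [];
    private waiters: Array<(socket: Socket) => void> = [];
    private size = 0;

    constructor(
      private readonly maxSize: number,
      private readonly create: () => Promise<Socket>,
    ) {}

    async acquire(): Promise<Socket> {
      const socket = this.idle.pop();
      if (socket) return socket;
      if (this.size < this.maxSize) {
        this.size++;
        return this.create();
      }
      // Pool exhausted: wait for a release instead of erroring out.
      return new Promise<Socket>((resolve) => this.waiters.push(resolve));
    }

    release(socket: Socket): void {
      const waiter = this.waiters.shift();
      if (waiter) waiter(socket); // hand the connection straight to a waiter
      else this.idle.push(socket);
    }
  }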

5. Resource Management Issues

Socket Cleanup

  • Some error paths don't properly clean up sockets
  • Missing removeAllListeners() in some error scenarios could lead to memory leaks

Timeout Management

  • Inconsistent timeout handling across different components
  • Some sockets created without timeout settings
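
A simple way to make timeouts uniform is to route every new socket through one helper that applies the timeout and the teardown path. A sketch with illustrative values:

  import type { Socket } from 'node:net';

  // Single place where idle timeouts and teardown are configured,
  // so no socket is created without them.
  function configureSocket(socket: Socket, timeoutMs = 30_000): Socket {
    socket.setTimeout(timeoutMs);
    socket.once('timeout', () => {
      // destroy() emits 'close', which should trigger the normal cleanup path
      socket.destroy(new Error(`socket idle for ${timeoutMs}ms`));
    });
    socket.once('error', () => socket.destroy());
    return socket;
  }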

6. JSON Operations on Large Objects

  • ts/proxies/smart-proxy/cert-store.ts:21: JSON.parse() on certificate metadata
  • ts/proxies/smart-proxy/cert-store.ts:71: JSON.stringify() with pretty printing
  • ts/proxies/http-proxy/function-cache.ts:76: JSON.stringify() for cache keys (called frequently)

Recommendations

Immediate Actions (High Priority)

  1. Replace Synchronous Operations

    // Instead of:
    if (fs.existsSync(path)) { ... }
    
    // Use:
    try {
      await fs.promises.access(path);
      // file exists
    } catch {
      // file doesn't exist
    }
    
  2. Fix Busy Wait Loop

    // Instead of:
    while (Date.now() < waitUntil) { }
    
    // Use:
    await new Promise(resolve => setTimeout(resolve, retryDelayMs));
    
  3. Add Timer Cleanup

    class Component {
      private cleanupTimer?: NodeJS.Timeout;
    
      start() {
        this.cleanupTimer = setInterval(() => { ... }, 60000);
      }
    
      stop() {
        if (this.cleanupTimer) {
          clearInterval(this.cleanupTimer);
          this.cleanupTimer = undefined;
        }
      }
    }
    

Medium Priority

  1. Optimize JSON Operations

    • Cache JSON.stringify results for frequently used objects
    • Consider using faster hashing for cache keys (e.g., crypto.createHash); see the sketch after this list
    • Use streaming JSON parsers for large objects
  2. Improve Connection Pool

    • Implement backpressure/queueing when pool is full
    • Use a heap or priority queue for connection management instead of sorting
  3. Standardize Resource Cleanup

    • Create a base class for components with lifecycle management
    • Ensure all event listeners are removed on cleanup
    • Add abort controllers for better cancellation support
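
For the cache-key case, a hedged sketch (the real key shape in function-cache.ts is not shown here): hash the stringified input once and use the short, fixed-length digest as the map key:

  import { createHash } from 'node:crypto';

  // Hypothetical helper: a fixed-length digest instead of the full
  // JSON.stringify output as the cache key. Note that JSON.stringify still
  // runs; memoize it per object if the same inputs recur frequently.
  function cacheKey(input: unknown): string {
    const json = JSON.stringify(input);
    return createHash('sha1').update(json).digest('hex');
  }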

Long-term Improvements

  1. Worker Threads

    • Move CPU-intensive operations to worker threads
    • Consider using worker pools for NFTables operations
  2. Monitoring and Metrics

    • Add performance monitoring for event loop lag (see the sketch after this list)
    • Track connection pool utilization
    • Monitor memory usage patterns
  3. Graceful Degradation

    • Implement circuit breakers for backend connections
    • Add request queuing with overflow protection
    • Implement adaptive timeout strategies
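
Event-loop lag in particular can be measured with Node's built-in perf_hooks histogram. A minimal sketch (thresholds and intervals are illustrative):

  import { monitorEventLoopDelay } from 'node:perf_hooks';

  // Built-in event-loop delay histogram; values are reported in nanoseconds.
  const histogram = monitorEventLoopDelay({ resolution: 20 });
  histogram.enable();

  setInterval(() => {
    const p99Ms = histogram.percentile(99) / 1e6;
    if (p99Ms > 100) {
      console.warn(`event loop p99 lag ${p99Ms.toFixed(1)}ms - possible blocking work`);
    }
    histogram.reset();
  }, 10_000).unref(); // unref() so monitoring never keeps the process alive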

Impact Assessment

These issues primarily affect:

  • Scalability: Blocking operations limit concurrent connection handling
  • Responsiveness: Event loop blocking causes latency spikes
  • Stability: Memory leaks could cause crashes under sustained load
  • Resource Usage: Inefficient resource management increases memory/CPU usage

Testing Recommendations

  1. Load test with high connection counts (10k+ concurrent); see the sketch after this list
  2. Monitor event loop lag under stress
  3. Test long-running scenarios to detect memory leaks
  4. Benchmark async vs. sync implementations to measure the improvement
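
For the load test, one option is a scripted run with the third-party autocannon package (an assumption, not a tool the project currently uses); the URL and numbers below are illustrative:

  import autocannon from 'autocannon';

  // Drive high concurrency against a local SmartProxy instance and report
  // aggregate latency and error counts.
  async function loadTest(): Promise<void> {
    const result = await autocannon({
      url: 'http://localhost:8080', // illustrative target
      connections: 10_000,          // high concurrency to surface blocking behaviour
      duration: 60,                 // seconds
    });
    console.log('avg latency (ms):', result.latency.average);
    console.log('errors:', result.errors, 'timeouts:', result.timeouts);
  }

  loadTest();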

Conclusion

While SmartProxy has a sound architectural design and follows many best practices, the identified blocking operations and resource-management issues could significantly degrade performance under high load. The most critical issues, the busy-wait loop and the synchronous filesystem operations, should be addressed immediately.