# SmartProxy Performance Issues Report
## Executive Summary
This report identifies performance issues and blocking operations in the SmartProxy codebase that could impact scalability and responsiveness under high load.
## Critical Issues
### 1. **Synchronous Filesystem Operations**
These operations block the event loop and should be replaced with async alternatives:
#### Certificate Management
- `ts/proxies/http-proxy/certificate-manager.ts:29`: `fs.existsSync()`
- `ts/proxies/http-proxy/certificate-manager.ts:30`: `fs.mkdirSync()`
- `ts/proxies/http-proxy/certificate-manager.ts:49-50`: `fs.readFileSync()` for loading certificates
#### NFTables Proxy
- `ts/proxies/nftables-proxy/nftables-proxy.ts`: Multiple uses of `execSync()` for system commands
- `ts/proxies/nftables-proxy/nftables-proxy.ts`: Multiple `fs.writeFileSync()` and `fs.unlinkSync()` operations
#### Certificate Store
- `ts/proxies/smart-proxy/cert-store.ts:8`: `ensureDirSync()`
- `ts/proxies/smart-proxy/cert-store.ts:15,31,76`: `fileExistsSync()`
- `ts/proxies/smart-proxy/cert-store.ts:77`: `removeManySync()`
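The synchronous directory and certificate reads listed above could be replaced along the following lines. This is a minimal sketch, not the project's actual code: the `LoadedCertificate` shape, the helper names, and the `key.pem`/`cert.pem` file names are assumptions made for illustration.

```typescript
import { promises as fsp } from 'node:fs';
import * as path from 'node:path';

// Hypothetical shape; the real certificate-manager types may differ.
interface LoadedCertificate {
  key: string;
  cert: string;
}

// mkdir with recursive: true resolves without error when the directory
// already exists, so no separate existence check is needed.
async function ensureCertDir(dir: string): Promise<void> {
  await fsp.mkdir(dir, { recursive: true });
}

// Read key and certificate concurrently without blocking the event loop.
// The file names 'key.pem' and 'cert.pem' are assumed for illustration.
async function loadCertificate(dir: string): Promise<LoadedCertificate | null> {
  try {
    const [key, cert] = await Promise.all([
      fsp.readFile(path.join(dir, 'key.pem'), 'utf8'),
      fsp.readFile(path.join(dir, 'cert.pem'), 'utf8'),
    ]);
    return { key, cert };
  } catch (err) {
    // A missing file means "no certificate yet"; anything else is a real error.
    if ((err as NodeJS.ErrnoException).code === 'ENOENT') return null;
    throw err;
  }
}
```

Reading the files directly and handling `ENOENT` avoids the check-then-read race that an existence test followed by a read would have.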
### 2. **Event Loop Blocking Operations**
#### Busy Wait Loop
- `ts/proxies/nftables-proxy/nftables-proxy.ts:235-238`:
```typescript
const waitUntil = Date.now() + retryDelayMs;
while (Date.now() < waitUntil) {
  // busy wait - blocks event loop completely
}
```
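While this loop spins, the Node.js event loop cannot run: no I/O callbacks, timers, or incoming connections are processed for the full `retryDelayMs`, and one CPU core is kept fully busy doing nothing.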
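A non-blocking retry can be sketched as follows. The helper names `sleep` and `retryWithDelay` and the `maxRetries` parameter are hypothetical; only `retryDelayMs` comes from the code quoted above.

```typescript
// Resolve after `ms` milliseconds without blocking the event loop.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: run an async operation, waiting between failed attempts.
// Makes at most maxRetries + 1 attempts and rethrows the last error.
async function retryWithDelay<T>(
  operation: () => Promise<T>,
  maxRetries: number,
  retryDelayMs: number,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        // Other connections and timers keep being served during this wait.
        await sleep(retryDelayMs);
      }
    }
  }
  throw lastError;
}
```

Because the wait is a timer rather than a loop, the caller must be `async`; any synchronous call chain leading to the retry would need to be converted as well.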
### 3. **Potential Memory Leaks**
#### Timer Management Issues
Several timers are created without proper cleanup:
- `ts/proxies/http-proxy/function-cache.ts`: `setInterval()` called without storing the handle, so it can never be cleared
- `ts/proxies/http-proxy/request-handler.ts`: `setInterval()` for rate-limit cleanup that is never cleared
- `ts/core/utils/shared-security-manager.ts`: `cleanupInterval` is stored, but no method clears it
#### Event Listener Accumulation
- Multiple instances of event listeners being added without corresponding cleanup
- Connection handlers add listeners without always removing them on connection close
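One way to keep listener registration and removal paired is to return a detach function at the point where the listeners are added. The sketch below assumes a hypothetical `attachHandlers` helper; it works with any `EventEmitter`, including `net.Socket`.

```typescript
import { EventEmitter } from 'node:events';

// Attach handlers to a connection-like emitter and return a function that
// removes exactly those handlers. Calling the returned function twice is safe.
function attachHandlers(
  conn: EventEmitter,
  handlers: Record<string, (...args: any[]) => void>,
): () => void {
  for (const [event, handler] of Object.entries(handlers)) {
    conn.on(event, handler);
  }
  let detached = false;
  const detach = (): void => {
    if (detached) return;
    detached = true;
    for (const [event, handler] of Object.entries(handlers)) {
      conn.off(event, handler);
    }
    // Also drop the auto-detach hook in case detach was called manually.
    conn.off('close', detach);
  };
  // Detach automatically when the connection closes.
  conn.once('close', detach);
  return detach;
}
```

Removing only the handlers that were added here, rather than calling `removeAllListeners()`, leaves listeners registered by other components untouched.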
### 4. **Connection Pool Management**
#### ConnectionPool (ts/proxies/http-proxy/connection-pool.ts)
**Good practices observed:**
- Proper connection lifecycle management
- Periodic cleanup of idle connections
- Connection limits enforcement
**Potential issues:**
- No backpressure mechanism when pool is full
- Synchronous sorting operation in `cleanupConnectionPool()` could be slow with many connections
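A simple form of backpressure is a limiter with a bounded wait queue: callers wait for a free slot, and once the queue itself is full, new callers are rejected immediately. The `BoundedLimiter` class below is a hypothetical sketch and is not part of the existing `ConnectionPool`.

```typescript
// Hypothetical limiter: at most `maxActive` holders and `maxQueued` waiters.
class BoundedLimiter {
  private active = 0;
  private readonly waiters: Array<() => void> = [];
  private readonly maxActive: number;
  private readonly maxQueued: number;

  constructor(maxActive: number, maxQueued: number) {
    this.maxActive = maxActive;
    this.maxQueued = maxQueued;
  }

  // Resolves when a slot is free; rejects at once if the wait queue is full.
  async acquire(): Promise<void> {
    if (this.active < this.maxActive) {
      this.active++;
      return;
    }
    if (this.waiters.length >= this.maxQueued) {
      throw new Error('pool saturated: rejecting instead of queueing unboundedly');
    }
    // release() hands its slot to this waiter, so `active` stays unchanged.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next(); // pass the slot directly to the oldest waiter
    } else {
      this.active--;
    }
  }
}
```

A caller would typically wrap its work in `try { await limiter.acquire(); ... } finally { limiter.release(); }`, taking care to release only after a successful acquire.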
### 5. **Resource Management Issues**
#### Socket Cleanup
- Some error paths don't properly clean up sockets
- Missing `removeAllListeners()` in some error scenarios could lead to memory leaks
#### Timeout Management
- Inconsistent timeout handling across different components
- Some sockets created without timeout settings
### 6. **JSON Operations on Large Objects**
- `ts/proxies/smart-proxy/cert-store.ts:21`: `JSON.parse()` on certificate metadata
- `ts/proxies/smart-proxy/cert-store.ts:71`: `JSON.stringify()` with pretty printing
- `ts/proxies/http-proxy/function-cache.ts:76`: `JSON.stringify()` for cache keys (called frequently)
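For the frequently called cache-key path, one option is to memoize the serialized key per object so each distinct object is serialized only once. The sketch below uses a hypothetical `cacheKeyFor` helper and assumes that context objects are not mutated after their first use. Note that hashing does not remove the cost of `JSON.stringify()`; it only yields fixed-size keys, and the memoization is what avoids repeated serialization.

```typescript
import { createHash } from 'node:crypto';

// Keys are held weakly, so entries disappear when the object is collected.
const keyCache = new WeakMap<object, string>();

// Compute a cache key for a context object, serializing each distinct
// object at most once. Assumes the object is not mutated afterwards.
function cacheKeyFor(context: object): string {
  let key = keyCache.get(context);
  if (key === undefined) {
    // SHA-256 gives a fixed-length 64-character hex key regardless of input size.
    key = createHash('sha256').update(JSON.stringify(context)).digest('hex');
    keyCache.set(context, key);
  }
  return key;
}
```

Two objects with the same properties in a different insertion order serialize differently and therefore get different keys, which matches the behavior of plain `JSON.stringify()` keys.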
## Recommendations
### Immediate Actions (High Priority)
1. **Replace Synchronous Operations**
```typescript
// Instead of:
if (fs.existsSync(path)) { ... }
// Use:
try {
  await fs.promises.access(path);
  // file exists
} catch {
  // file doesn't exist
}
```
2. **Fix Busy Wait Loop**
```typescript
// Instead of:
while (Date.now() < waitUntil) { }
// Use:
await new Promise(resolve => setTimeout(resolve, retryDelayMs));
```
3. **Add Timer Cleanup**
```typescript
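class Component {
  private cleanupTimer?: NodeJS.Timeout;

  start() {
    // Keep the handle so stop() can cancel the interval later
    this.cleanupTimer = setInterval(() => this.cleanup(), 60000);
  }

  stop() {
    if (this.cleanupTimer) {
      clearInterval(this.cleanupTimer);
      this.cleanupTimer = undefined;
    }
  }

  private cleanup() {
    // component-specific cleanup, e.g. evicting expired entries
  }
}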
```
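The `execSync()` calls in the NFTables proxy listed under Critical Issues can be handled in the same spirit. The sketch below uses Node's `execFile` via `util.promisify`; the `runCommand` helper name and the default timeout are assumptions for illustration.

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Run a command without blocking the event loop and return trimmed stdout.
// Rejects if the process exits with a non-zero code or exceeds the timeout.
async function runCommand(
  file: string,
  args: string[],
  timeoutMs = 5000, // assumed default, tune for the actual commands
): Promise<string> {
  const { stdout } = await execFileAsync(file, args, {
    encoding: 'utf8',
    timeout: timeoutMs,
  });
  return stdout.trim();
}
```

`execFile` passes arguments as an array and does not spawn a shell by default, which also avoids quoting problems when rule parameters are interpolated into commands.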
### Medium Priority
1. **Optimize JSON Operations**
   - Cache `JSON.stringify` results for frequently used objects
   - Consider using faster hashing for cache keys (e.g., `crypto.createHash`)
   - Use streaming JSON parsers for large objects
2. **Improve Connection Pool**
   - Implement backpressure/queueing when pool is full
   - Use a heap or priority queue for connection management instead of sorting
3. **Standardize Resource Cleanup**
   - Create a base class for components with lifecycle management
   - Ensure all event listeners are removed on cleanup
   - Add abort controllers for better cancellation support
### Long-term Improvements
1. **Worker Threads**
   - Move CPU-intensive operations to worker threads
   - Consider using worker pools for NFTables operations
2. **Monitoring and Metrics**
   - Add performance monitoring for event loop lag
   - Track connection pool utilization
   - Monitor memory usage patterns
3. **Graceful Degradation**
   - Implement circuit breakers for backend connections
   - Add request queuing with overflow protection
   - Implement adaptive timeout strategies
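Event loop lag monitoring, as suggested under Monitoring and Metrics, can be built on Node's `perf_hooks.monitorEventLoopDelay()`. The sketch below is one possible shape; the `startLagMonitor` name, the reporting callback, and the 10 ms sampling resolution are assumptions.

```typescript
import { monitorEventLoopDelay } from 'node:perf_hooks';

interface LagStats {
  meanMs: number;
  p99Ms: number;
}

// Sample event loop delay and periodically report mean and p99 in milliseconds.
// Returns a function that stops the monitor.
function startLagMonitor(
  reportIntervalMs: number,
  report: (stats: LagStats) => void,
): () => void {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  const timer = setInterval(() => {
    // The histogram records nanoseconds; convert to milliseconds.
    report({
      meanMs: histogram.mean / 1e6,
      p99Ms: histogram.percentile(99) / 1e6,
    });
    histogram.reset(); // start a fresh window for the next report
  }, reportIntervalMs);
  timer.unref(); // do not keep the process alive just for monitoring
  return () => {
    clearInterval(timer);
    histogram.disable();
  };
}
```

A sustained rise in the p99 value under load is a direct signal that something, such as the synchronous operations listed earlier, is holding the event loop.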
## Impact Assessment
These issues primarily affect:
- **Scalability**: Blocking operations limit concurrent connection handling
- **Responsiveness**: Event loop blocking causes latency spikes
- **Stability**: Memory leaks could cause crashes under sustained load
- **Resource Usage**: Inefficient resource management increases memory/CPU usage
## Testing Recommendations
1. Load test with high connection counts (10k+ concurrent)
2. Monitor event loop lag under stress
3. Test long-running scenarios to detect memory leaks
4. Benchmark the async replacements against the current sync operations to measure the improvement
## Conclusion
While SmartProxy has good architectural design and many best practices, the identified blocking operations and resource management issues could significantly impact performance under high load. The most critical issues (busy wait loop and synchronous filesystem operations) should be addressed immediately.