better logging

This commit is contained in:
Juergen Kunz
2025-07-03 02:32:17 +00:00
parent 67aff4bb30
commit 5d011ba84c
18 changed files with 1604 additions and 410 deletions

View File

@ -183,4 +183,83 @@ The spikes occur because:
1. Use longer window for "instant" measurements (e.g., 5 seconds instead of 1)
2. Track socket write backpressure to estimate actual network flow
3. Implement bandwidth estimation based on connection duration
4. Accept that application-layer != network-layer throughput
4. Accept that application-layer != network-layer throughput
## Connection Limiting
### Per-IP Connection Limits
- SmartProxy tracks connections per IP address in the SecurityManager
- Default limit is 100 connections per IP (configurable via `maxConnectionsPerIP`)
- Connection rate limiting is also enforced (default 300 connections/minute per IP)
- HttpProxy has been enhanced to also enforce per-IP limits when forwarding from SmartProxy
### Route-Level Connection Limits
- Routes can define `security.maxConnections` to limit connections per route
- ConnectionManager tracks connections by route ID using a separate Map
- Limits are enforced in RouteConnectionHandler before forwarding
- Connection is tracked when route is matched: `trackConnectionByRoute(routeId, connectionId)`
### HttpProxy Integration
- When SmartProxy forwards to HttpProxy for TLS termination, it sends a `CLIENT_IP:<ip>\r\n` header
- HttpProxy parses this header to track the real client IP, not the localhost IP
- This ensures per-IP limits are enforced even for forwarded connections
- The header is parsed in the connection handler before any data processing
### Memory Optimization
- Periodic cleanup runs every 60 seconds to remove:
- IPs with no active connections
- Expired rate limit timestamps (older than 1 minute)
- Prevents memory accumulation from many unique IPs over time
- Cleanup is automatic and runs in background with `unref()` to not keep process alive
### Connection Cleanup Queue
- Cleanup queue processes connections in batches to prevent overwhelming the system
- Race condition prevention using `isProcessingCleanup` flag
- Try-finally block ensures flag is always reset even if errors occur
- New connections added during processing are queued for next batch
### Important Implementation Notes
- Always use `NodeJS.Timeout` type instead of `NodeJS.Timer` for interval/timeout references
- IPv4/IPv6 normalization is handled (e.g., `::ffff:127.0.0.1` and `127.0.0.1` are treated as the same IP)
- Connection limits are checked before route matching to prevent DoS attacks
- SharedSecurityManager supports checking route-level limits via optional parameter
## Log Deduplication
To reduce log spam during high-traffic scenarios or attacks, SmartProxy implements log deduplication for repetitive events:
### How It Works
- Similar log events are batched and aggregated over a 5-second window
- Instead of logging each event individually, a summary is emitted
- Events are grouped by type and deduplicated by key (e.g., IP address, reason)
### Deduplicated Event Types
1. **Connection Rejections** (`connection-rejected`):
- Groups by rejection reason (global-limit, route-limit, etc.)
- Example: "Rejected 150 connections (reasons: global-limit: 100, route-limit: 50)"
2. **IP Rejections** (`ip-rejected`):
- Groups by IP address
- Shows top offenders with rejection counts and reasons
- Example: "Rejected 500 connections from 10 IPs (top offenders: 192.168.1.100 (200x, rate-limit), ...)"
3. **Connection Cleanups** (`connection-cleanup`):
- Groups by cleanup reason (normal, timeout, error, zombie, etc.)
- Example: "Cleaned up 250 connections (reasons: normal: 200, timeout: 30, error: 20)"
4. **IP Tracking Cleanup** (`ip-cleanup`):
- Summarizes periodic IP cleanup operations
- Example: "IP tracking cleanup: removed 50 entries across 5 cleanup cycles"
### Configuration
- Default flush interval: 5 seconds
- Maximum batch size: 100 events (triggers immediate flush)
- Global periodic flush: Every 10 seconds (ensures logs are emitted regularly)
- Process exit handling: Logs are flushed on SIGINT/SIGTERM
### Benefits
- Reduces log volume during attacks or high traffic
- Provides better overview of patterns (e.g., which IPs are attacking)
- Improves log readability and analysis
- Prevents log storage overflow
- Maintains detailed information in aggregated form