fix(metrics): fix metrics
@@ -142,4 +142,45 @@ Keep-alive connections receive special treatment based on `keepAliveTreatment` s
The system supports both receiving and sending PROXY protocol:
- **Receiving**: Automatically detected from trusted proxy IPs (configured in `proxyIPs`)
- **Sending**: Enabled per-route or globally via `sendProxyProtocol` setting
- Real client IP is preserved and used for all connection tracking and security checks

## Metrics and Throughput Calculation

The metrics system tracks throughput using per-second sampling:

1. **Byte Recording**: Bytes are recorded as data flows through connections
2. **Sampling**: Every second, the accumulated bytes are stored as a sample
3. **Rate Calculation**: Throughput is calculated by summing the sampled bytes over a time window
4. **Per-Route/IP Tracking**: A separate `ThroughputTracker` instance is kept for each route and each IP
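
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: only the name `ThroughputTracker` comes from the text, while `recordBytes`, `takeSample`, `bytesPerSecond`, `TrackerRegistry`, and the 60-sample history limit are assumptions made for the example.

```typescript
// Hypothetical sketch: method names and the history limit are assumptions.
class ThroughputTracker {
  private pendingBytes = 0;       // bytes recorded since the last sample
  private samples: number[] = []; // one entry per elapsed second, oldest first

  constructor(private readonly maxSamples: number = 60) {}

  // Step 1: called from the forwarding path whenever data passes through.
  recordBytes(count: number): void {
    this.pendingBytes += count;
  }

  // Step 2: called once per second to close the current sample.
  takeSample(): void {
    this.samples.push(this.pendingBytes);
    this.pendingBytes = 0;
    if (this.samples.length > this.maxSamples) {
      this.samples.shift();
    }
  }

  // Step 3: sum the last `windowSeconds` samples and divide by the window.
  bytesPerSecond(windowSeconds: number): number {
    if (windowSeconds <= 0) return 0;
    const window = this.samples.slice(-windowSeconds);
    const total = window.reduce((sum, bytes) => sum + bytes, 0);
    return total / windowSeconds;
  }
}

// Step 4: one tracker per key, where a key is a route name or a client IP.
class TrackerRegistry {
  private trackers: Record<string, ThroughputTracker> = {};

  forKey(key: string): ThroughputTracker {
    if (!this.trackers[key]) {
      this.trackers[key] = new ThroughputTracker();
    }
    return this.trackers[key];
  }

  // Closes the current one-second sample on every tracker.
  sampleAll(): void {
    Object.keys(this.trackers).forEach((key) => this.trackers[key].takeSample());
  }
}
```

In a running proxy, `sampleAll()` would presumably be driven by a one-second timer, and `recordBytes()` would be called from the data forwarding callbacks.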

Key implementation details:
- Bytes are recorded in the bidirectional forwarding callbacks
- The `instant()` method returns throughput over the last 1 second
- The `recent()` method returns throughput over the last 10 seconds
- Custom windows can be specified for different averaging periods
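
To make the effect of the window length concrete, here is a small sketch that applies a 1-second, 10-second, and custom window to the same samples. The free functions `rateOverWindow`, `instant`, `recent`, and `toMbitPerSecond` are stand-ins written for this example (in the project, `instant()` and `recent()` are methods), and dividing by the requested window length is an assumption about how missing samples are treated.

```typescript
// Hypothetical helper: `samples` holds bytes per second, oldest first.
function rateOverWindow(samples: number[], windowSeconds: number): number {
  if (windowSeconds <= 0) return 0;
  const window = samples.slice(-windowSeconds);
  const total = window.reduce((sum, bytes) => sum + bytes, 0);
  // Dividing by the requested window treats missing samples as zero traffic.
  return total / windowSeconds;
}

// Stand-ins for the two fixed windows described above.
const instant = (samples: number[]): number => rateOverWindow(samples, 1);
const recent = (samples: number[]): number => rateOverWindow(samples, 10);

// Converts bytes per second to megabits per second (1 Mbit = 1,000,000 bits).
const toMbitPerSecond = (bytesPerSecond: number): number =>
  (bytesPerSecond * 8) / 1e6;
```

For ten samples of 1 MB each except a final sample of 11 MB, `instant` gives 88 Mbit/s, `recent` gives 16 Mbit/s, and `rateOverWindow(samples, 5)` gives 24 Mbit/s, so a single burst dominates the short window and is averaged away by the longer ones.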

### Throughput Spikes Issue

There is a fundamental difference between application-layer and network-layer throughput:

**Application Layer (what we measure)**:
- Bytes are recorded when they are delivered to or from the application
- Large chunks can arrive "instantly" due to kernel/Node.js buffering
- Spikes appear when buffers are flushed (e.g., 20 MB in 1 second = 160 Mbit/s)

**Network Layer (what Unifi shows)**:
- Actual packet flow through the network interface
- Limited by the physical network speed (e.g., 20 Mbit/s)
- Data is transferred steadily over time, not in bursts

The spikes occur because:
1. Data flows over the network at 20 Mbit/s (20 MB takes 8 seconds)
2. The kernel and Node.js buffer this incoming data
3. When the buffer is flushed, the application receives one large chunk at once
4. We record the entire chunk in the current second, creating an artificial spike
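
The sequence above can be reproduced with a toy simulation. It is deliberately simplified: real kernel and Node.js buffering does not flush on a fixed schedule, and the `simulate` function and its `flushEverySeconds` parameter are invented for this illustration. The link speed and transfer size are the 20 Mbit/s and 20 MB figures used in the text.

```typescript
const MB = 1e6;
const LINK_BYTES_PER_SECOND = (20 * MB) / 8; // 20 Mbit/s = 2.5 MB/s
const TRANSFER_BYTES = 20 * MB;

// Toy model: the network delivers data steadily, but the application only
// sees it when the buffer is flushed every `flushEverySeconds` seconds.
function simulate(
  flushEverySeconds: number
): { networkSamples: number[]; appSamples: number[] } {
  const networkSamples: number[] = [];
  const appSamples: number[] = [];
  let buffered = 0;
  let remaining = TRANSFER_BYTES;
  for (let second = 1; remaining > 0; second++) {
    const arrived = Math.min(LINK_BYTES_PER_SECOND, remaining);
    remaining -= arrived;
    buffered += arrived;
    networkSamples.push(arrived);
    // Flush on schedule, or when the transfer has finished.
    const flush = second % flushEverySeconds === 0 || remaining === 0;
    appSamples.push(flush ? buffered : 0);
    if (flush) buffered = 0;
  }
  return { networkSamples, appSamples };
}

// Largest one-second sample, converted to Mbit/s.
function peakMbit(samples: number[]): number {
  const peakBytes = samples.reduce((max, bytes) => Math.max(max, bytes), 0);
  return (peakBytes * 8) / MB;
}
```

With `simulate(8)`, the network samples are eight values of 2.5 MB (a peak of 20 Mbit/s), while the application samples are seven zeros followed by one 20 MB sample (a peak of 160 Mbit/s). Both series carry the same total, so only the one-second view differs.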

**Potential Solutions**:
1. Use a longer window for "instant" measurements (e.g., 5 seconds instead of 1)
2. Track socket write backpressure to estimate actual network flow
3. Implement bandwidth estimation based on connection duration
4. Accept that application-layer throughput is not the same as network-layer throughput
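
As one possible reading of option 3, the rate could be estimated from the total bytes of a connection divided by how long it has been open. The sketch below is only an illustration under that assumption: the `ConnectionStats` shape, the `estimateBytesPerSecond` name, and the one-second minimum duration are all hypothetical and not taken from the codebase.

```typescript
// Hypothetical shape; the real connection records may look different.
interface ConnectionStats {
  startedAtMs: number;      // timestamp at which the connection was opened
  bytesTransferred: number; // total bytes recorded so far
}

// Average rate over the lifetime of the connection, in bytes per second.
function estimateBytesPerSecond(stats: ConnectionStats, nowMs: number): number {
  // Clamp to one second so very young connections do not yield huge rates.
  const elapsedSeconds = Math.max((nowMs - stats.startedAtMs) / 1000, 1);
  return stats.bytesTransferred / elapsedSeconds;
}
```

For a connection that has moved 20 MB in 8 seconds, this yields 2.5 MB/s (20 Mbit/s) regardless of when the buffers were flushed. The trade-off is that a lifetime average reacts slowly when the real rate changes, so it may suit long transfers better than short or idle connections.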