fix(metrics): fix metrics
@ -2,6 +2,20 @@
## Byte Tracking and Metrics
### Throughput Drift Issue (Fixed)
**Problem**: Throughput numbers were gradually increasing over time for long-lived connections.
**Root Cause**: The `byRoute()` and `byIP()` methods were dividing cumulative total bytes (since connection start) by the window duration, causing rates to appear higher as connections aged:
- Hour 1: 1GB total / 60s = 17 MB/s ✓
- Hour 2: 2GB total / 60s = 33 MB/s ✗ (appears doubled!)
- Hour 3: 3GB total / 60s = 50 MB/s ✗ (keeps rising!)
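
The drift can be reproduced with a few lines of arithmetic. This is only an illustrative sketch: `cumulativeRateMBps` and the constants are hypothetical names, not the project's code, and decimal units (1 GB = 10^9 bytes) are assumed.

```typescript
// Illustrative only: names and units below are assumptions, not the real implementation.
const WINDOW_SECONDS = 60;
const BYTES_PER_MB = 1_000_000;

// The flawed calculation: cumulative bytes since connection start
// divided by the length of the reporting window.
function cumulativeRateMBps(totalBytesSinceStart: number): number {
  return totalBytesSinceStart / WINDOW_SECONDS / BYTES_PER_MB;
}

// Cumulative totals of one long-lived connection at hours 1, 2 and 3.
const totals: number[] = [1e9, 2e9, 3e9];

// The reported value starts near 17 MB/s, then doubles, then triples,
// even though the connection's real transfer rate never changed.
const reported: number[] = totals.map((t) => Math.round(cumulativeRateMBps(t)));
```

The numerator keeps growing with connection age while the denominator stays fixed at the window length, which is why the reported rate climbs.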
**Solution**: Implemented snapshot-based byte tracking that calculates actual bytes transferred within each time window:
- Store periodic snapshots of byte counts with timestamps
- Calculate delta between window start and end snapshots
- Divide delta by window duration for accurate throughput
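
The three steps above might be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: `SnapshotByteTracker`, `ByteSnapshot`, the method names, and the five-minute retention default are all hypothetical.

```typescript
// Hypothetical sketch of snapshot-based windowed throughput.
interface ByteSnapshot {
  timestampMs: number;
  totalBytes: number; // cumulative bytes for the connection at that time
}

class SnapshotByteTracker {
  private snapshots: ByteSnapshot[] = [];
  private retentionMs: number;

  constructor(retentionMs: number = 5 * 60_000) {
    this.retentionMs = retentionMs; // assumed default, chosen for illustration
  }

  // Step 1: store a periodic snapshot of the cumulative byte count.
  record(timestampMs: number, totalBytes: number): void {
    this.snapshots.push({ timestampMs, totalBytes });
    const cutoff = timestampMs - this.retentionMs;
    // Keep the newest snapshot at or before the cutoff as a baseline; drop older ones.
    let keepFrom = 0;
    this.snapshots.forEach((s, i) => {
      if (s.timestampMs <= cutoff) keepFrom = i;
    });
    this.snapshots = this.snapshots.slice(keepFrom);
  }

  // Steps 2 and 3: delta between window start and end, divided by elapsed time.
  bytesPerSecond(nowMs: number, windowMs: number): number {
    const end = this.snapshots[this.snapshots.length - 1];
    let start = this.snapshots[0];
    if (start === undefined || end === undefined) return 0;
    const windowStart = nowMs - windowMs;
    // Use the latest snapshot at or before the window start, else the oldest one.
    for (const s of this.snapshots) {
      if (s.timestampMs <= windowStart) start = s;
      else break;
    }
    const elapsedMs = end.timestampMs - start.timestampMs;
    if (elapsedMs <= 0) return 0;
    return ((end.totalBytes - start.totalBytes) * 1000) / elapsedMs;
  }
}

// A connection moving a steady 1 GB per minute for three minutes.
const tracker = new SnapshotByteTracker();
tracker.record(0, 0);
tracker.record(60_000, 1e9);
tracker.record(120_000, 2e9);
tracker.record(180_000, 3e9);
// Only the last minute's delta (1 GB) counts, so the rate stays flat as the connection ages.
const lastMinuteRate = tracker.bytesPerSecond(180_000, 60_000);
```

Dividing by the time actually spanned by the two snapshots, rather than the nominal window, keeps the rate sensible when a connection is younger than the window.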
### What Gets Counted (Network Interface Throughput)
The byte tracking is designed to match network interface throughput (what UniFi and other network monitoring tools show):
@ -41,10 +55,13 @@ The byte tracking is designed to match network interface throughput (what Unifi/
The metrics system has three layers:
1. **Connection Records** (`record.bytesReceived/bytesSent`): Track total bytes per connection
2. **ThroughputTracker**: Accumulates bytes between samples for rate calculations (bytes/second)
3. **connectionByteTrackers**: Track bytes per connection with timestamps for per-route/IP metrics
2. **ThroughputTracker**: Accumulates bytes between samples for global rate calculations (resets each second)
3. **connectionByteTrackers**: Track bytes per connection with snapshots for accurate windowed per-route/IP metrics
Total byte counts come from connection records only, preventing double counting.
Key features:
- Global throughput uses sampling with accumulator reset (accurate)
- Per-route/IP throughput uses snapshots to calculate window-specific deltas (accurate)
- All byte counting happens exactly once at the data flow point
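
The "sampling with accumulator reset" approach used for global throughput could look roughly like this. It is a sketch under assumptions: `SampledThroughput` and its methods are hypothetical names, and a one-second sample interval is assumed so that each sample is already a bytes-per-second value.

```typescript
// Hypothetical sketch of an accumulator that is reset at every sample.
class SampledThroughput {
  private accumulatedBytes = 0;
  private samples: number[] = []; // bytes seen in each interval, newest last
  private maxSamples: number;

  constructor(maxSamples: number = 3600) {
    this.maxSamples = maxSamples; // assumed history length, for illustration
  }

  // Called once at the data flow point, so each byte is counted exactly once.
  add(bytes: number): void {
    this.accumulatedBytes += bytes;
  }

  // Called once per interval (assumed: every second). The reset is what
  // prevents old traffic from leaking into later rate calculations.
  takeSample(): void {
    this.samples.push(this.accumulatedBytes);
    this.accumulatedBytes = 0;
    if (this.samples.length > this.maxSamples) this.samples.shift();
  }

  // Average bytes per second over the most recent `windowSamples` samples.
  averageBytesPerSecond(windowSamples: number): number {
    if (windowSamples <= 0) return 0;
    const recent = this.samples.slice(-windowSamples);
    if (recent.length === 0) return 0;
    const sum = recent.reduce((a, b) => a + b, 0);
    return sum / recent.length;
  }
}
```

Because the accumulator starts from zero after every sample, each sample reflects only that interval's traffic, so the global rate cannot drift with uptime the way a cumulative total does.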
### Understanding "High" Byte Counts