# Module Adjustments for Metrics Collection

Command to reread CLAUDE.md: `cat /home/centraluser/eu.central.ingress-2/CLAUDE.md`

## SmartProxy Adjustments

### Current State
SmartProxy (@push.rocks/smartproxy) provides:
- Route-level `maxConnections` limiting
- Event emission system (currently only for certificates)
- NFTables integration with packet statistics
- Connection monitoring during active sessions

### Missing Capabilities for Metrics
1. **No Connection Lifecycle Events**
   - No `connection-open` or `connection-close` events
   - No way to track active connections in real time
   - No exposure of internal connection tracking

2. **No Statistics API**
   - No methods like `getActiveConnections()` or `getConnectionStats()`
   - No access to connection counts per route
   - No throughput or performance metrics exposed

3. **Limited Event System**
   - Currently only emits certificate-related events
   - No connection, request, or performance events

### Required Adjustments
1. **Add Connection Tracking Events**
   ```typescript
   // Proposed payload, emitted when a connection is accepted
   interface IConnectionOpenEvent {
     connectionId: string; // lets the matching close event be correlated
     type: 'http' | 'https' | 'websocket';
     routeName: string;
     clientIp: string;
     timestamp: Date;
   }

   // Proposed payload, emitted when a connection closes
   interface IConnectionCloseEvent {
     connectionId: string;
     duration: number; // time the connection was open
     bytesTransferred: number;
   }

   // Emit on new connection
   smartProxy.emit('connection-open', openEvent);
   // Emit on connection close
   smartProxy.emit('connection-close', closeEvent);
   ```

2. **Add Statistics API**
   ```typescript
   interface IProxyStats {
     getActiveConnections(): number;
     getConnectionsByRoute(): Map<string, number>;
     getTotalConnections(): number;
     getRequestsPerSecond(): number;
     getThroughput(): { bytesIn: number, bytesOut: number };
   }
   ```

3. **Expose Internal Metrics**
   - Make connection pools accessible
   - Expose route-level statistics
   - Provide request/response metrics

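To make the proposed statistics API more concrete, the sketch below keeps the counters behind `getActiveConnections()`, `getConnectionsByRoute()`, and `getTotalConnections()` up to date from open and close events. It is illustrative only: `ProxyStatsTracker`, `handleOpen`, `handleClose`, and `getBytesTransferred` are hypothetical names, and it assumes the open event carries a `connectionId` so that the matching close event can be paired with it.

```typescript
// Hypothetical event payloads, reduced to the fields the tracker needs.
interface IConnectionOpenEvent {
  connectionId: string;
  routeName: string;
}

interface IConnectionCloseEvent {
  connectionId: string;
  bytesTransferred: number;
}

// Keeps connection counters current from open/close events.
class ProxyStatsTracker {
  private active = new Map<string, string>(); // connectionId -> routeName
  private total = 0;
  private bytes = 0;

  handleOpen(event: IConnectionOpenEvent): void {
    this.active.set(event.connectionId, event.routeName);
    this.total++;
  }

  handleClose(event: IConnectionCloseEvent): void {
    // Ignore close events for connections that were never seen opening.
    if (!this.active.delete(event.connectionId)) return;
    this.bytes += event.bytesTransferred;
  }

  getActiveConnections(): number {
    return this.active.size;
  }

  getConnectionsByRoute(): Map<string, number> {
    const byRoute = new Map<string, number>();
    this.active.forEach((routeName) => {
      byRoute.set(routeName, (byRoute.get(routeName) ?? 0) + 1);
    });
    return byRoute;
  }

  getTotalConnections(): number {
    return this.total;
  }

  getBytesTransferred(): number {
    return this.bytes;
  }
}
```

Deriving the per-route counts from the set of active connections, rather than keeping a second counter, avoids the two drifting apart when a close event arrives for an unknown connection.
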
### Alternative Approach
Since SmartProxy is already used with socket handlers for email routing, we could:
1. Wrap all SmartProxy socket handlers with a metrics-aware wrapper
2. Use the existing socket-handler pattern to intercept all connections
3. Track connections at the dcrouter level rather than modifying SmartProxy

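The wrapper idea above could look roughly like the following. This is a minimal sketch under stated assumptions, not SmartProxy's actual API: `ISocketLike`, `SocketHandler`, `ConnectionCounter`, and `withMetrics` are hypothetical names, and the `(socket, routeName)` handler signature stands in for whatever the real socket-handler callback receives. The only requirement on the socket is a `once('close', ...)` method, which Node's `net.Socket` provides.

```typescript
// Minimal socket shape needed for tracking (assumption, see lead-in).
interface ISocketLike {
  once(event: 'close', listener: () => void): unknown;
}

// Hypothetical handler signature; the real callback type may differ.
type SocketHandler<S extends ISocketLike> = (socket: S, routeName: string) => void;

// Counts active and total connections per route at the dcrouter level.
class ConnectionCounter {
  private activeByRoute = new Map<string, number>();
  private total = 0;

  opened(routeName: string): void {
    this.total++;
    this.activeByRoute.set(routeName, this.getActive(routeName) + 1);
  }

  closed(routeName: string): void {
    // Clamp at zero in case a close is reported twice.
    this.activeByRoute.set(routeName, Math.max(0, this.getActive(routeName) - 1));
  }

  getActive(routeName: string): number {
    return this.activeByRoute.get(routeName) ?? 0;
  }

  getTotal(): number {
    return this.total;
  }
}

// Returns a handler that records open/close before delegating to the original.
function withMetrics<S extends ISocketLike>(
  handler: SocketHandler<S>,
  counter: ConnectionCounter,
): SocketHandler<S> {
  return (socket, routeName) => {
    counter.opened(routeName);
    socket.once('close', () => counter.closed(routeName));
    handler(socket, routeName);
  };
}
```

Because the wrapper only sees sockets that reach a handler, connections that SmartProxy rejects or forwards internally would not be counted.
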
## SmartDNS Adjustments

### Current State
SmartDNS (@push.rocks/smartdns) provides:
- DNS query handling via registered handlers
- Support for UDP (port 53) and DNS-over-HTTPS
- Domain pattern matching and routing
- DNSSEC support

### Missing Capabilities for Metrics
1. **No Query Tracking**
   - No counters for total queries
   - No breakdown by query type (A, AAAA, MX, etc.)
   - No domain popularity tracking

2. **No Performance Metrics**
   - No response time tracking
   - No cache hit/miss statistics
   - No error rate tracking

3. **No Event Emission**
   - No query lifecycle events
   - No cache events
   - No error events

### Required Adjustments
1. **Add Query Interceptor/Middleware**
   ```typescript
   // Wrap handler registration to add metrics
   smartDns.use((query, next) => {
     metricsCollector.trackQuery(query);
     const startTime = Date.now();

     next((response) => {
       metricsCollector.trackResponse(response, Date.now() - startTime);
     });
   });
   ```

2. **Add Event Emissions**
   ```typescript
   // Proposed payload for the 'query-received' event
   interface IQueryReceivedEvent {
     type: string; // record type, e.g. 'A', 'AAAA', 'MX'
     domain: string;
     source: 'udp' | 'https';
     clientIp: string;
   }

   // Proposed payload for the 'query-answered' event
   interface IQueryAnsweredEvent {
     cached: boolean;
     responseTime: number; // milliseconds, as measured with Date.now()
     responseCode: string;
   }

   // Query events
   smartDns.emit('query-received', receivedEvent);
   smartDns.emit('query-answered', answeredEvent);
   ```

3. **Add Statistics API**
   ```typescript
   interface IDnsStats {
     getTotalQueries(): number;
     getQueriesPerSecond(): number;
     getCacheStats(): { hits: number, misses: number, hitRate: number };
     getTopDomains(limit: number): Array<{ domain: string, count: number }>;
     getQueryTypeBreakdown(): Record<string, number>;
   }
   ```

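As an illustration of what could sit behind the proposed DNS statistics API, the sketch below implements the counting parts of it in memory. `DnsStats` and `recordQuery` are hypothetical names, and `getQueriesPerSecond()` is left out because it needs a time window that the proposal does not specify. Domains are lowercased before counting, since DNS names are case-insensitive.

```typescript
// In-memory counters for a subset of the proposed IDnsStats interface.
class DnsStats {
  private total = 0;
  private hits = 0;
  private domainCounts = new Map<string, number>();
  private typeCounts: Record<string, number> = {};

  // Hypothetical entry point, called once per answered query.
  recordQuery(domain: string, type: string, cached: boolean): void {
    this.total++;
    if (cached) this.hits++;
    const key = domain.toLowerCase();
    this.domainCounts.set(key, (this.domainCounts.get(key) ?? 0) + 1);
    this.typeCounts[type] = (this.typeCounts[type] ?? 0) + 1;
  }

  getTotalQueries(): number {
    return this.total;
  }

  getCacheStats(): { hits: number; misses: number; hitRate: number } {
    const misses = this.total - this.hits;
    // Report 0 instead of NaN before any query has been recorded.
    const hitRate = this.total === 0 ? 0 : this.hits / this.total;
    return { hits: this.hits, misses, hitRate };
  }

  getTopDomains(limit: number): Array<{ domain: string; count: number }> {
    const entries: Array<{ domain: string; count: number }> = [];
    this.domainCounts.forEach((count, domain) => entries.push({ domain, count }));
    // Highest count first, then keep only the requested number of entries.
    return entries.sort((a, b) => b.count - a.count).slice(0, limit);
  }

  getQueryTypeBreakdown(): Record<string, number> {
    // Return a copy so callers cannot mutate the internal counters.
    return { ...this.typeCounts };
  }
}
```

The domain map grows without bound in this form; a long-running service would likely need to cap or periodically reset it.
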
### Alternative Approach
Since we control the handler registration in dcrouter:
1. Create a metrics-aware handler wrapper at the dcrouter level
2. Wrap all DNS handlers before registration
3. Track metrics without modifying SmartDNS itself

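A handler wrapper along these lines might be sketched as follows. The types are stand-ins rather than SmartDNS's own: `IDnsQuestion`, `IQuerySample`, and `wrapDnsHandler` are hypothetical, and handlers are assumed to be synchronous functions from a question to an answer. The clock is injectable so the timing logic can be tested deterministically.

```typescript
// Stand-in for the question object a DNS handler receives (assumption).
interface IDnsQuestion {
  name: string;
  type: string;
}

// One measurement per handled query.
interface IQuerySample {
  name: string;
  type: string;
  durationMs: number;
  failed: boolean;
}

// Wraps a handler so that every call reports a sample, even if it throws.
function wrapDnsHandler<A>(
  handler: (question: IDnsQuestion) => A,
  onSample: (sample: IQuerySample) => void,
  now: () => number = Date.now,
): (question: IDnsQuestion) => A {
  return (question) => {
    const start = now();
    let failed = true;
    try {
      const answer = handler(question);
      failed = false; // only reached when the handler did not throw
      return answer;
    } finally {
      onSample({
        name: question.name,
        type: question.type,
        durationMs: now() - start,
        failed,
      });
    }
  };
}
```

In dcrouter, each handler would be passed through `wrapDnsHandler` before being registered, with `onSample` feeding whatever collector is in use. Cache hits and misses inside SmartDNS would remain invisible to this wrapper.
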
## Implementation Strategy

### Option 1: Fork and Modify Dependencies
- Fork @push.rocks/smartproxy and @push.rocks/smartdns
- Add metrics capabilities directly
- Maintain custom versions
- **Pros**: Clean integration, full control
- **Cons**: Maintenance burden, divergence from upstream

### Option 2: Wrapper Approach at DcRouter Level
- Create wrapper classes that intercept all operations
- Track metrics at the application level
- No modifications to dependencies
- **Pros**: No dependency modifications, easier to maintain
- **Cons**: May miss some internal events, slightly higher overhead

### Option 3: Contribute Back to Upstream
- Submit PRs to add metrics capabilities to original packages
- Work with maintainers to add event emissions and stats APIs
- **Pros**: Benefits everyone, no fork maintenance
- **Cons**: Slower process, may not align with maintainer vision

## Recommendation

**Use Option 2 (Wrapper Approach)** for immediate implementation:
1. Create `MetricsAwareProxy` and `MetricsAwareDns` wrapper classes
2. Intercept all operations and track metrics
3. Keep changes to the existing codebase minimal
4. Migrate to Option 3 later if upstream accepts contributions

This approach lets us collect metrics without modifying external dependencies, which preserves compatibility with upstream releases and avoids the burden of maintaining forks. Its known limitation is that events internal to SmartProxy and SmartDNS stay invisible to the wrappers.