Files
dcrouter/readme.metrics.md
Philipp Kunz 93995d5031 Implement Metrics Manager and Integrate Metrics Collection
- Removed the existing readme.opsserver.md file as it is no longer needed.
- Added a new MetricsManager class to handle metrics collection using @push.rocks/smartmetrics.
- Integrated MetricsManager into the DcRouter and OpsServer classes.
- Updated StatsHandler and SecurityHandler to retrieve metrics from MetricsManager.
- Implemented methods for tracking email, DNS, and security metrics.
- Added connection tracking capabilities to the MetricsManager.
- Created a new readme.metrics.md file outlining the metrics implementation plan.
- Adjusted plugins.ts to include smartmetrics.
- Added a new monitoring directory with classes for metrics management.
- Created readme.module-adjustments.md to document necessary adjustments for SmartProxy and SmartDNS.
2025-06-09 16:03:27 +00:00

202 lines
6.1 KiB
Markdown

# Metrics Implementation Plan with @push.rocks/smartmetrics
Command to reread CLAUDE.md: `cat /home/centraluser/eu.central.ingress-2/CLAUDE.md`
## Overview
This plan outlines the migration from placeholder/demo metrics to real metrics using @push.rocks/smartmetrics for the dcrouter project.
## Current State Analysis
### Currently Implemented (Real Data)
- CPU usage (basic calculation from os.loadavg)
- Memory usage (from process.memoryUsage)
- System uptime
### Currently Stubbed (Returns 0 or Demo Data)
- Active connections (HTTP/HTTPS/WebSocket)
- Total connections
- Requests per second
- Email statistics (sent/received/failed/queued/bounce rate)
- DNS statistics (queries/cache hits/response times)
- Security metrics (blocked IPs/auth failures/spam detection)
## Implementation Plan
### Phase 1: Core Infrastructure Setup
1. **Install Dependencies**
```bash
pnpm install --save @push.rocks/smartmetrics
```
2. **Update plugins.ts**
- Add smartmetrics to ts/plugins.ts
- Import as: `import * as smartmetrics from '@push.rocks/smartmetrics';`
3. **Create Metrics Manager Class**
- Location: `ts/monitoring/classes.metricsmanager.ts`
- Initialize SmartMetrics with existing logger
- Configure for dcrouter service identification
- Set up automatic metric collection intervals
### Phase 2: Connection Tracking Implementation
1. **HTTP/HTTPS Connection Tracking**
- Instrument the SmartProxy connection handlers
- Track active connections in real-time
- Monitor connection lifecycle (open/close events)
- Location: Update connection managers in routing system
2. **Email Connection Tracking**
- Instrument SMTP server connection handlers
- Track both incoming and outgoing connections
- Location: `ts/mail/delivery/smtpserver/connection-manager.ts`
3. **DNS Query Tracking**
- Instrument DNS server handlers
- Track query counts and response times
- Location: `ts/mail/routing/classes.dns.manager.ts`
### Phase 3: Email Metrics Collection
1. **Email Processing Metrics**
- Track sent/received/failed emails
- Monitor queue sizes
- Calculate delivery and bounce rates
- Location: Instrument `classes.delivery.queue.ts` and `classes.emailsendjob.ts`
2. **Email Performance Metrics**
- Track processing times
- Monitor queue throughput
- Location: Update delivery system classes
### Phase 4: Security Metrics Integration
1. **Security Event Tracking**
- Track blocked IPs from IPReputationChecker
- Monitor authentication failures
- Count spam/malware/phishing detections
- Location: Instrument security classes in `ts/security/`
### Phase 5: Stats Handler Refactoring
1. **Update Stats Handler**
- Location: `ts/opsserver/handlers/stats.handler.ts`
- Replace all stub implementations with MetricsManager calls
- Maintain existing API interface structure
2. **Metrics Aggregation**
- Implement proper time-window aggregations
- Add historical data storage (last hour/day)
- Calculate rates and percentages accurately
### Phase 6: Prometheus Integration (Optional Enhancement)
1. **Enable Prometheus Endpoint**
- Add Prometheus metrics endpoint
- Configure port (default: 9090)
- Document metrics for monitoring systems
## Implementation Details
### MetricsManager Core Structure
```typescript
export class MetricsManager {
private smartMetrics: smartmetrics.SmartMetrics;
private connectionTrackers: Map<string, ConnectionTracker>;
private emailMetrics: EmailMetricsCollector;
private dnsMetrics: DnsMetricsCollector;
private securityMetrics: SecurityMetricsCollector;
// Real-time counters
private activeConnections = {
http: 0,
https: 0,
websocket: 0,
smtp: 0
};
// Initialize and start collection
public async start(): Promise<void>;
// Get aggregated metrics for stats handler
public async getServerStats(): Promise<IServerStats>;
public async getEmailStats(): Promise<IEmailStats>;
public async getDnsStats(): Promise<IDnsStats>;
public async getSecurityStats(): Promise<ISecurityStats>;
}
```
### Connection Tracking Pattern
```typescript
// Example for HTTP connections
onConnectionOpen(type: string) {
this.activeConnections[type]++;
this.totalConnections[type]++;
}
onConnectionClose(type: string) {
this.activeConnections[type]--;
}
```
### Email Metrics Pattern
```typescript
// Track email events
onEmailSent() { this.emailsSentToday++; }
onEmailReceived() { this.emailsReceivedToday++; }
onEmailFailed() { this.emailsFailedToday++; }
onEmailQueued() { this.queueSize++; }
onEmailDequeued() { this.queueSize--; }
```
## Testing Strategy
1. **Unit Tests**
- Test MetricsManager initialization
- Test metric collection accuracy
- Test aggregation calculations
2. **Integration Tests**
- Test metrics flow from source to API
- Verify real-time updates
- Test under load conditions
3. **Debug Utilities**
- Create `.nogit/debug/test-metrics.ts` for quick testing
- Add metrics dump endpoint for debugging
## Migration Steps
1. Implement MetricsManager without breaking existing code
2. Wire up one metric type at a time
3. Verify each metric shows real data
4. Remove TODO comments from stats handler
5. Update tests to expect real values
## Success Criteria
- [ ] All metrics show real, accurate data
- [ ] No performance degradation
- [ ] Metrics update in real-time
- [ ] Historical data is collected
- [ ] All TODO comments removed from stats handler
- [ ] Tests pass with real metric values
## Notes
- SmartMetrics provides CPU and memory metrics out of the box
- We'll need custom collectors for application-specific metrics
- Consider adding metric persistence for historical data
- Prometheus integration provides industry-standard monitoring
## Questions to Address
1. Should we persist metrics to disk for historical analysis?
2. What time windows should we support (5min, 1hour, 1day)?
3. Should we add alerting thresholds?
4. Do we need custom metric types beyond the current interface?
---
This plan ensures a systematic migration from demo metrics to real, actionable data using @push.rocks/smartmetrics while maintaining the existing API structure and adding powerful monitoring capabilities.