- Removed the existing readme.opsserver.md file as it is no longer needed.
- Added a new MetricsManager class to handle metrics collection using @push.rocks/smartmetrics.
- Integrated MetricsManager into the DcRouter and OpsServer classes.
- Updated StatsHandler and SecurityHandler to retrieve metrics from MetricsManager.
- Implemented methods for tracking email, DNS, and security metrics.
- Added connection tracking capabilities to the MetricsManager.
- Created a new readme.metrics.md file outlining the metrics implementation plan.
- Adjusted plugins.ts to include smartmetrics.
- Added a new monitoring directory with classes for metrics management.
- Created readme.module-adjustments.md to document necessary adjustments for SmartProxy and SmartDNS.

# Metrics Implementation Plan with @push.rocks/smartmetrics

Command to reread CLAUDE.md: `cat /home/centraluser/eu.central.ingress-2/CLAUDE.md`

## Overview

This plan outlines the migration from placeholder/demo metrics to real metrics using @push.rocks/smartmetrics for the dcrouter project.

## Current State Analysis

### Currently Implemented (Real Data)

- CPU usage (basic calculation from os.loadavg)
- Memory usage (from process.memoryUsage)
- System uptime

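As a rough illustration of the "basic calculation from os.loadavg" listed above, the sketch below turns a 1-minute load average into an approximate percentage. The function name `cpuPercentFromLoad` and the clamping to 100 are assumptions made for this example, not code taken from dcrouter.

```typescript
// Hypothetical helper: approximates CPU usage from the 1-minute load average.
// Dividing the load by the core count gives a rough utilisation ratio; it is
// an estimate, not a measured share of CPU time.
function cpuPercentFromLoad(load1: number, coreCount: number): number {
  if (coreCount <= 0) {
    return 0;
  }
  const percent = (load1 / coreCount) * 100;
  // Clamp to 0..100 because the load can exceed the number of cores,
  // then round to one decimal place
  return Math.min(100, Math.max(0, Math.round(percent * 10) / 10));
}

// In Node.js the inputs would come from the os module, for example:
//   cpuPercentFromLoad(os.loadavg()[0], os.cpus().length)
```

Note that `os.loadavg()` always returns `[0, 0, 0]` on Windows, so a load-based figure is only meaningful on Unix-like hosts.
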
### Currently Stubbed (Returns 0 or Demo Data)

- Active connections (HTTP/HTTPS/WebSocket)
- Total connections
- Requests per second
- Email statistics (sent/received/failed/queued/bounce rate)
- DNS statistics (queries/cache hits/response times)
- Security metrics (blocked IPs/auth failures/spam detection)

## Implementation Plan

### Phase 1: Core Infrastructure Setup

1. **Install Dependencies**
   ```bash
   pnpm install --save @push.rocks/smartmetrics
   ```

2. **Update plugins.ts**
   - Add smartmetrics to ts/plugins.ts
   - Import as: `import * as smartmetrics from '@push.rocks/smartmetrics';`

3. **Create Metrics Manager Class**
   - Location: `ts/monitoring/classes.metricsmanager.ts`
   - Initialize SmartMetrics with existing logger
   - Configure for dcrouter service identification
   - Set up automatic metric collection intervals

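The "automatic metric collection intervals" in step 3 could be driven by a small start/stop wrapper around a timer. This is a minimal sketch under stated assumptions: `PeriodicCollector` and its `collect` callback are hypothetical names, and nothing here is part of the smartmetrics API.

```typescript
// Hypothetical helper: runs a collection callback on a fixed interval.
class PeriodicCollector {
  private timer: ReturnType<typeof setInterval> | undefined;
  private readonly collect: () => void;
  private readonly intervalMs: number;

  constructor(collect: () => void, intervalMs: number) {
    this.collect = collect;
    this.intervalMs = intervalMs;
  }

  // Starting twice is a no-op, so callers do not create duplicate timers
  public start(): void {
    if (this.timer !== undefined) {
      return;
    }
    this.collect(); // take a first sample immediately
    this.timer = setInterval(this.collect, this.intervalMs);
  }

  // Stops the timer so the process can shut down cleanly
  public stop(): void {
    if (this.timer !== undefined) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
  }

  public get isRunning(): boolean {
    return this.timer !== undefined;
  }
}
```

A MetricsManager `start()` method could create one such collector per metric group and stop them all when the service shuts down.
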
### Phase 2: Connection Tracking Implementation

1. **HTTP/HTTPS Connection Tracking**
   - Instrument the SmartProxy connection handlers
   - Track active connections in real-time
   - Monitor connection lifecycle (open/close events)
   - Location: Update connection managers in routing system

2. **Email Connection Tracking**
   - Instrument SMTP server connection handlers
   - Track both incoming and outgoing connections
   - Location: `ts/mail/delivery/smtpserver/connection-manager.ts`

3. **DNS Query Tracking**
   - Instrument DNS server handlers
   - Track query counts and response times
   - Location: `ts/mail/routing/classes.dns.manager.ts`

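For the DNS query tracking step, one possible shape is a tracker that wraps each handler call and records its duration. The class `DnsQueryTracker` and its method names are hypothetical and only illustrate the counting logic; the actual DNS handler interface in dcrouter is not assumed here.

```typescript
// Hypothetical collector: counts DNS queries and tracks response times.
class DnsQueryTracker {
  private totalQueries = 0;
  private cacheHits = 0;
  private totalResponseMs = 0;

  // Record one finished query with its duration in milliseconds
  public recordQuery(durationMs: number, fromCache: boolean): void {
    this.totalQueries++;
    this.totalResponseMs += durationMs;
    if (fromCache) {
      this.cacheHits++;
    }
  }

  // Wrap a handler so its duration is recorded even when it throws
  public async track<T>(handler: () => Promise<T>, fromCache = false): Promise<T> {
    const startedAt = Date.now();
    try {
      return await handler();
    } finally {
      this.recordQuery(Date.now() - startedAt, fromCache);
    }
  }

  // Aggregated view; rates are 0 while no query has been recorded
  public getStats() {
    const total = this.totalQueries;
    return {
      totalQueries: total,
      cacheHits: this.cacheHits,
      cacheHitRate: total === 0 ? 0 : this.cacheHits / total,
      averageResponseTimeMs: total === 0 ? 0 : this.totalResponseMs / total,
    };
  }
}
```

Keeping a running sum instead of every sample keeps memory constant, at the cost of not being able to report percentiles.
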
### Phase 3: Email Metrics Collection

1. **Email Processing Metrics**
   - Track sent/received/failed emails
   - Monitor queue sizes
   - Calculate delivery and bounce rates
   - Location: Instrument `classes.delivery.queue.ts` and `classes.emailsendjob.ts`

2. **Email Performance Metrics**
   - Track processing times
   - Monitor queue throughput
   - Location: Update delivery system classes

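The delivery and bounce rates could be derived from raw counters roughly as follows. This sketch assumes both rates are expressed as a percentage of sent emails; the `EmailCounters` interface and the function name are hypothetical, and the plan does not fix these definitions.

```typescript
// Hypothetical counter shape; field names are assumptions for this example
interface EmailCounters {
  sent: number;
  delivered: number;
  bounced: number;
}

// Percentage with one decimal place; returns 0 when there is no traffic yet
function toPercent(part: number, total: number): number {
  if (total <= 0) {
    return 0;
  }
  return Math.round((part * 1000) / total) / 10;
}

function calculateEmailRates(counters: EmailCounters) {
  return {
    deliveryRate: toPercent(counters.delivered, counters.sent),
    bounceRate: toPercent(counters.bounced, counters.sent),
  };
}
```

Guarding the zero-traffic case matters here because a freshly started service would otherwise report `NaN` to the stats API.
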
### Phase 4: Security Metrics Integration

1. **Security Event Tracking**
   - Track blocked IPs from IPReputationChecker
   - Monitor authentication failures
   - Count spam/malware/phishing detections
   - Location: Instrument security classes in `ts/security/`

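Security events could be counted per type, with blocked IPs kept as a set so that repeated blocks of the same address are not double counted. The event type names and the `SecurityEventCounter` class below are hypothetical; how IPReputationChecker reports events is not assumed.

```typescript
// Hypothetical event names, chosen to mirror the list above
type SecurityEventType = 'blockedIp' | 'authFailure' | 'spam' | 'malware' | 'phishing';

class SecurityEventCounter {
  private counts = new Map<SecurityEventType, number>();
  private blockedIps = new Set<string>();

  // Count the event; for blocked IPs also remember the unique address
  public record(type: SecurityEventType, ip?: string): void {
    this.counts.set(type, (this.counts.get(type) ?? 0) + 1);
    if (type === 'blockedIp' && ip !== undefined) {
      this.blockedIps.add(ip);
    }
  }

  public getStats() {
    return {
      blockedIps: Array.from(this.blockedIps),
      authFailures: this.counts.get('authFailure') ?? 0,
      spamDetected: this.counts.get('spam') ?? 0,
      malwareDetected: this.counts.get('malware') ?? 0,
      phishingDetected: this.counts.get('phishing') ?? 0,
    };
  }
}
```
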
### Phase 5: Stats Handler Refactoring

1. **Update Stats Handler**
   - Location: `ts/opsserver/handlers/stats.handler.ts`
   - Replace all stub implementations with MetricsManager calls
   - Maintain existing API interface structure

2. **Metrics Aggregation**
   - Implement proper time-window aggregations
   - Add historical data storage (last hour/day)
   - Calculate rates and percentages accurately

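A time-window aggregation such as "requests per second" can be sketched with a sliding window over event timestamps. `SlidingWindowCounter` is a hypothetical name, and storing one timestamp per event is an assumption that suits moderate traffic; fixed-size buckets would use less memory under heavy load.

```typescript
// Hypothetical helper: counts events inside the window (now - windowMs, now].
class SlidingWindowCounter {
  private timestamps: number[] = [];
  private readonly windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Record one event; timestamps are expected in non-decreasing order
  public record(now: number = Date.now()): void {
    this.timestamps.push(now);
  }

  // Drop expired entries from the front, then count what is left
  public count(now: number = Date.now()): number {
    const cutoff = now - this.windowMs;
    while (this.timestamps.length > 0) {
      const oldest = this.timestamps[0];
      if (oldest === undefined || oldest > cutoff) {
        break;
      }
      this.timestamps.shift();
    }
    return this.timestamps.length;
  }

  // Average events per second over the whole window
  public ratePerSecond(now: number = Date.now()): number {
    return this.count(now) / (this.windowMs / 1000);
  }
}
```

Passing `now` explicitly keeps the class deterministic in unit tests, while production callers can rely on the `Date.now()` default.
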
### Phase 6: Prometheus Integration (Optional Enhancement)

1. **Enable Prometheus Endpoint**
   - Add Prometheus metrics endpoint
   - Configure port (default: 9090)
   - Document metrics for monitoring systems

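To give an idea of what such an endpoint would return, the sketch below renders gauge values in the Prometheus text exposition format. The function, the `GaugeSample` interface, and the metric name in the usage comment are hypothetical; this hand-written renderer only illustrates the output format and is not taken from any library.

```typescript
// Hypothetical sample shape for this illustration
interface GaugeSample {
  name: string;
  help: string;
  value: number;
  labels?: Record<string, string>;
}

// Label values must escape backslash, double quote and newline
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\n/g, '\\n');
}

// Assumes samples of the same metric are passed next to each other,
// so HELP and TYPE are emitted once per metric name
function renderPrometheusGauges(samples: GaugeSample[]): string {
  const lines: string[] = [];
  const seen = new Set<string>();
  for (const sample of samples) {
    if (!seen.has(sample.name)) {
      seen.add(sample.name);
      lines.push(`# HELP ${sample.name} ${sample.help}`);
      lines.push(`# TYPE ${sample.name} gauge`);
    }
    const labels = sample.labels ?? {};
    const pairs = Object.keys(labels).map((key) => `${key}="${escapeLabelValue(labels[key] ?? '')}"`);
    const labelText = pairs.length > 0 ? `{${pairs.join(',')}}` : '';
    lines.push(`${sample.name}${labelText} ${sample.value}`);
  }
  return lines.join('\n') + '\n';
}

// Example input (metric name is made up):
//   renderPrometheusGauges([{ name: 'dcrouter_active_connections',
//     help: 'Active connections', value: 3, labels: { type: 'http' } }])
```
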
## Implementation Details

### MetricsManager Core Structure
```typescript
// Structural outline only: method bodies are omitted
export class MetricsManager {
  private smartMetrics: smartmetrics.SmartMetrics;
  private connectionTrackers: Map<string, ConnectionTracker>;
  private emailMetrics: EmailMetricsCollector;
  private dnsMetrics: DnsMetricsCollector;
  private securityMetrics: SecurityMetricsCollector;

  // Real-time counters
  private activeConnections = {
    http: 0,
    https: 0,
    websocket: 0,
    smtp: 0
  };

  // Initialize and start collection
  public async start(): Promise<void>;

  // Get aggregated metrics for stats handler
  public async getServerStats(): Promise<IServerStats>;
  public async getEmailStats(): Promise<IEmailStats>;
  public async getDnsStats(): Promise<IDnsStats>;
  public async getSecurityStats(): Promise<ISecurityStats>;
}
```

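### Connection Tracking Pattern
```typescript
// Example for HTTP connections
type ConnectionType = 'http' | 'https' | 'websocket' | 'smtp';
const zeroCounts = (): Record<ConnectionType, number> => ({ http: 0, https: 0, websocket: 0, smtp: 0 });

export class MetricsManager {
  private activeConnections = zeroCounts();
  private totalConnections = zeroCounts(); // lifetime count, never decremented

  onConnectionOpen(type: ConnectionType) {
    this.activeConnections[type]++;
    this.totalConnections[type]++;
  }

  onConnectionClose(type: ConnectionType) {
    // Clamp at zero so a duplicate close event cannot produce a negative count
    this.activeConnections[type] = Math.max(0, this.activeConnections[type] - 1);
  }
}
```
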
### Email Metrics Pattern
```typescript
// Track email events
onEmailSent() { this.emailsSentToday++; }
onEmailReceived() { this.emailsReceivedToday++; }
onEmailFailed() { this.emailsFailedToday++; }
onEmailQueued() { this.queueSize++; }
onEmailDequeued() { this.queueSize--; }
```

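The `...Today` counters in the pattern above imply a reset at the day boundary. One way to get that without a scheduled job is a counter that rolls over lazily on access. `DailyCounter` is a hypothetical helper, and using the UTC calendar day as the boundary is an assumption.

```typescript
// Hypothetical helper: a counter that resets when the UTC day changes.
class DailyCounter {
  private count = 0;
  private day: string;

  constructor(now: Date = new Date()) {
    this.day = DailyCounter.dayKey(now);
  }

  // "YYYY-MM-DD" in UTC, taken from the ISO timestamp
  private static dayKey(date: Date): string {
    return date.toISOString().slice(0, 10);
  }

  // Reset the count if the day has changed since the last access
  private rollOver(now: Date): void {
    const key = DailyCounter.dayKey(now);
    if (key !== this.day) {
      this.day = key;
      this.count = 0;
    }
  }

  public increment(now: Date = new Date()): void {
    this.rollOver(now);
    this.count++;
  }

  public get(now: Date = new Date()): number {
    this.rollOver(now);
    return this.count;
  }
}
```

With this helper, `onEmailSent()` would reduce to a call such as `this.emailsSentToday.increment()`, assuming the field is changed from a number to a `DailyCounter`.
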
## Testing Strategy

1. **Unit Tests**
   - Test MetricsManager initialization
   - Test metric collection accuracy
   - Test aggregation calculations

2. **Integration Tests**
   - Test metrics flow from source to API
   - Verify real-time updates
   - Test under load conditions

3. **Debug Utilities**
   - Create `.nogit/debug/test-metrics.ts` for quick testing
   - Add metrics dump endpoint for debugging

## Migration Steps

1. Implement MetricsManager without breaking existing code
2. Wire up one metric type at a time
3. Verify each metric shows real data
4. Remove TODO comments from stats handler
5. Update tests to expect real values

## Success Criteria

- [ ] All metrics show real, accurate data
- [ ] No performance degradation
- [ ] Metrics update in real-time
- [ ] Historical data is collected
- [ ] All TODO comments removed from stats handler
- [ ] Tests pass with real metric values

## Notes

- SmartMetrics provides CPU and memory metrics out of the box
- We'll need custom collectors for application-specific metrics
- Consider adding metric persistence for historical data
- Prometheus integration provides industry-standard monitoring

## Questions to Address

1. Should we persist metrics to disk for historical analysis?
2. What time windows should we support (5min, 1hour, 1day)?
3. Should we add alerting thresholds?
4. Do we need custom metric types beyond the current interface?

---

This plan ensures a systematic migration from demo metrics to real, actionable data using @push.rocks/smartmetrics while maintaining the existing API structure and adding powerful monitoring capabilities.