dcrouter/readme.metrics.md
Philipp Kunz 93995d5031 Implement Metrics Manager and Integrate Metrics Collection
- Removed the existing readme.opsserver.md file as it is no longer needed.
- Added a new MetricsManager class to handle metrics collection using @push.rocks/smartmetrics.
- Integrated MetricsManager into the DcRouter and OpsServer classes.
- Updated StatsHandler and SecurityHandler to retrieve metrics from MetricsManager.
- Implemented methods for tracking email, DNS, and security metrics.
- Added connection tracking capabilities to the MetricsManager.
- Created a new readme.metrics.md file outlining the metrics implementation plan.
- Adjusted plugins.ts to include smartmetrics.
- Added a new monitoring directory with classes for metrics management.
- Created readme.module-adjustments.md to document necessary adjustments for SmartProxy and SmartDNS.
2025-06-09 16:03:27 +00:00

Metrics Implementation Plan with @push.rocks/smartmetrics

Command to reread CLAUDE.md: cat /home/centraluser/eu.central.ingress-2/CLAUDE.md

Overview

This plan outlines the migration from placeholder/demo metrics to real metrics using @push.rocks/smartmetrics for the dcrouter project.

Current State Analysis

Currently Implemented (Real Data)

  • CPU usage (basic calculation from os.loadavg)
  • Memory usage (from process.memoryUsage)
  • System uptime

Currently Stubbed (Returns 0 or Demo Data)

  • Active connections (HTTP/HTTPS/WebSocket)
  • Total connections
  • Requests per second
  • Email statistics (sent/received/failed/queued/bounce rate)
  • DNS statistics (queries/cache hits/response times)
  • Security metrics (blocked IPs/auth failures/spam detection)

Implementation Plan

Phase 1: Core Infrastructure Setup

  1. Install Dependencies

    pnpm install --save @push.rocks/smartmetrics
    
  2. Update plugins.ts

    • Add smartmetrics to ts/plugins.ts
    • Import as: import * as smartmetrics from '@push.rocks/smartmetrics';
  3. Create Metrics Manager Class

    • Location: ts/monitoring/classes.metricsmanager.ts
    • Initialize SmartMetrics with existing logger
    • Configure for dcrouter service identification
    • Set up automatic metric collection intervals

Phase 2: Connection Tracking Implementation

  1. HTTP/HTTPS Connection Tracking

    • Instrument the SmartProxy connection handlers
    • Track active connections in real-time
    • Monitor connection lifecycle (open/close events)
    • Location: Update connection managers in routing system
  2. Email Connection Tracking

    • Instrument SMTP server connection handlers
    • Track both incoming and outgoing connections
    • Location: ts/mail/delivery/smtpserver/connection-manager.ts
  3. DNS Query Tracking

    • Instrument DNS server handlers
    • Track query counts and response times
    • Location: ts/mail/routing/classes.dns.manager.ts
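
The DNS query tracking item above could be sketched roughly as follows. This is an illustration only, not the dcrouter implementation: `DnsQueryTracker`, `recordQuery`, and `snapshot` are hypothetical names, and the assumption is that the DNS handler calls `recordQuery` once per answered query with the measured handling time.

```typescript
// Hypothetical helper; names are illustrative and not part of dcrouter or smartmetrics.
interface DnsSnapshot {
  totalQueries: number;
  cacheHits: number;
  cacheHitRate: number;      // percentage, 0 when no queries were seen
  averageResponseMs: number; // 0 when no queries were seen
}

class DnsQueryTracker {
  private totalQueries = 0;
  private cacheHits = 0;
  private totalResponseMs = 0;

  // Call once per answered query with the measured handling time.
  public recordQuery(responseMs: number, cacheHit: boolean): void {
    this.totalQueries++;
    this.totalResponseMs += responseMs;
    if (cacheHit) {
      this.cacheHits++;
    }
  }

  // Derive rates at read time so the hot path only increments counters.
  public snapshot(): DnsSnapshot {
    const n = this.totalQueries;
    return {
      totalQueries: n,
      cacheHits: this.cacheHits,
      cacheHitRate: n === 0 ? 0 : (this.cacheHits / n) * 100,
      averageResponseMs: n === 0 ? 0 : this.totalResponseMs / n,
    };
  }
}
```

Keeping only running sums means the average covers the whole process lifetime; a per-window average would need the aggregation described in Phase 5.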

Phase 3: Email Metrics Collection

  1. Email Processing Metrics

    • Track sent/received/failed emails
    • Monitor queue sizes
    • Calculate delivery and bounce rates
    • Location: Instrument classes.delivery.queue.ts and classes.emailsendjob.ts
  2. Email Performance Metrics

    • Track processing times
    • Monitor queue throughput
    • Location: Update delivery system classes
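
As a sketch of the rate calculation mentioned above, the following assumes that the delivery rate is sent / (sent + failed) and the bounce rate is bounced / sent. The plan does not define either ratio, so both definitions, along with the names `EmailCounts` and `calculateEmailRates`, are assumptions for illustration.

```typescript
// Hypothetical shapes and names; the ratio definitions are assumptions, not dcrouter's.
interface EmailCounts {
  sent: number;
  failed: number;
  bounced: number;
}

// Percentage rounded to one decimal place; 0 when the denominator is 0.
function toPercent(part: number, whole: number): number {
  return whole === 0 ? 0 : Math.round((part / whole) * 1000) / 10;
}

function calculateEmailRates(counts: EmailCounts): { deliveryRate: number; bounceRate: number } {
  const attempted = counts.sent + counts.failed;
  return {
    deliveryRate: toPercent(counts.sent, attempted),
    bounceRate: toPercent(counts.bounced, counts.sent),
  };
}
```

Guarding the zero denominator matters here because a freshly started service has no traffic yet and the stats endpoint should return 0 rather than NaN.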

Phase 4: Security Metrics Integration

  1. Security Event Tracking
    • Track blocked IPs from IPReputationChecker
    • Monitor authentication failures
    • Count spam/malware/phishing detections
    • Location: Instrument security classes in ts/security/
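
One possible shape for the security event tracking above is a single counter object that the security classes report into. This is a sketch under assumptions: `SecurityEvent` and `SecurityEventCounter` are hypothetical names, and how IPReputationChecker would actually emit events is not specified in this plan.

```typescript
// Hypothetical event model; not the actual dcrouter security API.
type SecurityEvent =
  | { kind: 'blocked-ip'; ip: string }
  | { kind: 'auth-failure'; ip: string }
  | { kind: 'detection'; category: 'spam' | 'malware' | 'phishing' };

class SecurityEventCounter {
  private blockedIps = new Set<string>(); // a Set avoids counting the same IP twice
  private authFailures = 0;
  private detections = { spam: 0, malware: 0, phishing: 0 };

  public record(event: SecurityEvent): void {
    switch (event.kind) {
      case 'blocked-ip':
        this.blockedIps.add(event.ip);
        break;
      case 'auth-failure':
        this.authFailures++;
        break;
      case 'detection':
        this.detections[event.category]++;
        break;
    }
  }

  // Returns copies so callers cannot mutate the internal counters.
  public snapshot() {
    return {
      blockedIps: Array.from(this.blockedIps),
      authFailures: this.authFailures,
      detections: { ...this.detections },
    };
  }
}
```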

Phase 5: Stats Handler Refactoring

  1. Update Stats Handler

    • Location: ts/opsserver/handlers/stats.handler.ts
    • Replace all stub implementations with MetricsManager calls
    • Maintain existing API interface structure
  2. Metrics Aggregation

    • Implement proper time-window aggregations
    • Add historical data storage (last hour/day)
    • Calculate rates and percentages accurately
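
The time-window aggregation above could start from something like the sliding-window counter below, which would be enough to derive requests per second. It is a minimal sketch: `SlidingWindowCounter` is a hypothetical name, and timestamps are passed in explicitly so the behavior can be tested without waiting on a real clock.

```typescript
// Hypothetical helper; stores one timestamp per event inside the window.
class SlidingWindowCounter {
  private readonly windowMs: number;
  private timestamps: number[] = [];

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  public record(nowMs: number = Date.now()): void {
    this.timestamps.push(nowMs);
  }

  // Number of events recorded within the last windowMs milliseconds.
  public count(nowMs: number = Date.now()): number {
    const cutoff = nowMs - this.windowMs;
    // Drop events that have fallen out of the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    return this.timestamps.length;
  }

  // Average event rate over the full window length.
  public ratePerSecond(nowMs: number = Date.now()): number {
    return this.count(nowMs) / (this.windowMs / 1000);
  }
}
```

Storing every timestamp keeps the sketch simple but grows with traffic. For the hour and day windows mentioned above, fixed-size buckets (for example one counter per minute) would likely be a better fit.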

Phase 6: Prometheus Integration (Optional Enhancement)

  1. Enable Prometheus Endpoint
    • Add Prometheus metrics endpoint
    • Configure port (default: 9090)
    • Document metrics for monitoring systems
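
To document what a monitoring system would scrape, it can help to see the Prometheus text exposition format spelled out. The sketch below renders gauge samples into that format by hand, independent of whatever endpoint support smartmetrics itself provides. `GaugeSample`, `renderPrometheus`, and the metric name in the usage note are hypothetical.

```typescript
// Hypothetical sample shape; samples sharing a name are assumed to be adjacent in the input.
interface GaugeSample {
  name: string;
  help: string;
  labels?: Record<string, string>;
  value: number;
}

// Escape backslash, double quote, and newline as the text format requires for label values.
function escapeLabel(value: string): string {
  return value.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\n/g, '\\n');
}

function renderPrometheus(samples: GaugeSample[]): string {
  const seen = new Set<string>();
  const lines: string[] = [];
  for (const s of samples) {
    // Emit HELP and TYPE once per metric name, before its first sample.
    if (!seen.has(s.name)) {
      seen.add(s.name);
      lines.push(`# HELP ${s.name} ${s.help}`);
      lines.push(`# TYPE ${s.name} gauge`);
    }
    const labelText = Object.entries(s.labels ?? {})
      .map(([k, v]) => `${k}="${escapeLabel(v)}"`)
      .join(',');
    lines.push(labelText ? `${s.name}{${labelText}} ${s.value}` : `${s.name} ${s.value}`);
  }
  return lines.join('\n') + '\n';
}
```

For example, two samples named `dcrouter_active_connections` with labels `type="http"` and `type="smtp"` would produce one HELP line, one TYPE line, and two value lines.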

Implementation Details

MetricsManager Core Structure

export class MetricsManager {
  private smartMetrics: smartmetrics.SmartMetrics;
  private connectionTrackers: Map<string, ConnectionTracker>;
  private emailMetrics: EmailMetricsCollector;
  private dnsMetrics: DnsMetricsCollector;
  private securityMetrics: SecurityMetricsCollector;
  
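  // Real-time counters
  private activeConnections = {
    http: 0,
    https: 0,
    websocket: 0,
    smtp: 0
  };

  // Cumulative counters, incremented in onConnectionOpen (see Connection Tracking Pattern)
  private totalConnections = {
    http: 0,
    https: 0,
    websocket: 0,
    smtp: 0
  };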
  
  // Initialize and start collection
  public async start(): Promise<void>;
  
  // Get aggregated metrics for stats handler
  public async getServerStats(): Promise<IServerStats>;
  public async getEmailStats(): Promise<IEmailStats>;
  public async getDnsStats(): Promise<IDnsStats>;
  public async getSecurityStats(): Promise<ISecurityStats>;
}

Connection Tracking Pattern

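// Example for HTTP connections (methods on MetricsManager).
// ConnectionType is an illustrative alias restricting `type` to the counter keys,
// so the index access type-checks under strict mode.
type ConnectionType = 'http' | 'https' | 'websocket' | 'smtp';

onConnectionOpen(type: ConnectionType) {
  this.activeConnections[type]++;
  this.totalConnections[type]++; // cumulative counter with the same keys as activeConnections
}

onConnectionClose(type: ConnectionType) {
  // Guard against a negative count if a close fires without a matching open
  if (this.activeConnections[type] > 0) {
    this.activeConnections[type]--;
  }
}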

Email Metrics Pattern

// Track email events
onEmailSent() { this.emailsSentToday++; }
onEmailReceived() { this.emailsReceivedToday++; }
onEmailFailed() { this.emailsFailedToday++; }
onEmailQueued() { this.queueSize++; }
onEmailDequeued() { this.queueSize--; }
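
The `...Today` counters above imply a reset at the day boundary, which the pattern does not show. One way to handle it, sketched here under assumptions, is a counter that checks the calendar day on every access. `DailyCounter` is a hypothetical name, and the day boundary is taken in UTC for simplicity; the injected clock exists only to make the rollover testable.

```typescript
// Hypothetical helper; resets itself when the UTC calendar day changes.
class DailyCounter {
  private count = 0;
  private day: string;
  private readonly now: () => Date;

  constructor(now: () => Date = () => new Date()) {
    this.now = now;
    this.day = DailyCounter.dayKey(now());
  }

  // 'YYYY-MM-DD' in UTC.
  private static dayKey(d: Date): string {
    return d.toISOString().slice(0, 10);
  }

  // Reset the count if the clock has moved to a different day.
  private rollover(): void {
    const today = DailyCounter.dayKey(this.now());
    if (today !== this.day) {
      this.day = today;
      this.count = 0;
    }
  }

  public increment(): void {
    this.rollover();
    this.count++;
  }

  public value(): number {
    this.rollover();
    return this.count;
  }
}
```

With this in place, `onEmailSent()` could call `increment()` on a `DailyCounter` instead of bumping a plain number, and the stats handler would read `value()`.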

Testing Strategy

  1. Unit Tests

    • Test MetricsManager initialization
    • Test metric collection accuracy
    • Test aggregation calculations
  2. Integration Tests

    • Test metrics flow from source to API
    • Verify real-time updates
    • Test under load conditions
  3. Debug Utilities

    • Create .nogit/debug/test-metrics.ts for quick testing
    • Add metrics dump endpoint for debugging

Migration Steps

  1. Implement MetricsManager without breaking existing code
  2. Wire up one metric type at a time
  3. Verify each metric shows real data
  4. Remove TODO comments from stats handler
  5. Update tests to expect real values

Success Criteria

  • All metrics show real, accurate data
  • No performance degradation
  • Metrics update in real-time
  • Historical data is collected
  • All TODO comments removed from stats handler
  • Tests pass with real metric values

Notes

  • SmartMetrics provides CPU and memory metrics out of the box
  • We'll need custom collectors for application-specific metrics
  • Consider adding metric persistence for historical data
  • Prometheus integration provides industry-standard monitoring

Questions to Address

  1. Should we persist metrics to disk for historical analysis?
  2. What time windows should we support (5min, 1hour, 1day)?
  3. Should we add alerting thresholds?
  4. Do we need custom metric types beyond the current interface?

This plan migrates dcrouter from demo metrics to real data in phases using @push.rocks/smartmetrics, while keeping the existing API structure unchanged.