Juergen Kunz 9bf15ff756 feat(metrics): add comprehensive metrics collection system
Implement real-time stats tracking including connection counts, request metrics, bandwidth usage, and route-specific monitoring. Adds MetricsCollector with observable streams for reactive monitoring integration.
2025-06-09 15:08:37 +00:00

SmartProxy Metrics Implementation Plan

This document outlines the plan for implementing comprehensive metrics tracking in SmartProxy.

Overview

The metrics system will provide real-time insights into proxy performance, connection statistics, and throughput data. The implementation will be efficient and thread-safe, and it will have minimal impact on proxy performance.

Key Design Decisions:

  1. On-demand computation: Instead of maintaining duplicate state, the MetricsCollector computes metrics on-demand from existing data structures.

  2. SmartProxy-centric architecture: MetricsCollector receives the SmartProxy instance, providing access to all components:

    • ConnectionManager for connection data
    • RouteManager for route metadata
    • Settings for configuration
    • Future components without API changes

This approach:

  • Eliminates synchronization issues
  • Reduces memory overhead
  • Simplifies the implementation
  • Guarantees metrics accuracy
  • Leverages existing battle-tested components
  • Provides flexibility for future enhancements

Metrics Interface

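interface IProxyStats {
  getActiveConnections(): number;
  getConnectionsByRoute(): Map<string, number>;
  getConnectionsByIP(): Map<string, number>;
  getTotalConnections(): number;
  getRequestsPerSecond(): number;
  getThroughput(): { bytesIn: number, bytesOut: number };
  // Helpers used by the HTTP endpoint and the usage examples later in this plan
  getThroughputRate(): { bytesInPerSec: number, bytesOutPerSec: number };
  getTopIPs(limit?: number): Array<{ ip: string, connections: number }>;
  isIPBlocked(ip: string, maxConnectionsPerIP: number): boolean;
}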

Implementation Plan

1. Create MetricsCollector Class

Location: /ts/proxies/smart-proxy/metrics-collector.ts

import type { SmartProxy } from './smart-proxy.js';

export class MetricsCollector implements IProxyStats {
  constructor(
    private smartProxy: SmartProxy
  ) {}
  
  // RPS tracking (the only state we need to maintain)
  private requestTimestamps: number[] = [];
  private readonly RPS_WINDOW_SIZE = 60000; // 1 minute window
  
  // All other metrics are computed on-demand from SmartProxy's components
}

2. Integration Points

Since metrics are computed on-demand from ConnectionManager's records, we only need minimal integration:

A. Request Tracking for RPS

File: /ts/proxies/smart-proxy/route-connection-handler.ts

// In handleNewConnection when a new connection is accepted
this.metricsCollector.recordRequest();

B. SmartProxy Component Access

Through the SmartProxy instance, MetricsCollector can access:

  • smartProxy.connectionManager - All active connections and their details
  • smartProxy.routeManager - Route configurations and metadata
  • smartProxy.settings - Configuration for thresholds and limits
  • smartProxy.servers - Server instances and port information
  • Any other components as needed for future metrics

No additional hooks are needed.

3. Metric Implementations

A. Active Connections

getActiveConnections(): number {
  return this.smartProxy.connectionManager.getConnectionCount();
}

B. Connections by Route

getConnectionsByRoute(): Map<string, number> {
  const routeCounts = new Map<string, number>();
  
  // Compute from active connections
  for (const [_, record] of this.smartProxy.connectionManager.getConnections()) {
    const routeName = record.routeName || 'unknown';
    const current = routeCounts.get(routeName) || 0;
    routeCounts.set(routeName, current + 1);
  }
  
  return routeCounts;
}

C. Connections by IP

getConnectionsByIP(): Map<string, number> {
  const ipCounts = new Map<string, number>();
  
  // Compute from active connections
  for (const [_, record] of this.smartProxy.connectionManager.getConnections()) {
    const ip = record.remoteIP;
    const current = ipCounts.get(ip) || 0;
    ipCounts.set(ip, current + 1);
  }
  
  return ipCounts;
}

// Additional helper methods for IP tracking
getTopIPs(limit: number = 10): Array<{ip: string, connections: number}> {
  const ipCounts = this.getConnectionsByIP();
  const sorted = Array.from(ipCounts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([ip, connections]) => ({ ip, connections }));
  
  return sorted;
}

isIPBlocked(ip: string, maxConnectionsPerIP: number): boolean {
  const ipCounts = this.getConnectionsByIP();
  const currentConnections = ipCounts.get(ip) || 0;
  return currentConnections >= maxConnectionsPerIP;
}
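
The route and IP counters above follow the same pattern: walk the active connection records and tally one key per record. As a minimal sketch, that pattern could be factored into a generic helper. The names `countBy` and `IConnectionLike` are hypothetical and not part of SmartProxy; the record fields simply mirror the ones used above.

```typescript
// Hypothetical minimal record shape; the real connection record has more fields.
interface IConnectionLike {
  remoteIP: string;
  routeName?: string;
}

// Tally records by an arbitrary string key derived from each record.
function countBy<T>(records: Iterable<T>, keyOf: (record: T) => string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const record of records) {
    const key = keyOf(record);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Sample data standing in for the values of connectionManager.getConnections()
const sampleRecords: IConnectionLike[] = [
  { remoteIP: '192.168.1.1', routeName: 'api' },
  { remoteIP: '192.168.1.1', routeName: 'web' },
  { remoteIP: '10.0.0.7' },
];

const byIP = countBy(sampleRecords, (r) => r.remoteIP);
const byRoute = countBy(sampleRecords, (r) => r.routeName || 'unknown');
```

With such a helper, both metric methods would reduce to a single call each, at the cost of one extra level of indirection.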

D. Total Connections

getTotalConnections(): number {
  // Get from termination stats
  const stats = this.smartProxy.connectionManager.getTerminationStats();
  let total = this.smartProxy.connectionManager.getConnectionCount(); // Add active connections
  
  // Add all terminated connections
  for (const reason in stats.incoming) {
    total += stats.incoming[reason];
  }
  
  return total;
}

E. Requests Per Second

getRequestsPerSecond(): number {
  const now = Date.now();
  const windowStart = now - this.RPS_WINDOW_SIZE;
  
  // Clean old timestamps
  this.requestTimestamps = this.requestTimestamps.filter(ts => ts > windowStart);
  
  // Calculate RPS based on window
  const requestsInWindow = this.requestTimestamps.length;
  return requestsInWindow / (this.RPS_WINDOW_SIZE / 1000);
}

recordRequest(): void {
  this.requestTimestamps.push(Date.now());
  
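  // Prevent unbounded growth.
  // Note: above roughly 167 requests/second sustained, the 60-second window itself
  // holds more than 10000 entries, so this cleanup then runs on every call.
  if (this.requestTimestamps.length > 10000) {
    this.cleanupOldRequests();
  }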
}
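
To make the sliding-window calculation easy to verify, the same logic can be sketched as a standalone class with an injectable clock. `SlidingWindowRate` and its `now` parameter are illustrative names assumed for this example; they are not part of the planned API.

```typescript
// Sliding-window request rate, mirroring recordRequest()/getRequestsPerSecond() above.
// The clock is injected so the window can be exercised without real waiting.
class SlidingWindowRate {
  private timestamps: number[] = [];

  constructor(
    private readonly windowMs: number = 60000,
    private readonly now: () => number = () => Date.now()
  ) {}

  record(): void {
    this.timestamps.push(this.now());
  }

  perSecond(): number {
    const windowStart = this.now() - this.windowMs;
    // Drop timestamps that have fallen out of the window
    this.timestamps = this.timestamps.filter((ts) => ts > windowStart);
    return this.timestamps.length / (this.windowMs / 1000);
  }
}

// Worked example with a fake clock: 120 requests, one every 100 ms
let fakeTime = 1000000;
const rate = new SlidingWindowRate(60000, () => fakeTime);
for (let i = 0; i < 120; i++) {
  rate.record();
  fakeTime += 100;
}
const rpsNow = rate.perSecond();   // 120 requests / 60 s window = 2
fakeTime += 60000;                 // one minute later every entry is outside the window
const rpsLater = rate.perSecond(); // 0
```

Note that the rate is averaged over the whole window: the 120 requests above arrive within 12 seconds, yet the reported value is 2 requests per second rather than 10.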

F. Throughput Tracking

getThroughput(): { bytesIn: number, bytesOut: number } {
  let bytesIn = 0;
  let bytesOut = 0;
  
  // Sum bytes from all active connections
  for (const [_, record] of this.smartProxy.connectionManager.getConnections()) {
    bytesIn += record.bytesReceived;
    bytesOut += record.bytesSent;
  }
  
  return { bytesIn, bytesOut };
}

// Get estimated throughput rate (bytes per second) for the last minute
getThroughputRate(): { bytesInPerSec: number, bytesOutPerSec: number } {
  const now = Date.now();
  let recentBytesIn = 0;
  let recentBytesOut = 0;
  
  // Estimate bytes transferred in the last minute from active connections
  for (const [_, record] of this.smartProxy.connectionManager.getConnections()) {
    const connectionAge = now - record.incomingStartTime;
    if (connectionAge < 60000) { // Connection started within the last minute
      recentBytesIn += record.bytesReceived;
      recentBytesOut += record.bytesSent;
    } else {
      // Older connections: assume a uniform transfer rate and
      // count only the share that falls into the last minute
      const ageInWindows = connectionAge / 60000;
      recentBytesIn += record.bytesReceived / ageInWindows;
      recentBytesOut += record.bytesSent / ageInWindows;
    }
  }
  
  return {
    bytesInPerSec: Math.round(recentBytesIn / 60),
    bytesOutPerSec: Math.round(recentBytesOut / 60)
  };
}
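
The estimate for long-lived connections is easiest to follow with numbers. The sketch below isolates the per-connection step into a hypothetical helper, `estimateRecentBytes` (a name assumed for this example), and applies the same uniform-rate assumption as the method above.

```typescript
// Estimate the bytes attributable to the last `windowMs` for one connection.
function estimateRecentBytes(totalBytes: number, connectionAgeMs: number, windowMs = 60000): number {
  if (connectionAgeMs <= windowMs) {
    return totalBytes; // the whole transfer happened inside the window
  }
  // Assume a uniform transfer rate over the connection's lifetime
  return totalBytes * (windowMs / connectionAgeMs);
}

// A 30-second-old connection that moved 6000 bytes counts in full.
const youngBytes = estimateRecentBytes(6000, 30000);
// A 2-minute-old connection that moved 12000 bytes counts half of them.
const oldBytes = estimateRecentBytes(12000, 120000);
// Combined rate over the 60-second window
const bytesPerSec = Math.round((youngBytes + oldBytes) / 60);
```

Here both connections contribute 6000 bytes, giving 200 bytes per second. The uniform-rate assumption will misjudge bursty long-lived connections, so the result is best read as an approximation.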

4. Performance Optimizations

Since metrics are computed on-demand from existing data structures, only a few performance optimizations are needed:

A. Caching for Frequent Queries

private cachedMetrics: {
  connectionsByRoute?: { timestamp: number; value: Map<string, number> };
  connectionsByIP?: { timestamp: number; value: Map<string, number> };
} = {};

private readonly CACHE_TTL = 1000; // 1 second cache

getConnectionsByRoute(): Map<string, number> {
  const now = Date.now();
  const cached = this.cachedMetrics.connectionsByRoute;
  
  // Return cached value if fresh
  if (cached && now - cached.timestamp < this.CACHE_TTL) {
    return cached.value;
  }
  
  // Compute fresh value
  const routeCounts = new Map<string, number>();
  for (const [_, record] of this.smartProxy.connectionManager.getConnections()) {
    const routeName = record.routeName || 'unknown';
    const current = routeCounts.get(routeName) || 0;
    routeCounts.set(routeName, current + 1);
  }
  
  // Cache with its own timestamp, so refreshing one metric
  // never makes another cached metric look fresh
  this.cachedMetrics.connectionsByRoute = { timestamp: now, value: routeCounts };
  return routeCounts;
}
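
Each cached metric needs its own freshness timestamp; with a single shared one, refreshing one metric would make every other cached metric look fresh. One way to keep that bookkeeping out of the metric methods is a small generic wrapper. The sketch below is an assumption for illustration: `TtlCache` is not an existing SmartProxy class, and the clock is injectable only to make the behavior testable.

```typescript
// Time-based cache: recompute a value only when the cached copy is older than ttlMs.
class TtlCache<T> {
  private entry?: { value: T; storedAt: number };

  constructor(
    private readonly compute: () => T,
    private readonly ttlMs: number = 1000,
    private readonly now: () => number = () => Date.now()
  ) {}

  get(): T {
    const t = this.now();
    if (this.entry && t - this.entry.storedAt < this.ttlMs) {
      return this.entry.value; // still fresh
    }
    const value = this.compute();
    this.entry = { value, storedAt: t };
    return value;
  }
}

// Worked example with a fake clock and a compute function that counts its calls
let clock = 0;
let computeCalls = 0;
const cachedCounter = new TtlCache(() => ++computeCalls, 1000, () => clock);

const first = cachedCounter.get();  // computes
clock = 500;
const second = cachedCounter.get(); // served from cache
clock = 1000;
const third = cachedCounter.get();  // TTL elapsed, recomputes
```

A collector could then hold one `TtlCache` per expensive metric, each wrapping the corresponding on-demand computation.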

B. RPS Cleanup

// Only cleanup needed is for RPS timestamps
private cleanupOldRequests(): void {
  const cutoff = Date.now() - this.RPS_WINDOW_SIZE;
  this.requestTimestamps = this.requestTimestamps.filter(ts => ts > cutoff);
}

5. SmartProxy Integration

A. Add to SmartProxy Class

export class SmartProxy {
  private metricsCollector: MetricsCollector;
  
  constructor(options: ISmartProxyOptions) {
    // ... existing code ...
    
    // Pass SmartProxy instance to MetricsCollector
    this.metricsCollector = new MetricsCollector(this);
  }
  
  // Public API
  public getStats(): IProxyStats {
    return this.metricsCollector;
  }
}

B. Configuration Options

interface ISmartProxyOptions {
  // ... existing options ...
  
  metrics?: {
    enabled?: boolean;           // Default: true
    rpsWindowSize?: number;      // Default: 60000 (1 minute)
    throughputWindowSize?: number; // Default: 60000 (1 minute)
    cleanupInterval?: number;    // Default: 60000 (1 minute)
  };
}
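
As a sketch of how the documented defaults might be applied, the options could be resolved once at construction time. `IMetricsConfig`, `METRICS_DEFAULTS` and `resolveMetricsConfig` are hypothetical names for this example; only the option names and default values come from the interface above.

```typescript
// Mirrors the shape of the `metrics` option above.
interface IMetricsConfig {
  enabled?: boolean;
  rpsWindowSize?: number;
  throughputWindowSize?: number;
  cleanupInterval?: number;
}

const METRICS_DEFAULTS: Required<IMetricsConfig> = {
  enabled: true,
  rpsWindowSize: 60000,
  throughputWindowSize: 60000,
  cleanupInterval: 60000,
};

// Fill in every missing (or explicitly undefined) option with its default.
function resolveMetricsConfig(user: IMetricsConfig = {}): Required<IMetricsConfig> {
  return {
    enabled: user.enabled ?? METRICS_DEFAULTS.enabled,
    rpsWindowSize: user.rpsWindowSize ?? METRICS_DEFAULTS.rpsWindowSize,
    throughputWindowSize: user.throughputWindowSize ?? METRICS_DEFAULTS.throughputWindowSize,
    cleanupInterval: user.cleanupInterval ?? METRICS_DEFAULTS.cleanupInterval,
  };
}

const defaultsOnly = resolveMetricsConfig();
const customWindow = resolveMetricsConfig({ rpsWindowSize: 10000, enabled: false });
```

Using `??` rather than `||` keeps an explicit `enabled: false` intact while still replacing `undefined` values.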

6. Advanced Metrics (Future Enhancement)

interface IAdvancedProxyStats extends IProxyStats {
  // Latency metrics
  getAverageLatency(): number;
  getLatencyPercentiles(): { p50: number, p95: number, p99: number };
  
  // Error metrics
  getErrorRate(): number;
  getErrorsByType(): Map<string, number>;
  
  // Route-specific metrics
  getRouteMetrics(routeName: string): IRouteMetrics;
  
  // Time-series data
  getHistoricalMetrics(duration: number): IHistoricalMetrics;
  
  // Server/Port metrics (leveraging SmartProxy access)
  getPortUtilization(): Map<number, { connections: number, maxConnections: number }>;
  getCertificateExpiry(): Map<string, Date>;
}

// Example implementation showing SmartProxy component access
getPortUtilization(): Map<number, { connections: number, maxConnections: number }> {
  const portStats = new Map<number, { connections: number, maxConnections: number }>();
  
  // Access servers through SmartProxy
  for (const [port, server] of this.smartProxy.servers) {
    const connections = Array.from(this.smartProxy.connectionManager.getConnections())
      .filter(([_, record]) => record.localPort === port).length;
    
    // Access route configuration through SmartProxy
    const routes = this.smartProxy.routeManager.getRoutesForPort(port);
    const maxConnections = routes[0]?.advanced?.maxConnections || 
                          this.smartProxy.settings.defaults?.security?.maxConnections || 
                          10000;
    
    portStats.set(port, { connections, maxConnections });
  }
  
  return portStats;
}

7. HTTP Metrics Endpoint (Optional)

import type { IncomingMessage, ServerResponse } from 'http';

// Expose metrics via HTTP endpoint
class MetricsHttpHandler {
  constructor(private proxy: SmartProxy) {}

  handleRequest(req: IncomingMessage, res: ServerResponse): void {
    if (req.url !== '/metrics') {
      // Always answer; otherwise requests for other paths would hang
      res.writeHead(404, { 'Content-Type': 'text/plain' });
      res.end('Not Found');
      return;
    }

    const stats = this.proxy.getStats();

    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({
      activeConnections: stats.getActiveConnections(),
      totalConnections: stats.getTotalConnections(),
      requestsPerSecond: stats.getRequestsPerSecond(),
      throughput: stats.getThroughput(),
      connectionsByRoute: Object.fromEntries(stats.getConnectionsByRoute()),
      connectionsByIP: Object.fromEntries(stats.getConnectionsByIP()),
      topIPs: stats.getTopIPs(20)
    }));
  }
}

8. Testing Strategy

The simplified design makes testing much easier since we can mock the ConnectionManager's data:

A. Unit Tests

// test/test.metrics-collector.ts
tap.test('MetricsCollector computes metrics correctly', async () => {
  // Mock ConnectionManager with test data
  const mockConnectionManager = {
    getConnectionCount: () => 2,
    getConnections: () => new Map([
      ['conn1', { remoteIP: '192.168.1.1', routeName: 'api', bytesReceived: 1000, bytesSent: 500 }],
      ['conn2', { remoteIP: '192.168.1.1', routeName: 'web', bytesReceived: 2000, bytesSent: 1000 }]
    ]),
    getTerminationStats: () => ({ incoming: { normal: 10, timeout: 2 } })
  };
  
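  // MetricsCollector expects a SmartProxy-like object, so wrap the mocked ConnectionManager
  const collector = new MetricsCollector({ connectionManager: mockConnectionManager } as any);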
  
  expect(collector.getActiveConnections()).toEqual(2);
  expect(collector.getConnectionsByIP().get('192.168.1.1')).toEqual(2);
  expect(collector.getTotalConnections()).toEqual(14); // 2 active + 12 terminated
});

B. Integration Tests

// test/test.metrics-integration.ts
tap.test('SmartProxy provides accurate metrics', async () => {
  const proxy = new SmartProxy({ /* config */ });
  await proxy.start();
  
  // Create connections and verify metrics
  const stats = proxy.getStats();
  expect(stats.getActiveConnections()).toEqual(0);
});

C. Performance Tests

// test/test.metrics-performance.ts
tap.test('Metrics collection has minimal performance impact', async () => {
  // Measure proxy performance with and without metrics
  // Ensure overhead is < 1%
});

9. Implementation Phases

Phase 1: Core Metrics (Days 1-2)

  • Create MetricsCollector class
  • Implement all metric methods (reading from ConnectionManager)
  • Add RPS tracking
  • Add to SmartProxy with getStats() method

Phase 2: Testing & Optimization (Days 3-4)

  • Add comprehensive unit tests with mocked data
  • Add integration tests with real proxy
  • Implement caching for performance
  • Add RPS cleanup mechanism

Phase 3: Advanced Features (Days 5-7)

  • Add HTTP metrics endpoint
  • Implement Prometheus export format
  • Add IP-based rate limiting helpers
  • Create monitoring dashboard example

Note: The simplified design is expected to reduce implementation time from 4 weeks to 1 week.

10. Usage Examples

// Basic usage
const proxy = new SmartProxy({
  routes: [...],
  metrics: { enabled: true }
});

await proxy.start();

// Get metrics
const stats = proxy.getStats();
console.log(`Active connections: ${stats.getActiveConnections()}`);
console.log(`RPS: ${stats.getRequestsPerSecond()}`);
console.log(`Throughput: ${JSON.stringify(stats.getThroughput())}`);

// Monitor specific routes
const routeConnections = stats.getConnectionsByRoute();
for (const [route, count] of routeConnections) {
  console.log(`Route ${route}: ${count} connections`);
}

// Monitor connections by IP
const ipConnections = stats.getConnectionsByIP();
for (const [ip, count] of ipConnections) {
  console.log(`IP ${ip}: ${count} connections`);
}

// Get top IPs by connection count
const topIPs = stats.getTopIPs(10);
console.log('Top 10 IPs:', topIPs);

// Check if IP should be rate limited
if (stats.isIPBlocked('192.168.1.100', 100)) {
  console.log('IP has too many connections');
}

11. Monitoring Integration

// Export to monitoring systems
class PrometheusExporter {
  export(stats: IProxyStats): string {
    // Built line by line so the output has no stray indentation
    // and ends with a line feed, as the text exposition format requires
    return [
      '# HELP smartproxy_active_connections Current number of active connections',
      '# TYPE smartproxy_active_connections gauge',
      `smartproxy_active_connections ${stats.getActiveConnections()}`,
      '',
      '# HELP smartproxy_total_connections Total connections since start',
      '# TYPE smartproxy_total_connections counter',
      `smartproxy_total_connections ${stats.getTotalConnections()}`,
      '',
      '# HELP smartproxy_requests_per_second Current requests per second',
      '# TYPE smartproxy_requests_per_second gauge',
      `smartproxy_requests_per_second ${stats.getRequestsPerSecond()}`
    ].join('\n') + '\n';
  }
}
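
Per-route counts map naturally onto a labeled Prometheus gauge. The following is a sketch only: the metric name `smartproxy_connections_by_route` and both function names are assumptions for this example. Label values are escaped as the text exposition format requires (backslash, double quote and line feed).

```typescript
// Escape a label value for the Prometheus text exposition format.
function escapeLabelValue(value: string): string {
  return value
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n');
}

// Render one gauge sample per route from a route -> count map.
function exportConnectionsByRoute(connectionsByRoute: Map<string, number>): string {
  const lines = [
    '# HELP smartproxy_connections_by_route Active connections per route',
    '# TYPE smartproxy_connections_by_route gauge',
  ];
  for (const [route, count] of connectionsByRoute) {
    lines.push(`smartproxy_connections_by_route{route="${escapeLabelValue(route)}"} ${count}`);
  }
  return lines.join('\n') + '\n';
}

// Sample data standing in for stats.getConnectionsByRoute()
const routeSample = exportConnectionsByRoute(
  new Map<string, number>([['api', 3], ['we"b', 1]])
);
```

Exporting per-IP counts the same way is possible but would create one time series per client address, which is usually too high a label cardinality for Prometheus.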

12. Documentation

  • Add metrics section to main README
  • Create metrics API documentation
  • Add monitoring setup guide
  • Provide dashboard configuration examples

Success Criteria

  1. Performance: Metrics collection adds < 1% overhead
  2. Accuracy: All metrics are accurate within 1% margin
  3. Memory: No memory leaks over 24-hour operation
  4. Thread Safety: No race conditions under high load
  5. Usability: Simple, intuitive API for accessing metrics

Privacy and Security Considerations

IP Address Tracking

  1. Privacy Compliance:

    • Consider GDPR and other privacy regulations when storing IP addresses
    • Implement configurable IP anonymization (e.g., mask last octet)
    • Add option to disable IP tracking entirely

  2. Security:

    • Use IP metrics for rate limiting and DDoS protection
    • Implement automatic blocking for IPs exceeding connection limits
    • Consider integration with IP reputation services

  3. Implementation Options:

interface IMetricsOptions {
  trackIPs?: boolean;              // Default: true
  anonymizeIPs?: boolean;          // Default: false
  maxConnectionsPerIP?: number;    // Default: 100
  ipBlockDuration?: number;        // Default: 3600000 (1 hour)
}
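
The "mask last octet" anonymization mentioned above could look roughly like the sketch below. `anonymizeIP` is a hypothetical helper, not an existing SmartProxy function. It handles plain IPv4 and IPv4-mapped IPv6 addresses (the `::ffff:a.b.c.d` form) and returns anything else unchanged, so real IPv6 addresses would still need their own masking rule.

```typescript
// Replace the last octet of an IPv4 address with 0.
// Assumption: non-IPv4 input is passed through untouched in this sketch.
function anonymizeIP(ip: string): string {
  const mappedPrefix = '::ffff:';
  const prefix = ip.toLowerCase().startsWith(mappedPrefix)
    ? ip.slice(0, mappedPrefix.length)
    : '';
  const parts = ip.slice(prefix.length).split('.');

  // Accept only four decimal octets in the range 0-255
  const isIPv4 =
    parts.length === 4 &&
    parts.every((part) => /^\d{1,3}$/.test(part) && Number(part) <= 255);
  if (!isIPv4) {
    return ip;
  }
  return `${prefix}${parts[0]}.${parts[1]}.${parts[2]}.0`;
}

const maskedV4 = anonymizeIP('192.168.1.100');
const maskedMapped = anonymizeIP('::ffff:10.0.0.7');
const untouchedV6 = anonymizeIP('2001:db8::1');
```

If `anonymizeIPs` were enabled, `getConnectionsByIP()` could apply such a function to each key before counting, which also merges all clients of a /24 network into one bucket.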

Future Enhancements

  1. Distributed Metrics: Aggregate metrics across multiple proxy instances
  2. Historical Storage: Store metrics in time-series database
  3. Alerting: Built-in alerting based on metric thresholds
  4. Custom Metrics: Allow users to define custom metrics
  5. GraphQL API: Provide GraphQL endpoint for flexible metric queries
  6. IP Analytics:
    • Geographic distribution of connections
    • Automatic anomaly detection for IP patterns
    • Integration with threat intelligence feeds

Benefits of the Simplified Design

By using a SmartProxy-centric architecture with on-demand computation:

  1. Zero Synchronization Issues: Metrics always reflect the true state
  2. Minimal Memory Overhead: No duplicate data structures
  3. Simpler Implementation: ~200 lines instead of ~1000 lines
  4. Easier Testing: Can mock SmartProxy components
  5. Better Performance: No overhead from state updates
  6. Guaranteed Accuracy: Single source of truth
  7. Faster Development: 1 week instead of 4 weeks
  8. Future Flexibility: Access to all SmartProxy components without API changes
  9. Holistic Metrics: Can correlate data across components (connections, routes, settings, certificates, etc.)
  10. Clean Architecture: MetricsCollector is a true SmartProxy component, not an isolated module

This approach leverages the existing, well-tested SmartProxy infrastructure while providing a clean, simple metrics API that can grow with the proxy's capabilities.