add plan for better cert provisioning

Juergen Kunz
2025-07-12 21:58:46 +00:00
parent f82d44164c
commit 5d206b9800

@@ -1,53 +1,281 @@
# SmartProxy Connection Limiting Improvements Plan
# SmartProxy Implementation Plan
Command to re-read CLAUDE.md: `cat /home/philkunz/.claude/CLAUDE.md`
## Feature: Custom Certificate Provision Function
## Issues Identified
### Summary
This plan implements the `certProvisionFunction` feature that allows users to provide their own certificate generation logic. The function can either return a custom certificate or delegate back to Let's Encrypt by returning 'http01'.
1. **HttpProxy Bypass**: Connections forwarded to HttpProxy for TLS termination only check global limits, not per-IP limits
2. **Missing Route-Level Connection Enforcement**: Routes can define `security.maxConnections` but it's never enforced
3. **Cleanup Queue Race Condition**: New connections can be added to cleanup queue while processing
4. **IP Tracking Memory Optimization**: IP entries remain in map even without active connections
### Key Changes
1. Add `certProvisionFunction` support to CertificateManager
2. Modify `provisionAcmeCertificate()` to check custom function first
3. Add certificate expiry parsing for custom certificates
4. Support both initial provisioning and renewal
5. Add fallback configuration option
## Implementation Steps
### Overview
Implement the `certProvisionFunction` callback that's defined in the interface but currently not implemented. This will allow users to provide custom certificate generation logic while maintaining backward compatibility with the existing Let's Encrypt integration.
### 1. Fix HttpProxy Per-IP Validation ✓
- [x] Pass IP information to HttpProxy when forwarding connections
- [x] Add per-IP validation in HttpProxy connection handler
- [x] Ensure connection tracking is consistent between SmartProxy and HttpProxy
### Requirements
1. The function should be called for any new certificate provisioning or renewal
2. Must support returning custom certificates or falling back to Let's Encrypt
3. Should integrate seamlessly with the existing certificate lifecycle
4. Must maintain backward compatibility
### 2. Implement Route-Level Connection Limits ✓
- [x] Add connection count tracking per route in ConnectionManager
- [x] Update SharedSecurityManager.isAllowed() to check route-specific maxConnections
- [x] Add route connection limit validation in route-connection-handler.ts
### Implementation Steps
### 3. Fix Cleanup Queue Race Condition
- [x] Implement proper queue snapshotting before processing
- [x] Ensure new connections added during processing aren't missed
- [x] Add proper synchronization for cleanup operations
#### 1. Update Certificate Manager to Support Custom Provision Function
**File**: `ts/proxies/smart-proxy/certificate-manager.ts`
### 4. Optimize IP Tracking Memory Usage ✓
- [x] Add periodic cleanup for IPs with no active connections
- [x] Implement expiry for rate limit timestamps
- [x] Add memory-efficient data structures for IP tracking
- [ ] Add `certProvisionFunction` property to CertificateManager class
- [ ] Pass the function from SmartProxy options during initialization
- [ ] Modify `provisionAcmeCertificate()` (called from `provisionCertificate()`) to check for the custom function first
### 5. Add Comprehensive Tests ✓
- [x] Test per-IP limits with HttpProxy forwarding
- [x] Test route-level connection limits
- [x] Test cleanup queue edge cases
- [x] Test memory usage with many unique IPs
#### 2. Implement Custom Certificate Provisioning Logic
**Location**: Modify `provisionAcmeCertificate()` method
### 6. Log Deduplication for High-Volume Scenarios ✓
- [x] Implement LogDeduplicator utility for batching similar events
- [x] Add deduplication for connection rejections, terminations, and cleanups
- [x] Include rejection reasons in IP rejection summaries
- [x] Provide aggregated summaries with meaningful context
```typescript
private async provisionAcmeCertificate(
  route: IRouteConfig,
  domains: string[]
): Promise<void> {
  const primaryDomain = domains[0];
  const routeName = route.name || primaryDomain;
  // Check for custom provision function first
  if (this.certProvisionFunction) {
    try {
      logger.log('info', `Attempting custom certificate provision for ${primaryDomain}`, { domain: primaryDomain });
      const result = await this.certProvisionFunction(primaryDomain);
      if (result === 'http01') {
        logger.log('info', `Custom function returned 'http01', falling back to Let's Encrypt for ${primaryDomain}`);
        // Continue with existing ACME logic below
      } else {
        // Use custom certificate
        const customCert = result as plugins.tsclass.network.ICert;
        // Convert to internal certificate format
        const certData: ICertificateData = {
          cert: customCert.cert,
          key: customCert.key,
          ca: customCert.ca || '',
          issueDate: new Date(),
          expiryDate: this.extractExpiryDate(customCert.cert)
        };
        // Store and apply certificate
        await this.certStore.saveCertificate(routeName, certData);
        await this.applyCertificate(primaryDomain, certData);
        this.updateCertStatus(routeName, 'valid', 'custom', certData);
        logger.log('info', `Custom certificate applied for ${primaryDomain}`, {
          domain: primaryDomain,
          expiryDate: certData.expiryDate
        });
        return;
      }
    } catch (error) {
      // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
      const message = error instanceof Error ? error.message : String(error);
      logger.log('error', `Custom cert provision failed for ${primaryDomain}: ${message}`, {
        domain: primaryDomain,
        error: message
      });
      // Configuration option to control fallback behavior
      if (this.smartProxy.settings.certProvisionFallbackToAcme !== false) {
        logger.log('info', `Falling back to Let's Encrypt for ${primaryDomain}`);
      } else {
        throw error;
      }
    }
  }
  // Existing Let's Encrypt logic continues here...
  if (!this.smartAcme) {
    throw new Error('SmartAcme not initialized...');
  }
  // ... rest of existing code
}
```
## Notes
#### 3. Add Helper Method for Certificate Expiry Extraction
**New method**: `extractExpiryDate()`
- All connection limiting is now consistent across SmartProxy and HttpProxy
- Route-level limits provide additional granular control
- Memory usage is optimized for high-traffic scenarios
- Comprehensive test coverage ensures reliability
- Log deduplication reduces spam during attacks or high-traffic periods
- IP rejection summaries now include rejection reasons in main message
- [ ] Parse PEM certificate to extract expiry date
- [ ] Use existing certificate parsing utilities
- [ ] Handle parse errors gracefully
```typescript
private extractExpiryDate(certPem: string): Date {
  try {
    // Use forge (node-forge) or a similar library to parse the certificate;
    // `forge` is assumed to be imported elsewhere in this module
    const cert = forge.pki.certificateFromPem(certPem);
    return cert.validity.notAfter;
  } catch (error) {
    // Default to 90 days from now if parsing fails
    return new Date(Date.now() + 90 * 24 * 60 * 60 * 1000);
  }
}
```
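If adding forge as a dependency is undesirable, Node's built-in `crypto.X509Certificate` class (available since Node 15.6) can read the expiry as well. The sketch below is a standalone variant of the helper, not the planned implementation: the name `extractExpiryDateNative`, the injectable `now` parameter, and the `DEFAULT_LIFETIME_MS` constant are illustrative assumptions.

```typescript
import { X509Certificate } from 'node:crypto';

// Assumed default lifetime when the PEM cannot be parsed (mirrors the 90-day default in the plan).
const DEFAULT_LIFETIME_MS = 90 * 24 * 60 * 60 * 1000;

// Hypothetical standalone variant of extractExpiryDate() that relies only on Node built-ins.
function extractExpiryDateNative(certPem: string, now: Date = new Date()): Date {
  try {
    // The constructor throws if the input is not a valid PEM or DER certificate.
    const x509 = new X509Certificate(certPem);
    // validTo is exposed as a date string, so convert it and reject unparseable values.
    const expiry = new Date(x509.validTo);
    if (Number.isNaN(expiry.getTime())) {
      throw new Error(`Unparseable validTo value: ${x509.validTo}`);
    }
    return expiry;
  } catch {
    // Parsing failed: fall back to the assumed default lifetime.
    return new Date(now.getTime() + DEFAULT_LIFETIME_MS);
  }
}
```

Either variant silently substitutes a default when parsing fails, which can mask a malformed certificate; logging the parse error before returning the default may be worth considering.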
#### 4. Update SmartProxy Initialization
**File**: `ts/proxies/smart-proxy/index.ts`
- [ ] Pass `certProvisionFunction` from options to CertificateManager
- [ ] Validate function if provided
#### 5. Add Type Safety and Validation
**Tasks**:
- [ ] Validate returned certificate has required fields (cert, key, ca)
- [ ] Check certificate validity dates
- [ ] Ensure certificate matches requested domain
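The three checks above could be sketched as a pure function that returns a list of problems. This is a minimal illustration under stated assumptions: `ICertLike`, `IParsedCertInfo`, `nameMatchesDomain`, and `validateCustomCert` are hypothetical names, the validity window and subject alternative names are assumed to come from whichever certificate parser the project uses, and `ca` is treated as optional here.

```typescript
// Hypothetical shapes; the real ICert type lives in the project's type packages.
interface ICertLike { cert: string; key: string; ca?: string; }
interface IParsedCertInfo { notBefore: Date; notAfter: Date; names: string[]; }

// Returns true when a certificate name (exact or "*.example.com") covers the domain.
function nameMatchesDomain(name: string, domain: string): boolean {
  const pattern = name.toLowerCase();
  const target = domain.toLowerCase();
  if (!pattern.startsWith('*.')) {
    return pattern === target;
  }
  // A wildcard covers exactly one additional left-most label.
  const suffix = pattern.slice(1); // e.g. ".example.com"
  if (!target.endsWith(suffix) || target.length <= suffix.length) {
    return false;
  }
  return !target.slice(0, -suffix.length).includes('.');
}

// Collects every problem instead of throwing on the first, so logs show the full picture.
function validateCustomCert(
  cert: Partial<ICertLike>,
  info: IParsedCertInfo,
  domain: string,
  now: Date = new Date()
): string[] {
  const problems: string[] = [];
  if (typeof cert.cert !== 'string' || !cert.cert.includes('-----BEGIN CERTIFICATE-----')) {
    problems.push('cert is missing or is not a PEM certificate');
  }
  if (typeof cert.key !== 'string' || !cert.key.includes('PRIVATE KEY-----')) {
    problems.push('key is missing or is not a PEM private key');
  }
  if (now.getTime() < info.notBefore.getTime()) {
    problems.push('certificate is not yet valid');
  }
  if (now.getTime() >= info.notAfter.getTime()) {
    problems.push('certificate has expired');
  }
  if (!info.names.some((name) => nameMatchesDomain(name, domain))) {
    problems.push(`certificate does not cover ${domain}`);
  }
  return problems;
}
```

An empty array means the certificate passed all checks; a non-empty array could be logged and then handled like any other custom provisioning failure.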
#### 6. Update Certificate Renewal Logic
**Location**: `checkAndRenewCertificates()`
- [ ] Ensure renewal checks work for both ACME and custom certificates
- [ ] Custom certificates should go through the same `provisionAcmeCertificate()` path
- [ ] The existing renewal logic already calls `provisionCertificate()` which will use our modified flow
```typescript
// No changes needed here - the existing renewal logic will automatically
// use the custom provision function when calling provisionCertificate()
private async checkAndRenewCertificates(): Promise<void> {
  // Existing code already handles this correctly
  // (`routes`, `cert`, and `renewThreshold` come from the existing method body, elided here)
  for (const route of routes) {
    if (this.shouldRenewCertificate(cert, renewThreshold)) {
      // This will call provisionCertificate -> provisionAcmeCertificate
      // which now includes our custom function check
      await this.provisionCertificate(route);
    }
  }
}
```
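The renewal decision itself does not depend on where a certificate came from: it reduces to comparing the stored expiry date against a threshold. The sketch below assumes the threshold is expressed in days; the name `shouldRenew` and its signature are illustrative and may not match the existing `shouldRenewCertificate()`.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: renew once the remaining lifetime drops to the threshold or below.
// Works the same for ACME and custom certificates because it only needs the expiry date.
function shouldRenew(
  expiryDate: Date,
  renewThresholdDays: number,
  now: Date = new Date()
): boolean {
  const msRemaining = expiryDate.getTime() - now.getTime();
  return msRemaining <= renewThresholdDays * MS_PER_DAY;
}
```

Because custom certificates store an `expiryDate` just like ACME ones, a check of this shape would pick them up in the same renewal pass.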
#### 7. Add Integration Tests
**File**: `test/test.certificate-provision.ts`
- [ ] Test custom certificate provision
- [ ] Test fallback to Let's Encrypt ('http01' return)
- [ ] Test error handling
- [ ] Test renewal with custom function
#### 8. Update Documentation
**Files**:
- [ ] Update interface documentation
- [ ] Add examples to README
- [ ] Document ICert structure requirements
### API Design
```typescript
// Example usage
const proxy = new SmartProxy({
  certProvisionFunction: async (domain: string) => {
    // Option 1: Return a custom certificate for selected domains
    if (domain.endsWith('.internal')) {
      const customCert = await myCustomCA.generateCert(domain);
      return {
        cert: customCert.certificate,
        key: customCert.privateKey,
        ca: customCert.chain
      };
    }
    // Option 2: Delegate every other domain to Let's Encrypt
    return 'http01';
  },
  certProvisionFallbackToAcme: true, // Default: true
  routes: [...]
});
```
### Configuration Options to Add
```typescript
interface ISmartProxyOptions {
  // Existing options...
  // Custom certificate provision function
  certProvisionFunction?: (domain: string) => Promise<TSmartProxyCertProvisionObject>;
  // Whether to fall back to ACME if custom provision fails
  certProvisionFallbackToAcme?: boolean; // Default: true
}
```
### Error Handling Strategy
1. **Custom Function Errors**:
   - Log detailed error with domain context
   - Option A: Fall back to Let's Encrypt (safer)
   - Option B: Fail certificate provisioning (stricter)
   - Configurable via `certProvisionFallbackToAcme` (default: `true`, i.e. Option A)
2. **Invalid Certificate Returns**:
   - Validate certificate structure
   - Check expiry dates
   - Verify domain match
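The error-handling strategy can be illustrated in isolation from the certificate manager. The sketch below is a simplified stand-in, not the planned method: `ICertLike`, `TProvisionFunction`, `TDecision`, and `decideCertSource` are hypothetical names, and `console.error` stands in for the project's logger.

```typescript
// Hypothetical shapes used only for this illustration.
interface ICertLike { cert: string; key: string; ca?: string; }
type TProvisionResult = ICertLike | 'http01';
type TProvisionFunction = (domain: string) => Promise<TProvisionResult>;
type TDecision =
  | { source: 'custom'; cert: ICertLike }
  | { source: 'acme'; reason: 'no-function' | 'delegated' | 'fallback' };

// Decides where the certificate for a domain should come from.
async function decideCertSource(
  domain: string,
  provision: TProvisionFunction | undefined,
  fallbackToAcme: boolean = true
): Promise<TDecision> {
  if (!provision) {
    return { source: 'acme', reason: 'no-function' };
  }
  try {
    const result = await provision(domain);
    if (result === 'http01') {
      // The custom function explicitly delegated to Let's Encrypt.
      return { source: 'acme', reason: 'delegated' };
    }
    return { source: 'custom', cert: result };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    console.error(`Custom cert provision failed for ${domain}: ${message}`);
    if (fallbackToAcme) {
      // Option A: fall back to Let's Encrypt.
      return { source: 'acme', reason: 'fallback' };
    }
    // Option B: fail certificate provisioning.
    throw error;
  }
}
```

Returning a tagged decision rather than provisioning directly keeps the branch logic easy to unit test; the `reason` field is an illustrative extra that could feed log messages.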
### Testing Plan
1. **Unit Tests**:
   - Mock certProvisionFunction returns
   - Test validation logic
   - Test error scenarios
2. **Integration Tests**:
   - Real certificate generation
   - Renewal cycle testing
   - Mixed custom/Let's Encrypt scenarios
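For the unit tests, the provision function could be stubbed with a small recording mock. The sketch below is framework-agnostic and self-contained; `createMockProvision`, `ICertLike`, and the placeholder certificate strings are illustrative assumptions rather than existing test helpers.

```typescript
// Hypothetical shapes used only for this illustration.
interface ICertLike { cert: string; key: string; ca?: string; }
type TProvisionResult = ICertLike | 'http01';

// Builds a mock provision function that records every requested domain.
// Domains listed in customDomains get a placeholder certificate; all others delegate to ACME.
function createMockProvision(customDomains: string[]) {
  const calls: string[] = [];
  const fn = async (domain: string): Promise<TProvisionResult> => {
    calls.push(domain);
    if (customDomains.includes(domain)) {
      return { cert: `CERT for ${domain}`, key: `KEY for ${domain}` };
    }
    return 'http01';
  };
  return { fn, calls };
}
```

A test would pass `fn` as `certProvisionFunction` and then assert on `calls` to confirm the function was invoked for initial provisioning and again on renewal.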
### Backward Compatibility
- If no `certProvisionFunction` provided, behavior unchanged
- Existing routes with 'auto' certificates continue using Let's Encrypt
- No breaking changes to existing API
### Future Enhancements
1. **Per-Route Custom Functions**:
   - Allow different provision functions per route
   - Override global function at route level
2. **Certificate Events**:
   - Emit events for custom cert provisioning
   - Allow monitoring/logging hooks
3. **Async Certificate Updates**:
   - Support updating certificates outside renewal cycle
   - Hot-reload certificates without restart
### Implementation Notes
1. **Certificate Status Tracking**:
   - The `updateCertStatus()` method needs to support a new type: 'custom'
   - Current types are 'acme' and 'static'
   - This helps distinguish custom certificates in monitoring/logs
2. **Certificate Store Integration**:
   - Custom certificates are stored the same way as ACME certificates
   - They participate in the same renewal cycle
   - The store handles persistence across restarts
3. **Existing Methods to Reuse**:
   - `applyCertificate()` - Already handles applying certs to routes
   - `isCertificateValid()` - Can validate custom certificates
   - `certStore.saveCertificate()` - Handles storage
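Extending the status type could look like the sketch below. The type name `TCertSource` and the helper `describeCertSource` are hypothetical; only the three string values come from the notes above. The exhaustive `switch` makes the compiler flag any code path that has not been updated for the new 'custom' value.

```typescript
// 'acme' and 'static' are the existing sources; 'custom' is the proposed addition.
type TCertSource = 'acme' | 'static' | 'custom';

// Hypothetical helper producing a label for monitoring or log output.
function describeCertSource(source: TCertSource): string {
  switch (source) {
    case 'acme':
      return "Let's Encrypt (ACME)";
    case 'static':
      return 'static certificate';
    case 'custom':
      return 'custom provision function';
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a source is unhandled.
      const unhandled: never = source;
      throw new Error(`Unhandled certificate source: ${String(unhandled)}`);
    }
  }
}
```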
### Implementation Priority
1. Core functionality (steps 1-3)
2. Type safety and validation (step 5)
3. Renewal support (step 6)
4. Tests (step 7)
5. Documentation (step 8)
### Estimated Effort
- Core implementation: 4-6 hours
- Testing: 2-3 hours
- Documentation: 1 hour
- Total: ~7-10 hours