add plan for better cert provisioning
310
readme.plan.md
@@ -1,53 +1,281 @@
|
|||||||
# SmartProxy Connection Limiting Improvements Plan
|
# SmartProxy Implementation Plan
|
||||||
|
|
||||||
Command to re-read CLAUDE.md: `cat /home/philkunz/.claude/CLAUDE.md`
|
## Feature: Custom Certificate Provision Function
|
||||||
|
|
||||||
## Issues Identified
|
### Summary
|
||||||
|
This plan implements the `certProvisionFunction` feature that allows users to provide their own certificate generation logic. The function can either return a custom certificate or delegate back to Let's Encrypt by returning 'http01'.
|
||||||
|
|
||||||
1. **HttpProxy Bypass**: Connections forwarded to HttpProxy for TLS termination only check global limits, not per-IP limits
|
### Key Changes
|
||||||
2. **Missing Route-Level Connection Enforcement**: Routes can define `security.maxConnections` but it's never enforced
|
1. Add `certProvisionFunction` support to CertificateManager
|
||||||
3. **Cleanup Queue Race Condition**: New connections can be added to cleanup queue while processing
|
2. Modify `provisionAcmeCertificate()` to check custom function first
|
||||||
4. **IP Tracking Memory Optimization**: IP entries remain in map even without active connections
|
3. Add certificate expiry parsing for custom certificates
|
||||||
|
4. Support both initial provisioning and renewal
|
||||||
|
5. Add fallback configuration option
|
||||||
|
|
||||||
## Implementation Steps
|
### Overview
|
||||||
|
Implement the `certProvisionFunction` callback that's defined in the interface but currently not implemented. This will allow users to provide custom certificate generation logic while maintaining backward compatibility with the existing Let's Encrypt integration.
|
||||||
|
|
||||||
### 1. Fix HttpProxy Per-IP Validation ✓
|
### Requirements
|
||||||
- [x] Pass IP information to HttpProxy when forwarding connections
|
1. The function should be called for any new certificate provisioning or renewal
|
||||||
- [x] Add per-IP validation in HttpProxy connection handler
|
2. Must support returning custom certificates or falling back to Let's Encrypt
|
||||||
- [x] Ensure connection tracking is consistent between SmartProxy and HttpProxy
|
3. Should integrate seamlessly with the existing certificate lifecycle
|
||||||
|
4. Must maintain backward compatibility
|
||||||
|
|
||||||
### 2. Implement Route-Level Connection Limits ✓
|
### Implementation Steps
|
||||||
- [x] Add connection count tracking per route in ConnectionManager
|
|
||||||
- [x] Update SharedSecurityManager.isAllowed() to check route-specific maxConnections
|
|
||||||
- [x] Add route connection limit validation in route-connection-handler.ts
|
|
||||||
|
|
||||||
### 3. Fix Cleanup Queue Race Condition ✓
|
#### 1. Update Certificate Manager to Support Custom Provision Function
|
||||||
- [x] Implement proper queue snapshotting before processing
|
**File**: `ts/proxies/smart-proxy/certificate-manager.ts`
|
||||||
- [x] Ensure new connections added during processing aren't missed
|
|
||||||
- [x] Add proper synchronization for cleanup operations
|
|
||||||
|
|
||||||
### 4. Optimize IP Tracking Memory Usage ✓
|
- [ ] Add `certProvisionFunction` property to CertificateManager class
|
||||||
- [x] Add periodic cleanup for IPs with no active connections
|
- [ ] Pass the function from SmartProxy options during initialization
|
||||||
- [x] Implement expiry for rate limit timestamps
|
- [ ] Modify `provisionAcmeCertificate()` (called by `provisionCertificate()`) to check for the custom function first
|
||||||
- [x] Add memory-efficient data structures for IP tracking
|
|
||||||
|
|
||||||
### 5. Add Comprehensive Tests ✓
|
#### 2. Implement Custom Certificate Provisioning Logic
|
||||||
- [x] Test per-IP limits with HttpProxy forwarding
|
**Location**: Modify `provisionAcmeCertificate()` method
|
||||||
- [x] Test route-level connection limits
|
|
||||||
- [x] Test cleanup queue edge cases
|
|
||||||
- [x] Test memory usage with many unique IPs
|
|
||||||
|
|
||||||
### 6. Log Deduplication for High-Volume Scenarios ✓
|
```typescript
|
||||||
- [x] Implement LogDeduplicator utility for batching similar events
|
private async provisionAcmeCertificate(
|
||||||
- [x] Add deduplication for connection rejections, terminations, and cleanups
|
route: IRouteConfig,
|
||||||
- [x] Include rejection reasons in IP rejection summaries
|
domains: string[]
|
||||||
- [x] Provide aggregated summaries with meaningful context
|
): Promise<void> {
|
||||||
|
const primaryDomain = domains[0];
|
||||||
|
const routeName = route.name || primaryDomain;
|
||||||
|
|
||||||
|
// Check for custom provision function first
|
||||||
|
if (this.certProvisionFunction) {
|
||||||
|
try {
|
||||||
|
logger.log('info', `Attempting custom certificate provision for ${primaryDomain}`, { domain: primaryDomain });
|
||||||
|
const result = await this.certProvisionFunction(primaryDomain);
|
||||||
|
|
||||||
|
if (result === 'http01') {
|
||||||
|
logger.log('info', `Custom function returned 'http01', falling back to Let's Encrypt for ${primaryDomain}`);
|
||||||
|
// Continue with existing ACME logic below
|
||||||
|
} else {
|
||||||
|
// Use custom certificate
|
||||||
|
const customCert = result as plugins.tsclass.network.ICert;
|
||||||
|
|
||||||
|
// Convert to internal certificate format
|
||||||
|
const certData: ICertificateData = {
|
||||||
|
cert: customCert.cert,
|
||||||
|
key: customCert.key,
|
||||||
|
ca: customCert.ca || '',
|
||||||
|
issueDate: new Date(),
|
||||||
|
expiryDate: this.extractExpiryDate(customCert.cert)
|
||||||
|
};
|
||||||
|
|
||||||
|
// Store and apply certificate
|
||||||
|
await this.certStore.saveCertificate(routeName, certData);
|
||||||
|
await this.applyCertificate(primaryDomain, certData);
|
||||||
|
this.updateCertStatus(routeName, 'valid', 'custom', certData);
|
||||||
|
|
||||||
|
logger.log('info', `Custom certificate applied for ${primaryDomain}`, {
|
||||||
|
domain: primaryDomain,
|
||||||
|
expiryDate: certData.expiryDate
|
||||||
|
});
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
} catch (error) {
|
||||||
|
logger.log('error', `Custom cert provision failed for ${primaryDomain}: ${(error as Error).message}`, {
|
||||||
|
domain: primaryDomain,
|
||||||
|
error: (error as Error).message
|
||||||
|
});
|
||||||
|
// Configuration option to control fallback behavior
|
||||||
|
if (this.smartProxy.settings.certProvisionFallbackToAcme !== false) {
|
||||||
|
logger.log('info', `Falling back to Let's Encrypt for ${primaryDomain}`);
|
||||||
|
} else {
|
||||||
|
throw error;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Existing Let's Encrypt logic continues here...
|
||||||
|
if (!this.smartAcme) {
|
||||||
|
throw new Error('SmartAcme not initialized...');
|
||||||
|
}
|
||||||
|
// ... rest of existing code
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
## Notes
|
#### 3. Add Helper Method for Certificate Expiry Extraction
|
||||||
|
**New method**: `extractExpiryDate()`
|
||||||
|
|
||||||
- All connection limiting is now consistent across SmartProxy and HttpProxy
|
- [ ] Parse PEM certificate to extract expiry date
|
||||||
- Route-level limits provide additional granular control
|
- [ ] Use existing certificate parsing utilities
|
||||||
- Memory usage is optimized for high-traffic scenarios
|
- [ ] Handle parse errors gracefully
|
||||||
- Comprehensive test coverage ensures reliability
|
|
||||||
- Log deduplication reduces spam during attacks or high-traffic periods
|
```typescript
|
||||||
- IP rejection summaries now include rejection reasons in main message
|
private extractExpiryDate(certPem: string): Date {
|
||||||
|
try {
|
||||||
|
// Use forge or similar library to parse certificate
|
||||||
|
const cert = forge.pki.certificateFromPem(certPem);
|
||||||
|
return cert.validity.notAfter;
|
||||||
|
} catch (error) {
|
||||||
|
// Default to 90 days if parsing fails
|
||||||
|
return new Date(Date.now() + 90 * 24 * 60 * 60 * 1000);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
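The expiry extraction in step 3 could also be done without a third-party parser. The following is a minimal sketch that uses Node's built-in `crypto.X509Certificate` (available since Node 15.6), assuming the leaf certificate comes first in the PEM string. The names `extractExpiryDateBuiltin` and `FALLBACK_VALIDITY_MS` are illustrative and not part of the codebase.

```typescript
import { X509Certificate } from 'node:crypto';

// Assumption: mirror the plan's 90-day default when the PEM cannot be parsed.
const FALLBACK_VALIDITY_MS = 90 * 24 * 60 * 60 * 1000;

export function extractExpiryDateBuiltin(certPem: string): Date {
  try {
    // Throws if the input is not a parseable certificate.
    const x509 = new X509Certificate(certPem);
    // validTo is a date string, e.g. "Sep  3 21:40:37 2025 GMT".
    const expiry = new Date(x509.validTo);
    if (Number.isNaN(expiry.getTime())) {
      throw new Error(`Unparseable validTo value: ${x509.validTo}`);
    }
    return expiry;
  } catch {
    // Same behavior as the plan: assume 90 days from now on parse errors.
    return new Date(Date.now() + FALLBACK_VALIDITY_MS);
  }
}
```

A silent 90-day default can hide a malformed certificate, so it may be worth logging a warning in the `catch` branch before returning.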
|
|
||||||
|
#### 4. Update SmartProxy Initialization
|
||||||
|
**File**: `ts/proxies/smart-proxy/index.ts`
|
||||||
|
|
||||||
|
- [ ] Pass `certProvisionFunction` from options to CertificateManager
|
||||||
|
- [ ] Validate function if provided
|
||||||
|
|
||||||
|
#### 5. Add Type Safety and Validation
|
||||||
|
**Tasks**:
|
||||||
|
- [ ] Validate returned certificate has required fields (`cert`, `key`; `ca` appears to be optional, given the `customCert.ca || ''` default in step 2)
|
||||||
|
- [ ] Check certificate validity dates
|
||||||
|
- [ ] Ensure certificate matches requested domain
|
||||||
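The three validation tasks above could be sketched roughly as follows, using Node's built-in `crypto.X509Certificate`. This is an assumption-laden sketch: `IProvisionedCert` is a hypothetical stand-in for `plugins.tsclass.network.ICert`, and `validateProvisionedCert` is not an existing method.

```typescript
import { X509Certificate, createPrivateKey } from 'node:crypto';

// Hypothetical shape standing in for the real ICert type.
interface IProvisionedCert {
  cert: string;
  key: string;
  ca?: string;
}

function parseCert(pem: string): X509Certificate | undefined {
  try {
    return new X509Certificate(pem);
  } catch {
    return undefined;
  }
}

// Returns a list of problems; an empty list means the certificate looks usable.
export function validateProvisionedCert(
  candidate: IProvisionedCert,
  domain: string,
  now: Date = new Date(),
): string[] {
  const problems: string[] = [];
  if (!candidate.cert) problems.push('missing cert');
  if (!candidate.key) problems.push('missing key');
  if (problems.length > 0) return problems;

  const x509 = parseCert(candidate.cert);
  if (!x509) return ['cert is not a parseable X.509 certificate'];

  // Validity window check.
  if (new Date(x509.validFrom) > now) problems.push('certificate is not yet valid');
  if (new Date(x509.validTo) <= now) problems.push('certificate has expired');

  // checkHost returns undefined when the certificate does not cover the name.
  if (x509.checkHost(domain) === undefined) {
    problems.push(`certificate does not cover ${domain}`);
  }

  // Confirm that the private key belongs to the certificate.
  try {
    if (!x509.checkPrivateKey(createPrivateKey(candidate.key))) {
      problems.push('private key does not match certificate');
    }
  } catch {
    problems.push('key is not a parseable private key');
  }
  return problems;
}
```

Returning a list of problems instead of throwing lets the caller log every issue at once and then decide whether to fall back to Let's Encrypt.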
|
|
||||||
|
#### 6. Update Certificate Renewal Logic
|
||||||
|
**Location**: `checkAndRenewCertificates()`
|
||||||
|
|
||||||
|
- [ ] Ensure renewal checks work for both ACME and custom certificates
|
||||||
|
- [ ] Custom certificates should go through the same `provisionAcmeCertificate()` path
|
||||||
|
- [ ] The existing renewal logic already calls `provisionCertificate()` which will use our modified flow
|
||||||
|
|
||||||
|
```typescript
// No changes needed here - the existing renewal logic will automatically
// use the custom provision function when calling provisionCertificate()
private async checkAndRenewCertificates(): Promise<void> {
  // Existing code already handles this correctly
  for (const route of routes) {
    if (this.shouldRenewCertificate(cert, renewThreshold)) {
      // This will call provisionCertificate -> provisionAcmeCertificate
      // which now includes our custom function check
      await this.provisionCertificate(route);
    }
  }
}
```
|
||||||
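Because renewal is driven only by the stored expiry date, the same check can serve ACME and custom certificates alike. A minimal sketch of such a check is shown below; the standalone function `shouldRenew` and its day-based threshold are assumptions, since the actual `shouldRenewCertificate()` signature is not shown in this plan.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Renew once the remaining lifetime drops to the threshold or below.
export function shouldRenew(
  expiryDate: Date,
  renewThresholdDays: number,
  now: Date = new Date(),
): boolean {
  const msRemaining = expiryDate.getTime() - now.getTime();
  return msRemaining <= renewThresholdDays * MS_PER_DAY;
}
```

For example, with a 30-day threshold a certificate expiring in 10 days is renewed, while one expiring in 60 days is left alone. This is why a correct `expiryDate` for custom certificates (step 3) matters for the renewal cycle.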
|
|
||||||
|
#### 7. Add Integration Tests
|
||||||
|
**File**: `test/test.certificate-provision.ts`
|
||||||
|
|
||||||
|
- [ ] Test custom certificate provision
|
||||||
|
- [ ] Test fallback to Let's Encrypt ('http01' return)
|
||||||
|
- [ ] Test error handling
|
||||||
|
- [ ] Test renewal with custom function
|
||||||
|
|
||||||
|
#### 8. Update Documentation
|
||||||
|
**Files**:
|
||||||
|
- [ ] Update interface documentation
|
||||||
|
- [ ] Add examples to README
|
||||||
|
- [ ] Document ICert structure requirements
|
||||||
|
|
||||||
|
### API Design
|
||||||
|
|
||||||
|
```typescript
// Example usage
const proxy = new SmartProxy({
  certProvisionFunction: async (domain: string) => {
    // Option 1: Return a custom certificate for internal domains
    if (domain.endsWith('.internal')) {
      const customCert = await myCustomCA.generateCert(domain);
      return {
        cert: customCert.certificate,
        key: customCert.privateKey,
        ca: customCert.chain
      };
    }

    // Option 2: Delegate every other domain to Let's Encrypt
    return 'http01';
  },
  certProvisionFallbackToAcme: true, // Default: true
  routes: [...]
});
```
|
||||||
|
|
||||||
|
### Configuration Options to Add
|
||||||
|
|
||||||
|
```typescript
interface ISmartProxyOptions {
  // Existing options...

  // Custom certificate provision function
  certProvisionFunction?: (domain: string) => Promise<TSmartProxyCertProvisionObject>;

  // Whether to fall back to ACME if custom provision fails
  certProvisionFallbackToAcme?: boolean; // Default: true
}
```
|
||||||
|
|
||||||
|
### Error Handling Strategy
|
||||||
|
|
||||||
|
1. **Custom Function Errors**:
|
||||||
|
- Log detailed error with domain context
|
||||||
|
- Option A: Fallback to Let's Encrypt (safer)
|
||||||
|
- Option B: Fail certificate provisioning (stricter)
|
||||||
|
- Configurable via the `certProvisionFallbackToAcme` option (default: `true`, i.e. Option A)
|
||||||
|
|
||||||
|
2. **Invalid Certificate Returns**:
|
||||||
|
- Validate certificate structure
|
||||||
|
- Check expiry dates
|
||||||
|
- Verify domain match
|
||||||
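The choice between falling back (Option A) and failing (Option B) can be isolated in a small pure function, which also makes it easy to test. The sketch below is only an illustration under stated assumptions: `ICertLike`, `TProvisionDecision`, and `decideProvisioning` are hypothetical names, and the real logic would live inside `provisionAcmeCertificate()`.

```typescript
// Hypothetical certificate shape; the plan uses plugins.tsclass.network.ICert.
interface ICertLike {
  cert: string;
  key: string;
  ca?: string;
}

type TProvisionDecision =
  | { action: 'use-custom'; cert: ICertLike }
  | { action: 'use-acme'; reason: 'delegated' | 'fallback-after-error' };

export async function decideProvisioning(
  domain: string,
  provision: (domain: string) => Promise<ICertLike | 'http01'>,
  fallbackToAcme: boolean = true,
): Promise<TProvisionDecision> {
  try {
    const result = await provision(domain);
    if (result === 'http01') {
      // The custom function explicitly delegated to Let's Encrypt.
      return { action: 'use-acme', reason: 'delegated' };
    }
    return { action: 'use-custom', cert: result };
  } catch (error) {
    // Option B: strict mode rethrows so provisioning fails visibly.
    if (!fallbackToAcme) throw error;
    // Option A: continue with the existing ACME flow.
    return { action: 'use-acme', reason: 'fallback-after-error' };
  }
}
```

Keeping the `reason` in the result lets the caller log whether ACME was chosen deliberately or as a recovery path.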
|
|
||||||
|
### Testing Plan
|
||||||
|
|
||||||
|
1. **Unit Tests**:
|
||||||
|
- Mock certProvisionFunction returns
|
||||||
|
- Test validation logic
|
||||||
|
- Test error scenarios
|
||||||
|
|
||||||
|
2. **Integration Tests**:
|
||||||
|
- Real certificate generation
|
||||||
|
- Renewal cycle testing
|
||||||
|
- Mixed custom/Let's Encrypt scenarios
|
||||||
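For the unit tests, a small mock of `certProvisionFunction` that records its calls may be enough to cover the custom, delegated, and failing cases. The helper below is a sketch with hypothetical names (`createMockProvisioner`, `ICertLike`); it does not depend on any particular test runner.

```typescript
// Hypothetical certificate shape used only for the mock.
interface ICertLike {
  cert: string;
  key: string;
  ca?: string;
}

export function createMockProvisioner(
  certs: Map<string, ICertLike>,
  failingDomains: Set<string> = new Set<string>(),
) {
  // Every requested domain is recorded so tests can assert on call order.
  const calls: string[] = [];

  const provision = async (domain: string): Promise<ICertLike | 'http01'> => {
    calls.push(domain);
    if (failingDomains.has(domain)) {
      throw new Error(`simulated provisioning failure for ${domain}`);
    }
    // Domains without a prepared certificate are delegated to Let's Encrypt.
    return certs.get(domain) ?? 'http01';
  };

  return { provision, calls };
}
```

The returned `provision` function can be passed as `certProvisionFunction`, and `calls` can be inspected afterwards to confirm that renewal invoked the function again.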
|
|
||||||
|
### Backward Compatibility
|
||||||
|
|
||||||
|
- If no `certProvisionFunction` provided, behavior unchanged
|
||||||
|
- Existing routes with 'auto' certificates continue using Let's Encrypt
|
||||||
|
- No breaking changes to existing API
|
||||||
|
|
||||||
|
### Future Enhancements
|
||||||
|
|
||||||
|
1. **Per-Route Custom Functions**:
|
||||||
|
- Allow different provision functions per route
|
||||||
|
- Override global function at route level
|
||||||
|
|
||||||
|
2. **Certificate Events**:
|
||||||
|
- Emit events for custom cert provisioning
|
||||||
|
- Allow monitoring/logging hooks
|
||||||
|
|
||||||
|
3. **Async Certificate Updates**:
|
||||||
|
- Support updating certificates outside renewal cycle
|
||||||
|
- Hot-reload certificates without restart
|
||||||
|
|
||||||
|
### Implementation Notes
|
||||||
|
|
||||||
|
1. **Certificate Status Tracking**:
|
||||||
|
- The `updateCertStatus()` method needs to support a new type: 'custom'
|
||||||
|
- Current types are 'acme' and 'static'
|
||||||
|
- This helps distinguish custom certificates in monitoring/logs
|
||||||
|
|
||||||
|
2. **Certificate Store Integration**:
|
||||||
|
- Custom certificates are stored the same way as ACME certificates
|
||||||
|
- They participate in the same renewal cycle
|
||||||
|
- The store handles persistence across restarts
|
||||||
|
|
||||||
|
3. **Existing Methods to Reuse**:
|
||||||
|
- `applyCertificate()` - Already handles applying certs to routes
|
||||||
|
- `isCertificateValid()` - Can validate custom certificates
|
||||||
|
- `certStore.saveCertificate()` - Handles storage
|
||||||
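To illustrate note 1, the certificate source could be modeled as a union type that gains a `'custom'` member. The sketch below is hypothetical: the status values and the `CertStatusRegistry` class are assumptions for illustration, and only the `'acme'`, `'static'`, and `'custom'` source names come from this plan.

```typescript
type TCertSource = 'acme' | 'static' | 'custom';

// Assumed status shape; the real structure used by updateCertStatus() may differ.
interface ICertStatusSketch {
  domain: string;
  status: 'valid' | 'pending' | 'expired' | 'error';
  source: TCertSource;
  expiryDate?: Date;
}

export class CertStatusRegistry {
  private readonly statuses = new Map<string, ICertStatusSketch>();

  // Stores or overwrites the status for a route.
  public update(routeName: string, entry: ICertStatusSketch): void {
    this.statuses.set(routeName, entry);
  }

  // Aggregates by source so monitoring can tell custom certificates apart.
  public countBySource(): Record<TCertSource, number> {
    const counts: Record<TCertSource, number> = { acme: 0, static: 0, custom: 0 };
    this.statuses.forEach((entry) => {
      counts[entry.source] += 1;
    });
    return counts;
  }
}
```

With a closed union, the compiler flags any `switch` or lookup table that has not yet been updated for the new `'custom'` source.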
|
|
||||||
|
### Implementation Priority
|
||||||
|
|
||||||
|
1. Core functionality (steps 1-4)
|
||||||
|
2. Type safety and validation (step 5)
|
||||||
|
3. Renewal support (step 6)
|
||||||
|
4. Tests (step 7)
|
||||||
|
5. Documentation (step 8)
|
||||||
|
|
||||||
|
### Estimated Effort
|
||||||
|
|
||||||
|
- Core implementation: 4-6 hours
|
||||||
|
- Testing: 2-3 hours
|
||||||
|
- Documentation: 1 hour
|
||||||
|
- Total: ~7-10 hours
|