fix(certificate-manager, smart-proxy): Fix race condition in ACME certificate provisioning and refactor certificate manager initialization to defer provisioning until after port listeners are active
docs/acme-timing-fix.md
# ACME Certificate Provisioning Timing Fix (v19.3.9)

## Problem Description

In SmartProxy v19.3.8 and earlier, ACME certificate provisioning would start immediately during SmartProxy initialization, before the required ports were actually listening. This caused ACME HTTP-01 challenges to fail because the challenge port (typically port 80) was not ready to accept connections when Let's Encrypt tried to validate the challenge.

## Root Cause

The certificate manager was initialized and immediately started provisioning certificates as part of the SmartProxy startup sequence:

1. `SmartProxy.start()` called
2. Certificate manager initialized
3. Certificate provisioning started immediately (including ACME challenges)
4. Port listeners started afterwards
5. ACME challenges would fail because port 80 wasn't listening yet

This race condition meant that when Let's Encrypt tried to connect to port 80 to validate the HTTP-01 challenge, the connection would be refused.

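The refused connection described above can be reproduced outside SmartProxy. The following is a minimal sketch, assuming Node.js and its built-in `node:net` module; the helper names (`canConnect`, `listen`, `close`, `demonstrateRace`) are hypothetical and are not part of SmartProxy. It probes a local port before and after a listener is bound to it.

```typescript
import * as net from 'node:net';

// Try a single TCP connection and report whether it was accepted.
function canConnect(port: number): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host: '127.0.0.1' });
    socket.once('connect', () => {
      socket.destroy();
      resolve(true);
    });
    // A refused connection surfaces as an 'error' event (ECONNREFUSED).
    socket.once('error', () => resolve(false));
  });
}

// Resolve with the bound port only once the server is actually listening.
function listen(server: net.Server, port: number): Promise<number> {
  return new Promise((resolve, reject) => {
    server.once('error', reject);
    server.listen(port, '127.0.0.1', () => {
      resolve((server.address() as net.AddressInfo).port);
    });
  });
}

function close(server: net.Server): Promise<void> {
  return new Promise((resolve) => server.close(() => resolve()));
}

async function demonstrateRace(): Promise<{ before: boolean; after: boolean }> {
  // Reserve a free port, then release it so nothing is listening there.
  const probe = net.createServer();
  const port = await listen(probe, 0);
  await close(probe);

  // Pre-fix order: the "validator" connects before any listener exists.
  const before = await canConnect(port);

  // Post-fix order: bind the listener first, then let the "validator" connect.
  const server = net.createServer((socket) => socket.destroy());
  await listen(server, port);
  const after = await canConnect(port);
  await close(server);

  return { before, after };
}
```

Calling `demonstrateRace()` should yield `before: false` (the connection is refused, as in the pre-fix startup order) and `after: true` once the listener is bound.
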
## Solution

The fix defers certificate provisioning until after all ports are listening and ready:

### Changes to SmartCertManager

```typescript
// Modified initialize() to skip automatic provisioning
public async initialize(): Promise<void> {
  // ... initialization code ...

  // Skip automatic certificate provisioning during initialization
  console.log('Certificate manager initialized. Deferring certificate provisioning until after ports are listening.');

  // Start renewal timer
  this.startRenewalTimer();
}

// Made provisionAllCertificates public to allow direct calling after ports are ready
public async provisionAllCertificates(): Promise<void> {
  // ... certificate provisioning code ...
}
```

### Changes to SmartProxy

```typescript
public async start() {
  // ... initialization code ...

  // Start port listeners using the PortManager
  await this.portManager.addPorts(listeningPorts);

  // Now that ports are listening, provision any required certificates
  if (this.certManager) {
    console.log('Starting certificate provisioning now that ports are ready');
    await this.certManager.provisionAllCertificates();
  }

  // ... rest of startup code ...
}
```

## Timing Sequence

### Before (v19.3.8 and earlier)
1. Initialize certificate manager
2. Start ACME provisioning immediately
3. ACME challenge fails (port not ready)
4. Start port listeners
5. Port 80 now listening (too late)

### After (v19.3.9)
1. Initialize certificate manager (provisioning deferred)
2. Start port listeners
3. Port 80 now listening
4. Start ACME provisioning
5. ACME challenge succeeds

## Configuration

No configuration changes are required. The timing fix is automatic and transparent to users.

## Testing

The fix is verified by the test in `test/test.acme-timing-simple.ts`, which checks that:

1. Certificate manager is initialized first
2. Ports start listening
3. Certificate provisioning happens only after ports are ready

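The ordering that the test checks can be illustrated with stubs. This is a minimal sketch, not the contents of `test/test.acme-timing-simple.ts`: `StubCertManager`, `StubPortManager`, and `startInFixedOrder` are hypothetical stand-ins that only record the order in which the startup steps run.

```typescript
// Hypothetical stand-in for the certificate manager; it only records calls.
class StubCertManager {
  constructor(private readonly log: string[]) {}

  // Mirrors the v19.3.9 behavior: initialize() does not provision.
  async initialize(): Promise<void> {
    this.log.push('cert-manager-initialized');
  }

  async provisionAllCertificates(): Promise<void> {
    this.log.push('provisioning-started');
  }
}

// Hypothetical stand-in for the port manager; a real one would resolve
// only after every listener is bound.
class StubPortManager {
  constructor(private readonly log: string[]) {}

  async addPorts(ports: number[]): Promise<void> {
    this.log.push(`ports-listening:${ports.join(',')}`);
  }
}

// Runs the three startup steps in the post-fix order and returns the record.
async function startInFixedOrder(ports: number[]): Promise<string[]> {
  const log: string[] = [];
  const certManager = new StubCertManager(log);
  const portManager = new StubPortManager(log);

  await certManager.initialize();
  await portManager.addPorts(ports);
  await certManager.provisionAllCertificates();

  return log;
}
```

An ordering assertion then reduces to comparing the returned log against the expected sequence: initialization, then listening ports, then provisioning.
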
## Impact

This fix ensures that:
- ACME HTTP-01 challenges no longer fail on the first attempt because the challenge port is not yet listening
- "Connection refused" errors caused by this startup race no longer occur during certificate provisioning
- Certificate acquisition is more reliable
- No manual retries are needed for challenges that failed because of the race

## Migration

Simply update to SmartProxy v19.3.9 or later. The fix is backward compatible and requires no changes to existing code or configuration.