BREAKING CHANGE(service): expand service lifecycle management with instance-aware hooks, startup timeouts, labels, readiness waits, and auto-restart support

2026-03-21 10:57:27 +00:00
parent 0b78b05101
commit 0f93e86cc1
11 changed files with 3168 additions and 2889 deletions

readme.md

@@ -1,6 +1,6 @@
# @push.rocks/taskbuffer 🚀
> **Modern TypeScript task orchestration with constraint-based concurrency, smart buffering, scheduling, labels, and real-time event streaming**
> **Modern TypeScript task orchestration and service lifecycle management with constraint-based concurrency, smart buffering, scheduling, health checks, and real-time event streaming**
[![npm version](https://img.shields.io/npm/v/@push.rocks/taskbuffer.svg)](https://www.npmjs.com/package/@push.rocks/taskbuffer)
[![TypeScript](https://img.shields.io/badge/TypeScript-5.x-blue.svg)](https://www.typescriptlang.org/)
@@ -21,6 +21,7 @@ For reporting bugs, issues, or security vulnerabilities, please visit [community
- **🏷️ Labels** — Attach arbitrary `Record<string, string>` metadata (userId, tenantId, etc.) for multi-tenant filtering
- **📡 Push-Based Events** — rxjs `Subject<ITaskEvent>` on every Task and TaskManager for real-time state change notifications
- **🛡️ Error Handling** — Configurable error propagation with `catchErrors`, error tracking, and clear error state
- **🩺 Service Lifecycle Management** — `Service` and `ServiceManager` for long-running components (databases, servers, queues) with health checks, auto-restart, dependency ordering, and instance access
- **🎨 Web Component Dashboard** — Built-in Lit-based dashboard for real-time task visualization
- **🌐 Distributed Coordination** — Abstract coordinator for multi-instance task deduplication
@@ -788,6 +789,196 @@ manager.descheduleTaskByName('Deploy'); // Remove cron schedule only
manager.removeConstraintGroup('domain-mutex'); // By name
```
## 🩺 Service Lifecycle Management
For long-running components like database connections, HTTP servers, and message queues, taskbuffer provides `Service` and `ServiceManager` — a complete lifecycle management system with health checks, dependency ordering, retry, auto-restart, and typed instance access.
### Basic Service — Builder Pattern
```typescript
import { Service, ServiceManager } from '@push.rocks/taskbuffer';

const dbService = new Service<DatabasePool>('Database')
  .critical()
  .withStart(async () => {
    const pool = new DatabasePool({ host: 'localhost', port: 5432 });
    await pool.connect();
    return pool; // stored as service.instance
  })
  .withStop(async (pool) => {
    await pool.disconnect(); // receives the instance from start
  })
  .withHealthCheck(async (pool) => {
    return await pool.ping(); // receives the instance too
  });

await dbService.start();
dbService.instance!.query('SELECT 1'); // typed access to the pool
await dbService.stop();
```
The `start()` return value is stored as `service.instance` and automatically passed to `stop()` and `healthCheck()` functions — no need for external closures or shared variables.
### Service with Dependencies & Health Checks
```typescript
const cacheService = new Service('Redis')
  .optional()
  .withStart(async () => new RedisClient())
  .withStop(async (client) => client.quit())
  .withHealthCheck(async (client) => client.isReady, {
    intervalMs: 10000, // check every 10s
    timeoutMs: 3000, // 3s timeout per check
    failuresBeforeDegraded: 3, // 3 consecutive failures → 'degraded'
    failuresBeforeFailed: 5, // 5 consecutive failures → 'failed'
    autoRestart: true, // auto-restart when failed
    maxAutoRestarts: 5, // give up after 5 restart attempts
    autoRestartDelayMs: 2000, // start with 2s delay
    autoRestartBackoffFactor: 2, // double delay each attempt
  });

const apiService = new Service('API')
  .critical()
  .dependsOn('Database', 'Redis')
  .withStart(async () => {
    const server = createServer();
    await server.listen(3000);
    return server;
  })
  .withStop(async (server) => server.close())
  .withStartupTimeout(10000); // fail if start takes > 10s
```
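The auto-restart options above imply a growing delay between restart attempts. As a rough illustration only, assuming the delay for attempt `n` is `autoRestartDelayMs * autoRestartBackoffFactor^(n - 1)` with no cap or jitter (the README does not say whether the library applies either), the schedule could be computed like this. The helper `autoRestartDelays` and the interface `AutoRestartSketchConfig` are hypothetical and not part of taskbuffer:

```typescript
// Hypothetical helper, not part of @push.rocks/taskbuffer.
// Assumption: delay(n) = autoRestartDelayMs * autoRestartBackoffFactor^(n - 1), uncapped.
interface AutoRestartSketchConfig {
  maxAutoRestarts: number;
  autoRestartDelayMs: number;
  autoRestartBackoffFactor: number;
}

function autoRestartDelays(config: AutoRestartSketchConfig): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < config.maxAutoRestarts; attempt++) {
    // attempt 0 uses the base delay; each later attempt multiplies by the factor
    delays.push(
      config.autoRestartDelayMs * Math.pow(config.autoRestartBackoffFactor, attempt),
    );
  }
  return delays;
}

// Same values as the Redis health check config above
const redisDelays = autoRestartDelays({
  maxAutoRestarts: 5,
  autoRestartDelayMs: 2000,
  autoRestartBackoffFactor: 2,
});
console.log(redisDelays); // [ 2000, 4000, 8000, 16000, 32000 ]
```

Under these assumptions, the five attempts span roughly a minute of waiting before the service gives up.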
### ServiceManager — Orchestration
`ServiceManager` handles dependency-ordered startup, failure isolation, and aggregated health reporting:
```typescript
const manager = new ServiceManager({
  name: 'MyApp',
  startupTimeoutMs: 60000, // global startup timeout
  shutdownTimeoutMs: 15000, // per-service shutdown timeout
  defaultRetry: { maxRetries: 3, baseDelayMs: 1000, backoffFactor: 2 },
});

manager.addService(dbService);
manager.addService(cacheService);
manager.addService(apiService);

await manager.start();
// ✅ Starts Database and Redis in parallel (they are independent of each other),
//    then API (after both deps are running)
// ❌ If Database (critical) fails → rollback, stop everything, throw
// ⚠️ If Redis (optional) fails → log warning, continue, health = 'degraded'

// Health aggregation
const health = manager.getHealth();
// { overall: 'healthy', services: [...], startedAt: 1706284800000, uptime: 42000 }

// Cascade restart: stops dependents first, restarts target, then restarts dependents
await manager.restartService('Database');

// Graceful reverse-order shutdown
await manager.stop();
```
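To make the dependency-ordered startup above more concrete, here is a minimal, self-contained sketch of one way to group services into start levels. It is not the library's implementation. The function `computeStartLevels` and its input shape are hypothetical, and the sketch assumes every dependency name refers to a registered service:

```typescript
// Hypothetical sketch, not the ServiceManager implementation.
// Groups services into levels: each service depends only on services in earlier levels.
function computeStartLevels(deps: Record<string, string[]>): string[][] {
  const remaining = new Set<string>(Object.keys(deps));
  const started = new Set<string>();
  const levels: string[][] = [];

  while (remaining.size > 0) {
    // A service is ready once all of its dependencies have been placed in a level
    const ready = Array.from(remaining).filter((name) =>
      (deps[name] ?? []).every((dep) => started.has(dep)),
    );
    if (ready.length === 0) {
      throw new Error(
        'Circular or missing dependency among: ' + Array.from(remaining).join(', '),
      );
    }
    for (const name of ready) {
      remaining.delete(name);
      started.add(name);
    }
    levels.push(ready);
  }
  return levels;
}

const startLevels = computeStartLevels({
  Database: [],
  Redis: [],
  API: ['Database', 'Redis'],
});
console.log(startLevels); // [ [ 'Database', 'Redis' ], [ 'API' ] ]
```

Services within one level can be started in parallel, and reversing the list of levels gives a plausible shutdown order.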
### Subclass Pattern
For complex services, extend `Service` and override the lifecycle hooks:
```typescript
class PostgresService extends Service<Pool> {
  constructor(private config: PoolConfig) {
    super('Postgres');
    this.critical();
  }

  protected async serviceStart(): Promise<Pool> {
    const pool = new Pool(this.config);
    await pool.connect();
    return pool;
  }

  protected async serviceStop(): Promise<void> {
    await this.instance?.end();
  }

  protected async serviceHealthCheck(): Promise<boolean> {
    const result = await this.instance?.query('SELECT 1');
    return result?.rows.length === 1;
  }
}
```
### Waiting for Service Readiness
Programmatically wait for a service to reach a specific state:
```typescript
// Wait for the service to be running (with timeout)
await dbService.waitForRunning(10000);

// Wait for any of several states
await dbService.waitForState(['running', 'degraded'], 5000);

// Wait for shutdown
await dbService.waitForStopped();
```
### Service Labels
Tag services with metadata for filtering and grouping:
```typescript
const service = new Service('Redis')
  .withLabels({ type: 'cache', env: 'production', region: 'eu-west' })
  .withStart(async () => new RedisClient())
  .withStop(async (client) => client.quit());

// Query by label in ServiceManager
const caches = manager.getServicesByLabel('type', 'cache');
const prodStatuses = manager.getServicesStatusByLabel('env', 'production');
```
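As a rough mental model of label queries, the sketch below filters a plain array by an exact key/value match. Exact-match semantics are an assumption, and `LabelledSketch` and `filterByLabel` are hypothetical stand-ins rather than the library's `Service` class or `getServicesByLabel` method:

```typescript
// Hypothetical stand-in for a labelled service; not the library's Service class.
interface LabelledSketch {
  name: string;
  labels: Record<string, string>;
}

// Keeps entries whose label `key` is set to exactly `value` (assumed semantics).
function filterByLabel<T extends LabelledSketch>(
  items: T[],
  key: string,
  value: string,
): T[] {
  return items.filter((item) => item.labels[key] === value);
}

const registered: LabelledSketch[] = [
  { name: 'Redis', labels: { type: 'cache', env: 'production' } },
  { name: 'Database', labels: { type: 'db', env: 'production' } },
  { name: 'Memcached', labels: { type: 'cache', env: 'staging' } },
];

const cacheNames = filterByLabel(registered, 'type', 'cache').map((s) => s.name);
console.log(cacheNames); // [ 'Redis', 'Memcached' ]
```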
### Service Events
Every `Service` emits events via an rxjs `Subject<IServiceEvent>`:
```typescript
service.eventSubject.subscribe((event) => {
  console.log(`[${event.type}] ${event.serviceName} → ${event.state}`);
});
// [started] Database → running
// [healthCheck] Database → running
// [degraded] Database → degraded
// [autoRestarting] Database → failed
// [started] Database → running
// [recovered] Database → running
// [stopped] Database → stopped
```
| Event Type | When |
| --- | --- |
| `'started'` | Service started successfully |
| `'stopped'` | Service stopped |
| `'failed'` | Service start failed or health check threshold exceeded |
| `'degraded'` | Health check failures exceeded `failuresBeforeDegraded` |
| `'recovered'` | Health check succeeded while in degraded state |
| `'retrying'` | ServiceManager retrying a failed start attempt |
| `'healthCheck'` | Health check completed (success or failure) |
| `'autoRestarting'` | Auto-restart scheduled after health check failure |
`ServiceManager.serviceSubject` aggregates events from all registered services.
### Service State Machine
```
stopped → starting → running → degraded → failed
   ↑                    ↓           ↓         ↓
   └── stopping ←───────────────────┴─────────┘
                                (auto-restart)
```
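The health-driven part of the diagram (`running → degraded → failed`) can be read as a function of consecutive failed health checks. The sketch below assumes a state is entered once the count reaches the configured threshold, as the comments on `failuresBeforeDegraded` and `failuresBeforeFailed` suggest, and that a successful check resets the count to zero. `stateForFailures` and `ThresholdSketch` are hypothetical names, not library API:

```typescript
// Hypothetical sketch; state names come from the README, the function is not library API.
type SketchState = 'running' | 'degraded' | 'failed';

interface ThresholdSketch {
  failuresBeforeDegraded: number;
  failuresBeforeFailed: number;
}

// Maps a count of consecutive failed health checks to a service state.
// Assumption: a threshold takes effect as soon as the count reaches it.
function stateForFailures(consecutiveFailures: number, t: ThresholdSketch): SketchState {
  if (consecutiveFailures >= t.failuresBeforeFailed) return 'failed';
  if (consecutiveFailures >= t.failuresBeforeDegraded) return 'degraded';
  return 'running';
}

const thresholds: ThresholdSketch = { failuresBeforeDegraded: 3, failuresBeforeFailed: 5 };
const progression = [0, 2, 3, 4, 5].map((n) => stateForFailures(n, thresholds));
console.log(progression); // [ 'running', 'running', 'degraded', 'degraded', 'failed' ]
```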
## 🎨 Web Component Dashboard
Visualize your tasks in real-time with the included Lit-based web component:
@@ -970,6 +1161,8 @@ const acmeTasks = manager.getTasksMetadataByLabel('tenantId', 'acme');
| `TaskOnce` | Single-execution guard |
| `TaskDebounced` | Debounced task using rxjs |
| `TaskStep` | Step tracking unit (internal, exposed via metadata) |
| `Service<T>` | Long-running component with start/stop lifecycle, health checks, auto-restart, and typed instance access |
| `ServiceManager` | Service orchestrator with dependency ordering, failure isolation, retry, and health aggregation |
### Task Constructor Options
@@ -1080,10 +1273,72 @@ const acmeTasks = manager.getTasksMetadataByLabel('tenantId', 'acme');
| `taskMap` | `ObjectMap<Task>` | Internal task registry |
| `constraintGroups` | `TaskConstraintGroup[]` | Registered constraint groups |
### Service Builder Methods
| Method | Returns | Description |
| --- | --- | --- |
| `critical()` | `this` | Mark as critical (startup failure aborts ServiceManager) |
| `optional()` | `this` | Mark as optional (startup failure is tolerated) |
| `dependsOn(...names)` | `this` | Declare dependencies by service name |
| `withStart(fn)` | `this` | Set start function: `() => Promise<T>` |
| `withStop(fn)` | `this` | Set stop function: `(instance: T) => Promise<void>` |
| `withHealthCheck(fn, config?)` | `this` | Set health check: `(instance: T) => Promise<boolean>` |
| `withRetry(config)` | `this` | Set retry config: `{ maxRetries, baseDelayMs, maxDelayMs, backoffFactor }` |
| `withStartupTimeout(ms)` | `this` | Per-service startup timeout |
| `withLabels(labels)` | `this` | Attach key-value labels |
### Service Methods
| Method | Returns | Description |
| --- | --- | --- |
| `start()` | `Promise<T>` | Start the service (no-op if already running) |
| `stop()` | `Promise<void>` | Stop the service (no-op if already stopped) |
| `checkHealth()` | `Promise<boolean \| undefined>` | Run health check manually |
| `waitForState(target, timeoutMs?)` | `Promise<void>` | Wait for service to reach a state |
| `waitForRunning(timeoutMs?)` | `Promise<void>` | Wait for `'running'` state |
| `waitForStopped(timeoutMs?)` | `Promise<void>` | Wait for `'stopped'` state |
| `getStatus()` | `IServiceStatus` | Full status snapshot |
| `setLabel(key, value)` | `void` | Set a label |
| `getLabel(key)` | `string \| undefined` | Get a label value |
| `removeLabel(key)` | `boolean` | Remove a label |
| `hasLabel(key, value?)` | `boolean` | Check label existence / value |
### Service Properties
| Property | Type | Description |
| --- | --- | --- |
| `name` | `string` | Service identifier |
| `state` | `TServiceState` | Current state (`stopped`, `starting`, `running`, `degraded`, `failed`, `stopping`) |
| `instance` | `T \| undefined` | The value returned from `start()` |
| `criticality` | `TServiceCriticality` | `'critical'` or `'optional'` |
| `dependencies` | `string[]` | Dependency names |
| `labels` | `Record<string, string>` | Attached labels |
| `eventSubject` | `Subject<IServiceEvent>` | rxjs Subject emitting lifecycle events |
| `errorCount` | `number` | Total error count |
| `retryCount` | `number` | Retry attempts during last startup |
### ServiceManager Methods
| Method | Returns | Description |
| --- | --- | --- |
| `addService(service)` | `void` | Register a service |
| `addServiceFromOptions(options)` | `Service<T>` | Create and register from options |
| `removeService(name)` | `void` | Remove service (throws if others depend on it) |
| `start()` | `Promise<void>` | Start all services in dependency order |
| `stop()` | `Promise<void>` | Stop all services in reverse order |
| `restartService(name)` | `Promise<void>` | Cascade restart with dependents |
| `getService(name)` | `Service \| undefined` | Look up by name |
| `getServiceStatus(name)` | `IServiceStatus \| undefined` | Single service status |
| `getAllStatuses()` | `IServiceStatus[]` | All service statuses |
| `getHealth()` | `IServiceManagerHealth` | Aggregated health report |
| `getServicesByLabel(key, value)` | `Service[]` | Filter services by label |
| `getServicesStatusByLabel(key, value)` | `IServiceStatus[]` | Filter statuses by label |
### Exported Types
```typescript
import type {
// Task types
ITaskMetadata,
ITaskExecutionReport,
ITaskExecution,
@@ -1096,6 +1351,18 @@ import type {
IRateLimitConfig,
TResultSharingMode,
StepNames,
// Service types
IServiceOptions,
IServiceStatus,
IServiceEvent,
IServiceManagerOptions,
IServiceManagerHealth,
IRetryConfig,
IHealthCheckConfig,
TServiceState,
TServiceCriticality,
TServiceEventType,
TOverallHealth,
} from '@push.rocks/taskbuffer';
```