# @push.rocks/smartmigration: Plan

> reread /home/philkunz/.claude/CLAUDE.md before working from this plan

## Context

`@push.rocks/smartmigration` is a brand-new module that does not yet exist (the directory at `/mnt/data/lossless/push.rocks/smartmigration` is empty except for `.git/`). Its purpose is to **unify migrations across MongoDB and S3** with a small, builder-style API designed to be invoked on **SaaS app startup**: it inspects the current data version, computes the sequential chain of steps required to reach the app's target version, executes them safely, and stamps progress into a ledger so re-runs are no-ops.

**Why this needs to exist.** Across the push.rocks ecosystem (`smartdata`, `smartbucket`, `smartdb`, `mongodump`, `smartversion`) there is no migration tooling at all: a search for `migration|migrate|schemaVersion` across the relevant `ts/` trees returned zero hits. SaaS apps that ship multiple deploys per week need a deterministic way to evolve persistent state in lockstep with code, and they need it to "just work" when the app boots, not as a separate operator-driven process. Both `smartdata` and `smartbucket` already expose their underlying drivers (`SmartdataDb.mongoDb` / `SmartdataDb.mongoDbClient`, `SmartBucket.storageClient`), so smartmigration only needs to provide the **runner, ledger, and context plumbing**; it does not need to wrap mongo or S3 itself.

**Intended outcome.** A SaaS app can do this at startup and forget about it:

```ts
import { SmartMigration } from '@push.rocks/smartmigration';
import { commitinfo } from './00_commitinfo_data.js';

const migration = new SmartMigration({
  targetVersion: commitinfo.version, // app version drives data version
  db,                                // optional SmartdataDb
  bucket,                            // optional SmartBucket Bucket
});

migration
  .step('lowercase-emails')
  .from('1.0.0').to('1.1.0')
  .description('Lowercase all user emails')
  .up(async (ctx) => {
    // ctx.mongo is optional on the context type; it is set here because `db` was passed above
    await ctx.mongo!.collection('users').updateMany(
      {},
      [{ $set: { email: { $toLower: '$email' } } }],
    );
  })
  .step('reorganize-uploads')
  .from('1.1.0').to('2.0.0')
  .description('Move uploads/ to media/')
  .resumable()
  .up(async (ctx) => {
    // both are optional on the context type; present because `bucket` was passed and the step is resumable
    const bucket = ctx.bucket!;
    const checkpoint = ctx.checkpoint!;
    const cursor = bucket.createCursor('uploads/');
    const token = await checkpoint.read('cursorToken');
    if (token) cursor.setToken(token);
    while (await cursor.hasMore()) {
      for (const key of await cursor.next()) {
        await bucket.fastMove({
          sourcePath: key,
          destinationPath: 'media/' + key.slice('uploads/'.length),
        });
      }
      // persist progress after every batch so a crash resumes from here
      await checkpoint.write('cursorToken', cursor.getToken());
    }
  });

await migration.run(); // fast no-op if already at target
```

---

## Confirmed design decisions

These were chosen during planning and are locked in:

- **Step ordering:** registration order, with from/to validation. Steps execute in the order they were registered. The runner verifies each step's `from` matches the previous step's `to` (or the current ledger version) and errors out on gaps/overlaps. No DAG / topological sort.
- **Rollback:** **up-only for v1.** No `.down()` API. Forward-only migrations are simpler and safer; users restore from backup if a migration goes wrong. May be revisited in v2.

---

## Design at a glance

### Core principles

1. **One unified data version.** A single semver string represents the combined state of mongo + S3. Steps transition `from` → `to`.
   (Tracking mongo and S3 versions independently is rejected because it explodes step typing for marginal value; the app's data version is what users actually reason about.)
2. **Builder-style step definition, single options object for the runner.** The constructor takes a plain options object (`new SmartMigration({...})`); steps are added via a fluent chain (`migration.step('id').from('1.0.0').to('1.1.0').up(async ctx => {...})`). The chain returns the parent `SmartMigration` after `.up()` so multiple steps chain naturally.
3. **Drivers are exposed via context, not wrapped.** Migration `up` functions receive a `MigrationContext` that hands them both high-level (`ctx.db`, `ctx.bucket`) and raw (`ctx.mongo`, `ctx.s3`) handles. smartmigration writes no mongo or S3 wrappers of its own.
4. **Idempotent, restartable, lockable.** Re-running `migration.run()` on already-applied data is a no-op. Steps marked `.resumable()` get a per-step checkpoint store. A mongo-backed lock prevents concurrent SaaS instances from racing on the same migration.
5. **Fast on the happy path.** When `currentVersion === targetVersion`, `run()` performs **one** read against the ledger and returns. No driver calls beyond that.
6. **Order is registration order; from/to is for validation.** Steps execute in the order they were registered. The runner verifies that each step's `from` matches the previous step's `to` (or the current ledger version) and errors out on gaps/overlaps. This avoids the complexity of computing a DAG path while still catching mistakes.
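
The builder behaviour from principle 2 (a fluent chain whose terminal `.up()` hands control back to the parent) can be sketched as below. This is a minimal illustration rather than the planned implementation: `MiniMigration`, `MiniStepBuilder`, `StepDefinition` and `register()` are hypothetical names, and `.description()` / `.resumable()` are left out for brevity.

```typescript
// Hypothetical sketch: all names below are illustrative, not the module's API.
type StepHandler = () => Promise<void>;

interface StepDefinition {
  id: string;
  fromVersion: string;
  toVersion: string;
  handler: StepHandler;
}

class MiniStepBuilder {
  private parent: MiniMigration;
  private id: string;
  private fromVersion?: string;
  private toVersion?: string;

  constructor(parent: MiniMigration, id: string) {
    this.parent = parent;
    this.id = id;
  }

  // Non-terminal calls return the builder itself so they can be chained.
  from(version: string): this {
    this.fromVersion = version;
    return this;
  }

  to(version: string): this {
    this.toVersion = version;
    return this;
  }

  // Terminal call: registers the step and returns the parent,
  // which is what lets `.step(...)` follow `.up(...)` directly.
  up(handler: StepHandler): MiniMigration {
    if (!this.fromVersion || !this.toVersion) {
      throw new Error(`step "${this.id}" needs both from() and to()`);
    }
    this.parent.register({
      id: this.id,
      fromVersion: this.fromVersion,
      toVersion: this.toVersion,
      handler,
    });
    return this.parent;
  }
}

class MiniMigration {
  public readonly steps: StepDefinition[] = [];

  step(id: string): MiniStepBuilder {
    return new MiniStepBuilder(this, id);
  }

  register(step: StepDefinition): void {
    this.steps.push(step); // registration order is execution order
  }
}

const demo = new MiniMigration()
  .step('a').from('1.0.0').to('1.1.0').up(async () => {})
  .step('b').from('1.1.0').to('2.0.0').up(async () => {});
```

Because a step is only registered inside `.up()`, a chain that stops before `.up()` leaves no half-defined step behind.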

### Public API surface

```ts
// ts/index.ts
export { SmartMigration } from './classes.smartmigration.js';
export type {
  ISmartMigrationOptions,
  IMigrationStepDefinition,
  IMigrationContext,
  IMigrationCheckpoint,
  IMigrationRunResult,
  IMigrationStepResult,
  IMigrationLedgerEntry,
} from './interfaces.js';
export type { TMigrationStatus, TLedgerBackend } from './types.js';
export { SmartMigrationError } from './classes.smartmigration.js';
```

### `ISmartMigrationOptions`

```ts
export interface ISmartMigrationOptions {
  /** Target version for the data. Typically the app's package.json version. */
  targetVersion: string;

  /** Optional smartdata instance. Required if any step uses ctx.db / ctx.mongo. */
  db?: plugins.smartdata.SmartdataDb;

  /** Optional smartbucket Bucket. Required if any step uses ctx.bucket / ctx.s3. */
  bucket?: plugins.smartbucket.Bucket;

  /** Logical name for this migration ledger. Defaults to "smartmigration". */
  ledgerName?: string;

  /** Where to persist the ledger. Defaults to "mongo" if db provided, otherwise "s3". */
  ledgerBackend?: TLedgerBackend; // 'mongo' | 's3'

  /**
   * For a fresh install (no ledger AND no app data), jump straight to this version
   * instead of running every step from the earliest. Defaults to undefined,
   * which means "run every step from earliest from-version".
   */
  freshInstallVersion?: string;

  /** How long (ms) to wait for a stale lock from another instance. Default 60_000. */
  lockWaitMs?: number;

  /** How long (ms) before this instance's own lock auto-expires. Default 600_000. */
  lockTtlMs?: number;

  /** If true, run() returns the plan without executing anything. Default false. */
  dryRun?: boolean;

  /** Custom logger. Defaults to module logger.
   */
  logger?: plugins.smartlog.Smartlog;
}
```

### `IMigrationContext`

```ts
export interface IMigrationContext {
  // High-level
  db?: plugins.smartdata.SmartdataDb;
  bucket?: plugins.smartbucket.Bucket;

  // Raw drivers
  mongo?: plugins.mongodb.Db;      // db.mongoDb
  s3?: plugins.awsSdk.S3Client;    // bucket.parentSmartBucket.storageClient

  // Step metadata
  step: {
    id: string;
    fromVersion: string;
    toVersion: string;
    description?: string;
    isResumable: boolean;
  };

  // Convenience
  log: plugins.smartlog.Smartlog;
  isDryRun: boolean;

  /** Only present when step.isResumable === true. */
  checkpoint?: IMigrationCheckpoint;

  /** Convenience for transactional mongo migrations. Throws if no db configured. */
  startSession(): plugins.mongodb.ClientSession;
}

// Type arguments were lost in the source; the generics below are the assumed intent.
export interface IMigrationCheckpoint {
  read<T = unknown>(key: string): Promise<T | undefined>;
  write<T = unknown>(key: string, value: T): Promise<void>;
  clear(): Promise<void>;
}
```

### `MigrationStepBuilder` (chained, terminal `.up()` returns parent `SmartMigration`)

```ts
class MigrationStepBuilder {
  from(version: string): this;
  to(version: string): this;
  description(text: string): this;
  resumable(): this; // enables ctx.checkpoint
  up(handler: (ctx: IMigrationContext) => Promise<void>): SmartMigration;
}
```

### `SmartMigration`

```ts
class SmartMigration {
  public settings: ISmartMigrationOptions;

  constructor(options: ISmartMigrationOptions);

  /** Begin defining a step. Returns a chainable builder. */
  step(id: string): MigrationStepBuilder;

  /**
   * The startup entry point.
   * 1. Acquires the migration lock
   * 2. Reads current ledger version (treats null as fresh install)
   * 3. Validates the chain of registered steps
   * 4. Computes the plan (which steps to run)
   * 5. Executes them sequentially, checkpointing each
   * 6. Releases the lock
   * Returns a result describing what was applied/skipped.
   */
  run(): Promise<IMigrationRunResult>;

  /**
   * Returns the plan without executing. Useful for `--dry-run` style probes.
   * Return type assumed to match a dry-run result; the type argument was lost in the source.
   */
  plan(): Promise<IMigrationRunResult>;

  /** Returns the current data version from the ledger, or null if uninitialised.
   */
  getCurrentVersion(): Promise<string | null>;
}
```

### `IMigrationRunResult`

```ts
export interface IMigrationRunResult {
  currentVersionBefore: string | null;
  currentVersionAfter: string;
  targetVersion: string;
  wasUpToDate: boolean;
  wasFreshInstall: boolean;
  stepsApplied: IMigrationStepResult[];
  stepsSkipped: IMigrationStepResult[]; // populated only on dry-run / when out of range
  totalDurationMs: number;
}

export interface IMigrationStepResult {
  id: string;
  fromVersion: string;
  toVersion: string;
  status: TMigrationStatus; // 'applied' | 'skipped' | 'failed'
  durationMs: number;
  startedAt: string;  // ISO
  finishedAt: string; // ISO
  error?: { message: string; stack?: string };
}
```

---

## Ledger model

The ledger is the source of truth. There are two backends:

### Mongo ledger (default when `db` is provided)

Backed by an `EasyStore` (smartdata's existing `EasyStore`, see `/mnt/data/lossless/push.rocks/smartdata/ts/classes.easystore.ts:35-101`). The `nameId` is `smartmigration:`. Schema of the stored data:

```ts
// Type arguments were lost in the source; the Record parameters below are the assumed intent.
interface ISmartMigrationLedgerData {
  currentVersion: string | null;
  steps: Record<string, IMigrationLedgerEntry>; // keyed by step id
  lock: {
    holder: string | null;     // random instance UUID
    acquiredAt: string | null; // ISO
    expiresAt: string | null;  // ISO
  };
  checkpoints: Record<string, Record<string, unknown>>; // stepId -> { key: value }
}
```

**Why EasyStore over a custom collection?** `EasyStore` already exists, is designed for exactly this kind of singleton-config-blob use case, handles its own collection setup lazily, and avoids polluting the user's DB with smartmigration-internal classes. The whole ledger fits in one document, which makes the lock CAS trivial.

**Locking implementation.** Acquire by calling `easyStore.readAll()`, checking `lock.holder === null || lock.expiresAt < now`, then `easyStore.writeAll({ lock: { holder: instanceId, ... } })`. Re-read after writing to confirm we won the race (last-writer-wins is fine here because we re-check). Loop with backoff until `lockWaitMs` has elapsed.
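
The read, check, write, re-read loop described above might look roughly like the following sketch. It runs against an in-memory stand-in rather than a real `EasyStore`: `LockStore`, `InMemoryLockStore`, `acquireLock` and the `retryDelayMs` option are hypothetical names introduced for illustration, and only the `readAll` / `writeAll` method names are taken from the plan.

```typescript
interface LockState {
  holder: string | null;
  acquiredAt: string | null; // ISO
  expiresAt: string | null;  // ISO
}

// Hypothetical minimal surface; the real EasyStore API may differ.
interface LockStore {
  readAll(): Promise<{ lock: LockState }>;
  writeAll(data: { lock: LockState }): Promise<void>;
}

class InMemoryLockStore implements LockStore {
  private data: { lock: LockState } = {
    lock: { holder: null, acquiredAt: null, expiresAt: null },
  };
  async readAll(): Promise<{ lock: LockState }> {
    return JSON.parse(JSON.stringify(this.data)); // hand out a copy
  }
  async writeAll(data: { lock: LockState }): Promise<void> {
    this.data = JSON.parse(JSON.stringify(data));
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function acquireLock(
  store: LockStore,
  instanceId: string,
  options: { lockWaitMs: number; lockTtlMs: number; retryDelayMs?: number },
): Promise<boolean> {
  const deadline = Date.now() + options.lockWaitMs;
  let delay = options.retryDelayMs ?? 20;
  while (true) {
    const { lock } = await store.readAll();
    const now = Date.now();
    const isFree =
      lock.holder === null || lock.expiresAt === null || Date.parse(lock.expiresAt) < now;
    if (isFree) {
      await store.writeAll({
        lock: {
          holder: instanceId,
          acquiredAt: new Date(now).toISOString(),
          expiresAt: new Date(now + options.lockTtlMs).toISOString(),
        },
      });
      // Re-read to confirm that our write is the one that survived.
      const confirmed = await store.readAll();
      if (confirmed.lock.holder === instanceId) return true;
    }
    if (Date.now() + delay > deadline) return false; // give up once lockWaitMs is exhausted
    await sleep(delay);
    delay = Math.min(delay * 2, 1000); // exponential backoff with a cap
  }
}

// Usage: the second instance cannot take a lock that is held and not expired.
const lockStore = new InMemoryLockStore();
const lockDemo = (async () => {
  const first = await acquireLock(lockStore, 'instance-a', { lockWaitMs: 200, lockTtlMs: 60_000 });
  const second = await acquireLock(lockStore, 'instance-b', { lockWaitMs: 120, lockTtlMs: 60_000 });
  return { first, second };
})();
```

Two writers can still both pass the `isFree` check before either writes; the re-read narrows that window but does not close it.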
This read-then-write sequence is admittedly not a true CAS. For v1 it is adequate; v2 can move to `findOneAndUpdate` against the underlying mongo collection if races become a problem.

### S3 ledger (default when only `bucket` is provided)

A single object at `/.smartmigration/.json` containing the same `ISmartMigrationLedgerData` shape. Reads use `bucket.fastGet`, writes use `bucket.fastPut` with `overwrite: true`. Locking is **best-effort** (S3 has no CAS without conditional writes); we set `lock.expiresAt` and re-read to detect races.

**Documented limitation:** S3-only deployments should not run multiple SaaS instances against the same ledger simultaneously without external coordination. When both mongo and S3 are present (the common SaaS case), the mongo ledger is used and the lock works correctly.

### Selection logic

```ts
const backend =
  options.ledgerBackend ??
  (options.db ? 'mongo' : options.bucket ? 's3' : null);
if (!backend) throw new SmartMigrationError('Either db or bucket must be provided');
```

---

## Run algorithm

```
run():
  steps = registeredSteps                 // array, in registration order
  validateStepChain(steps)                // checks unique ids, no gaps in version chain
  acquireLock()
  try:
    ledger = readLedger()
    currentVersion = ledger.currentVersion
    if currentVersion === null:
      if isFreshInstall() and freshInstallVersion is set:
        currentVersion = freshInstallVersion
        writeLedger({ currentVersion })
      else:
        currentVersion = steps[0].from    // start from earliest
    if compareSemver(currentVersion, targetVersion) === 0:
      return { wasUpToDate: true, ...
      }
    plan = computePlan(steps, currentVersion, targetVersion)
    if plan.length === 0:
      throw "no migration path from X to Y"
    for step in plan:
      if dryRun:
        skipped.push(step); continue
      ctx = buildContext(step)
      try:
        await step.handler(ctx)
        ledger.steps[step.id] = { ...result }
        ledger.currentVersion = step.toVersion
        writeLedger(ledger)
        applied.push(step)
      catch err:
        ledger.steps[step.id] = { ...result, status: 'failed', error }
        writeLedger(ledger)
        throw err
  finally:
    releaseLock()
```

### `isFreshInstall()`

- Mongo: `db.mongoDb.listCollections({}, {nameOnly:true}).toArray()` → if every collection name starts with smartmigration's reserved prefix or matches the EasyStore class name, fresh.
- S3: open a `bucket.createCursor('')` and ask for one batch. If it is empty (after excluding the `.smartmigration/` prefix), fresh.

### `validateStepChain(steps)`

- Each `step.id` must be unique
- Each `step.from` and `step.to` must be valid semver
- For consecutive steps `a, b`: `a.to === b.from` (strict equality rather than semver comparison, which forces explicit chains)
- `compareSemver(step.from, step.to) < 0` for every step

### `computePlan(steps, current, target)`

- Find the step where `from === current`. Take it and all subsequent steps until one has `to === target`. Return that slice.
- If `current` doesn't match any step's `from`, throw with a clear message naming the registered version chain.
- If we walk past `target` without matching, throw.

---

## File layout

The layout follows the **flat-classes pattern** used by most push.rocks modules (smartproxy's nested layout is the exception, justified only by its size).
The new module's `ts/` will look like:

```
ts/
├── 00_commitinfo_data.ts        // auto-generated by commitinfo on release
├── index.ts                     // re-exports the public surface
├── plugins.ts                   // central import barrel
├── interfaces.ts                // I-prefixed public interfaces
├── types.ts                     // T-prefixed public type aliases
├── logger.ts                    // module-scoped Smartlog singleton
├── classes.smartmigration.ts    // SmartMigration class + SmartMigrationError
├── classes.migrationstep.ts     // MigrationStepBuilder + internal MigrationStep
├── classes.migrationcontext.ts  // buildContext() factory + checkpoint impl
├── classes.versionresolver.ts   // semver-based plan computation + validation
└── ledgers/
    ├── classes.ledger.ts        // abstract Ledger base
    ├── classes.mongoledger.ts   // EasyStore-backed implementation
    └── classes.s3ledger.ts      // bucket.fastPut/fastGet-backed implementation
```

### `ts/plugins.ts` content

```ts
// node native scope
import { randomUUID } from 'node:crypto';
export { randomUUID };

// pushrocks scope
import * as smartdata from '@push.rocks/smartdata';
import * as smartbucket from '@push.rocks/smartbucket';
import * as smartlog from '@push.rocks/smartlog';
import * as smartlogDestinationLocal from '@push.rocks/smartlog/destination-local';
import * as smartversion from '@push.rocks/smartversion';
import * as smarttime from '@push.rocks/smarttime';
import * as smartpromise from '@push.rocks/smartpromise';
export {
  smartdata,
  smartbucket,
  smartlog,
  smartlogDestinationLocal,
  smartversion,
  smarttime,
  smartpromise,
};

// third-party scope (driver re-exports for type access)
import type * as mongodb from 'mongodb';
import type * as awsSdk from '@aws-sdk/client-s3';
export type { mongodb, awsSdk };
```

`smartdata` and `smartbucket` are **peerDependencies** (not direct dependencies): users will already have one or both, and we don't want to duplicate them. Listed in `dependencies`: `@push.rocks/smartlog`, `@push.rocks/smartversion`, `@push.rocks/smarttime`, `@push.rocks/smartpromise`.
Listed in `peerDependencies` (optional): `@push.rocks/smartdata`, `@push.rocks/smartbucket`. Listed in `devDependencies` for tests: both peers + `@push.rocks/smartmongo` (in-memory mongo).

---

## Project scaffolding files

These mirror the smartproxy conventions, with smartproxy-specific Rust bits removed.

### `package.json`

```json
{
  "name": "@push.rocks/smartmigration",
  "version": "1.0.0",
  "private": false,
  "description": "Unified migration runner for MongoDB (smartdata) and S3 (smartbucket), designed to be invoked at SaaS app startup, with semver-based version tracking, sequential step execution, idempotent re-runs, and per-step resumable checkpoints.",
  "main": "dist_ts/index.js",
  "typings": "dist_ts/index.d.ts",
  "type": "module",
  "author": "Lossless GmbH",
  "license": "MIT",
  "scripts": {
    "test": "(tstest test/**/test*.ts --verbose --timeout 120 --logfile)",
    "build": "(tsbuild tsfolders --allowimplicitany)",
    "format": "(gitzone format)",
    "buildDocs": "tsdoc"
  },
  "devDependencies": {
    "@git.zone/tsbuild": "^4.4.0",
    "@git.zone/tsrun": "^2.0.2",
    "@git.zone/tstest": "^3.6.0",
    "@push.rocks/smartdata": "^7.1.6",
    "@push.rocks/smartbucket": "^4.5.1",
    "@push.rocks/smartmongo": "^7.0.0",
    "@types/node": "^25.5.0",
    "typescript": "^6.0.2"
  },
  "dependencies": {
    "@push.rocks/smartlog": "^3.2.1",
    "@push.rocks/smartpromise": "^4.2.3",
    "@push.rocks/smarttime": "^4.1.1",
    "@push.rocks/smartversion": "^3.0.5"
  },
  "peerDependencies": {
    "@push.rocks/smartdata": "^7.1.6",
    "@push.rocks/smartbucket": "^4.5.1"
  },
  "peerDependenciesMeta": {
    "@push.rocks/smartdata": { "optional": true },
    "@push.rocks/smartbucket": { "optional": true }
  },
  "files": [
    "ts/**/*",
    "dist/**/*",
    "dist_*/**/*",
    "dist_ts/**/*",
    ".smartconfig.json",
    "readme.md",
    "changelog.md"
  ],
  "keywords": [
    "migration",
    "schema migration",
    "data migration",
    "mongodb",
    "s3",
    "smartdata",
    "smartbucket",
    "saas",
    "startup migration",
    "semver",
    "idempotent",
    "ledger",
    "rolling deploy"
  ],
  "homepage": "https://code.foss.global/push.rocks/smartmigration#readme",
  "repository": {
    "type": "git",
    "url": "https://code.foss.global/push.rocks/smartmigration.git"
  },
  "bugs": {
    "url": "https://code.foss.global/push.rocks/smartmigration/issues"
  },
  "pnpm": {
    "overrides": {},
    "onlyBuiltDependencies": ["mongodb-memory-server"]
  }
}
```

### `tsconfig.json` (verbatim from smartproxy)

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "esModuleInterop": true,
    "verbatimModuleSyntax": true
  },
  "exclude": ["dist_*/**/*.d.ts"]
}
```

### `.smartconfig.json`

Same shape as smartproxy's, with name/scope/repo updated and the `@git.zone/tsrust` block omitted. Description and keywords from package.json above.

### `.gitignore`

Verbatim from smartproxy minus the `rust/target` line.

### `license`

MIT, copy from smartproxy.

### `changelog.md`

Single entry for `1.0.0 - - Initial release`.

### `readme.hints.md`

Empty stub.

---

## Test plan

Tests live in `test/` and follow the smartproxy/`tstest` conventions: every file ends with `export default tap.start();` and uses `expect`/`tap` from `@git.zone/tstest/tapbundle`. Use `@push.rocks/smartmongo` to spin up an in-memory mongo for tests; for S3 tests, mock against a small in-process fake (or skip and only test mongo paths in v1).

Files to create:

| File | What it covers |
|---|---|
| `test/test.basic.ts` | Constructor validates options; throws if neither db nor bucket given; default ledger backend selection |
| `test/test.builder.ts` | Step builder chains correctly, `.from().to().up()` registers a step, validation catches duplicate ids and gaps |
| `test/test.versionresolver.ts` | `computePlan` returns the right slice for various current/target combinations; throws on missing chain |
| `test/test.mongoledger.ts` | Read/write/lock against a real mongo via `smartmongo`; lock expiry; concurrent acquire retries |
| `test/test.run.mongo.ts` | End-to-end: define 3 steps, run from scratch, verify all applied; re-run is no-op; mid-step failure leaves ledger consistent |
| `test/test.run.checkpoint.ts` | Resumable step that crashes mid-way; second run resumes from checkpoint |
| `test/test.freshinstall.ts` | Empty db + `freshInstallVersion` set → jumps to that version, runs no steps |
| `test/test.dryrun.ts` | `dryRun: true` returns plan but does not write |

Critical assertions for the end-to-end mongo test (`test/test.run.mongo.ts`):

```ts
const db = await smartmongo.SmartMongo.createAndStart();
const smartmig = new SmartMigration({ targetVersion: '2.0.0', db: db.smartdataDb });

const log: string[] = [];
smartmig
  .step('a').from('1.0.0').to('1.1.0').up(async () => { log.push('a'); })
  .step('b').from('1.1.0').to('1.5.0').up(async () => { log.push('b'); })
  .step('c').from('1.5.0').to('2.0.0').up(async () => { log.push('c'); });

// first run applies the whole chain in registration order
const r1 = await smartmig.run();
expect(r1.stepsApplied).toHaveLength(3);
expect(log).toEqual(['a', 'b', 'c']);
expect(r1.currentVersionAfter).toEqual('2.0.0');

// second run must be a no-op
const r2 = await smartmig.run();
expect(r2.wasUpToDate).toBeTrue();
expect(r2.stepsApplied).toHaveLength(0);
expect(log).toEqual(['a', 'b', 'c']); // unchanged
```

End-to-end verification path (manual smoke after implementation): write a tiny `.nogit/debug/saas-startup.ts` script that constructs a `SmartdataDb` against
a local mongo, registers two no-op migrations, runs `smartmig.run()` twice, and prints the result both times. The second run must report `wasUpToDate: true`.

---

## Critical files to create

**Source (under `/mnt/data/lossless/push.rocks/smartmigration/`):**

- `package.json`
- `tsconfig.json`
- `.smartconfig.json`
- `.gitignore`
- `license`
- `changelog.md`
- `readme.md` (full structured README following the smartproxy section template: installation, what is it, quick start, core concepts, common use cases, API reference, troubleshooting, best practices, license)
- `readme.hints.md` (stub)
- `ts/00_commitinfo_data.ts`
- `ts/index.ts`
- `ts/plugins.ts`
- `ts/interfaces.ts`
- `ts/types.ts`
- `ts/logger.ts`
- `ts/classes.smartmigration.ts`
- `ts/classes.migrationstep.ts`
- `ts/classes.migrationcontext.ts`
- `ts/classes.versionresolver.ts`
- `ts/ledgers/classes.ledger.ts`
- `ts/ledgers/classes.mongoledger.ts`
- `ts/ledgers/classes.s3ledger.ts`
- `test/test.basic.ts`
- `test/test.builder.ts`
- `test/test.versionresolver.ts`
- `test/test.mongoledger.ts`
- `test/test.run.mongo.ts`
- `test/test.run.checkpoint.ts`
- `test/test.freshinstall.ts`
- `test/test.dryrun.ts`

**Reused from existing modules (no need to create; these are already in dependencies):**

- `EasyStore` from smartdata (`/mnt/data/lossless/push.rocks/smartdata/ts/classes.easystore.ts:9-121`), which backs the mongo ledger
- `SmartdataDb.mongoDb` (raw `Db`) and `SmartdataDb.mongoDbClient` (raw `MongoClient`) for `ctx.mongo` and transactional sessions
- `SmartdataDb.startSession()` (`/mnt/data/lossless/push.rocks/smartdata/ts/classes.db.ts`) for `ctx.startSession()`
- `Bucket.fastGet` / `Bucket.fastPut` for the S3 ledger backend
- `Bucket.createCursor` (resumable token-based pagination, `/mnt/data/lossless/push.rocks/smartbucket/ts/classes.listcursor.ts`), the canonical pattern for restartable S3 migrations, referenced in the readme example
- `SmartBucket.storageClient` for `ctx.s3`
- `Smartlog` from
  `@push.rocks/smartlog` for the module logger
- `compareVersions` from `@push.rocks/smartversion` for semver ordering
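
The `validateStepChain` and `computePlan` rules from the Run algorithm section, which rely on the semver ordering listed above, can be sketched as follows. This is a minimal illustration under stated assumptions: `PlanStep` and `parseSemver` are hypothetical names, and `compareSemver` is a hand-rolled stand-in that only understands plain `MAJOR.MINOR.PATCH` strings, where the real module would delegate to `@push.rocks/smartversion`.

```typescript
interface PlanStep {
  id: string;
  from: string;
  to: string;
}

// Hypothetical stand-in for the smartversion comparison; no prerelease/build support.
function parseSemver(version: string): [number, number, number] {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`invalid semver: "${version}"`);
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

function compareSemver(a: string, b: string): number {
  const pa = parseSemver(a);
  const pb = parseSemver(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] < pb[i] ? -1 : 1;
  }
  return 0;
}

function validateStepChain(steps: PlanStep[]): void {
  const seenIds = new Set<string>();
  let previous: PlanStep | undefined;
  for (const step of steps) {
    if (seenIds.has(step.id)) throw new Error(`duplicate step id "${step.id}"`);
    seenIds.add(step.id);
    // compareSemver also rejects strings that are not valid semver
    if (compareSemver(step.from, step.to) >= 0) {
      throw new Error(`step "${step.id}" must move forward (${step.from} -> ${step.to})`);
    }
    // strict string equality between neighbours forces an explicit chain
    if (previous && previous.to !== step.from) {
      throw new Error(`gap between "${previous.id}" (${previous.to}) and "${step.id}" (${step.from})`);
    }
    previous = step;
  }
}

function computePlan(steps: PlanStep[], current: string, target: string): PlanStep[] {
  const chain = steps.map((s) => `${s.from} -> ${s.to}`).join(', ');
  const start = steps.findIndex((s) => s.from === current);
  if (start === -1) {
    throw new Error(`no step starts at ${current}; registered chain: ${chain}`);
  }
  const end = steps.findIndex((s, index) => index >= start && s.to === target);
  if (end === -1) {
    throw new Error(`no step ends at ${target}; registered chain: ${chain}`);
  }
  return steps.slice(start, end + 1);
}

// Usage with the three-step chain from the end-to-end test.
const demoChain: PlanStep[] = [
  { id: 'a', from: '1.0.0', to: '1.1.0' },
  { id: 'b', from: '1.1.0', to: '1.5.0' },
  { id: 'c', from: '1.5.0', to: '2.0.0' },
];
validateStepChain(demoChain);
const demoPlan = computePlan(demoChain, '1.1.0', '2.0.0'); // steps b and c
```

Because the chain is already validated as gap-free, the plan is always a contiguous slice of the registered steps, so no graph search is needed.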