push.rocks/smartmigration

Fork 0

T

jkunz 8e40882c5a v1.4.1

2026-05-19 21:57:24 +00:00

test

feat(planner): add bridge target version strategy for auto-stamping ledger to app versions

2026-05-19 21:44:57 +00:00

v1.4.1

2026-05-19 21:57:24 +00:00

.gitignore

feat(smartmigration): add initial smartmigration package with MongoDB and S3 migration runner

2026-04-07 17:35:05 +00:00

.smartconfig.json

feat(smartmigration): add initial smartmigration package with MongoDB and S3 migration runner

2026-04-07 17:35:05 +00:00

changelog.md

v1.4.1

2026-05-19 21:57:24 +00:00

license

feat(smartmigration): add initial smartmigration package with MongoDB and S3 migration runner

2026-04-07 17:35:05 +00:00

package.json

v1.4.1

2026-05-19 21:57:24 +00:00

pnpm-lock.yaml

fix(build): tighten TypeScript compiler settings and refresh package metadata and dependency versions

2026-04-30 09:48:14 +00:00

readme.hints.md

feat(migration): add lock heartbeats, predictive dry-run planning, and stricter ledger option validation

2026-04-14 12:31:34 +00:00

readme.md

feat(planner): add bridge target version strategy for auto-stamping ledger to app versions

2026-05-19 21:44:57 +00:00

readme.plan.md

docs(plan): update smartmigration status

2026-05-19 06:43:06 +00:00

tsconfig.json

fix(build): tighten TypeScript compiler settings and refresh package metadata and dependency versions

2026-04-30 09:48:14 +00:00

readme.md

@push.rocks/smartmigration

A unified migration runner for MongoDB (via @push.rocks/smartdata) and S3 (via @push.rocks/smartbucket) — designed to be invoked at SaaS app startup.

Installation

pnpm add @push.rocks/smartmigration
# plus whichever resources you need
pnpm add @push.rocks/smartdata @push.rocks/smartbucket

@push.rocks/smartdata and @push.rocks/smartbucket are declared as optional peer dependencies — install whichever ones your migrations actually touch.

Issue Reporting and Security

Report bugs and security issues at community.foss.global.

What is smartmigration?

smartmigration is the missing piece for SaaS apps in the push.rocks ecosystem: a deterministic, idempotent way to evolve persistent data in lockstep with code releases. You define a chain of migration steps with from/to semver versions, hand the runner your SmartdataDb and/or Bucket, and call run() from your app's startup path. The runner figures out which steps need to execute, runs them sequentially, and stamps progress into a ledger so subsequent boots are fast no-ops.

Key features

Feature	Description
Builder-style API	Fluent step definition: `step('id').from('1.0.0').to('1.1.0').up(handler)`
Unified mongo + S3	A single semver represents your combined data version. One step can touch both.
Drivers exposed via context	`ctx.db`, `ctx.mongo`, `ctx.bucket`, `ctx.s3` — write any operation you can write directly
Idempotent	`run()` is a no-op when data is at `targetVersion`. Costs one ledger read on the happy path.
Resumable	Mark a step `.resumable()` and it gets `ctx.checkpoint.read/write/clear` for restartable bulk operations; successful runs clear the step checkpoint automatically
Lockable	Mongo-backed lock uses atomic updates plus TTL heartbeats to serialize concurrent SaaS instances — safe for rolling deploys
Fresh-install fast path	Configure `freshInstallVersion` to skip migrations on a brand-new database
Target bridging	Optionally bridge schema/data version to the app version without hand-written no-op steps
Dry-run	`dryRun: true` or `.plan()` returns the execution plan without writing anything
Structured errors	All failures throw `SmartMigrationError` with a stable `code` field for branching

Quick start

import { SmartdataDb } from '@push.rocks/smartdata';
import { SmartBucket } from '@push.rocks/smartbucket';
import { SmartMigration } from '@push.rocks/smartmigration';
import { commitinfo } from './00_commitinfo_data.js';

// 1. set up your resources as you normally would
const db = new SmartdataDb({
  mongoDbUrl: process.env.MONGODB_URL!,
  mongoDbName: 'myapp',
});
await db.init();

const sb = new SmartBucket({
  accessKey: process.env.S3_ACCESSKEY!,
  accessSecret: process.env.S3_SECRETKEY!,
  endpoint: process.env.S3_ENDPOINT!,
  region: 'us-east-1',
});
const bucket = await sb.getBucketByName('myapp-uploads');

// 2. construct the runner with your app's target version
const migration = new SmartMigration({
  targetVersion: commitinfo.version,
  db,
  bucket,
});

// 3. register migrations in execution order, with from/to validating the chain
migration
  .step('lowercase-emails')
    .from('1.0.0').to('1.1.0')
    .description('Lowercase all user emails')
    .up(async (ctx) => {
      await ctx.mongo!.collection('users').updateMany(
        {},
        [{ $set: { email: { $toLower: '$email' } } }],
      );
    })
  .step('reorganize-uploads')
    .from('1.1.0').to('2.0.0')
    .description('Move uploads/ to media/')
    .resumable()
    .up(async (ctx) => {
      const cursor = ctx.bucket!.createCursor('uploads/');
      const startToken = await ctx.checkpoint!.read<string>('cursorToken');
      if (startToken) cursor.setToken(startToken);
      while (cursor.hasMore()) {
        const { keys } = await cursor.next();
        for (const key of keys) {
          await ctx.bucket!.fastMove({
            sourcePath: key,
            destinationPath: 'media/' + key.slice('uploads/'.length),
            overwrite: true,
          });
        }
        await ctx.checkpoint!.write('cursorToken', cursor.getToken());
      }
    });

// 4. run on startup — fast no-op once the data is at targetVersion
const result = await migration.run();
console.log(`smartmigration: ${result.currentVersionBefore ?? 'fresh'} → ${result.currentVersionAfter} (${result.stepsApplied.length} steps in ${result.totalDurationMs}ms)`);

Core concepts

The data version

A single semver string represents the combined state of your mongo and S3 data. It is not the same as your app version (though you usually want them to match): the app version is what's running, the data version is what's on disk. The runner's job is to bring the data version up to the app version.

Steps and the chain

Each migration is a step with:

A unique id (string) — used as the ledger key
A from semver — the data version this step expects to start from
A to semver — the data version this step produces
An up handler — the actual migration logic
An optional description, resumable flag

Steps execute in registration order. The runner validates that the chain is contiguous: step[N].to === step[N+1].from. This catches gaps and overlaps at definition time, before any handler runs.

Resume modes

When run() reads the ledger and finds a current version, it computes a plan: the subset of steps needed to advance from currentVersion to targetVersion. Two resume modes are supported:

Exact resume — currentVersion === step.fromVersion for some step. The normal case, where the ledger sits exactly at a step's starting point (because the previous step's to was written to the ledger when it completed).
Skip-forward resume — currentVersion > step.fromVersion but currentVersion < step.toVersion. The orphan case: the ledger was stamped to an intermediate version that no registered step starts at. This typically happens when an app configures freshInstallVersion: targetVersion across several releases that didn't add any migrations — fresh installs get stamped to whatever commitinfo.version was at install time, not to the last step's to. When a migration is finally added, those installs have a ledger value that doesn't match any step's from.

In skip-forward mode, the planner picks the first step whose toVersion > currentVersion and runs it (and all subsequent steps) normally. The step's handler is being invoked against data that may already be partially in the target shape, so step handlers must be idempotent (use $set over $inc, check existence before insert, filter-based updateMany over cursor iteration where possible). A log line at INFO level announces when a step runs in skip-forward mode.

If no step's toVersion is greater than currentVersion (the ledger is past the end of the chain), the runner throws TARGET_NOT_REACHABLE.

Target version strategies

By default, smartmigration is strict: the registered chain must contain a real step whose toVersion exactly matches targetVersion. This is best when you maintain an explicit schema/data version.

If your app uses its package version as targetVersion, releases without schema changes can leave the migration chain behind the app version. In that case, enable bridge mode:

const migration = new SmartMigration({
  targetVersion: commitinfo.version,
  db,
  freshInstallVersion: commitinfo.version,
  targetVersionStrategy: 'bridge',
});

Bridge mode appends an internal no-op step from the last registered migration version to targetVersion when the chain ends before the target. It stamps the ledger to the app version without requiring hand-written no-op migrations for every release.

Important behavior:

Existing real migrations still run first.
If the ledger is already past the last registered migration but below targetVersion, only the internal bridge runs.
Future real migrations still work because skip-forward resume runs the first real step whose toVersion is above the ledger version.
Downgrades and steps that overshoot the target still fail.

The ledger

The ledger is the source of truth for "what data version are we at, what steps have been applied, who holds the lock right now." It is persisted in one of two backends:

mongo (default when db is provided) — stored as a single document in smartdata's SmartdataEasyStore collection. Lock acquisition / renewal uses atomic mongo updates and works safely across multiple SaaS instances. Recommended.
s3 (default when only bucket is provided) — a single JSON object at <bucket>/.smartmigration/<ledgerName>.json. Lock is best-effort because S3 has no atomic CAS without additional infrastructure; do not use for multi-instance deployments without external coordination.

If you pass both db and bucket, mongo is used.

The migration context

Each up handler receives a IMigrationContext with:

interface IMigrationContext {
  db?: SmartdataDb;            // high-level smartdata
  bucket?: Bucket;             // high-level smartbucket Bucket
  mongo?: mongodb.Db;          // raw mongo Db (db.mongoDb)
  s3?: S3Client;               // raw AWS SDK v3 S3Client
  step: { id, fromVersion, toVersion, description?, isResumable };
  log: Smartlog;
  isDryRun: boolean;
  checkpoint?: IMigrationCheckpoint; // only when step is .resumable()
  startSession(): mongodb.ClientSession; // throws if no db
}

Use whichever level fits your migration:

High-level (ctx.db, ctx.bucket) for working with smartdata models / smartbucket files
Raw drivers (ctx.mongo, ctx.s3) for any operation smartdata/smartbucket don't wrap (renameCollection, dropIndex, S3 lifecycle policies, etc.)
startSession() for transactional mongo migrations

Common use cases

Mongo schema rename

migration
  .step('rename-user-email-field')
  .from('1.0.0').to('1.1.0')
  .up(async (ctx) => {
    await ctx.mongo!.collection('users').updateMany(
      { emailAddress: { $exists: true } },
      { $rename: { emailAddress: 'email' } },
    );
  });

Backfilling a derived field with a transaction

migration
  .step('backfill-account-tier')
  .from('1.5.0').to('2.0.0')
  .up(async (ctx) => {
    const session = ctx.startSession();
    try {
      await session.withTransaction(async () => {
        const users = ctx.mongo!.collection('users');
        await users.updateMany(
          { tier: { $exists: false }, plan: 'free' },
          { $set: { tier: 'free' } },
          { session },
        );
        await users.updateMany(
          { tier: { $exists: false }, plan: { $in: ['pro', 'enterprise'] } },
          { $set: { tier: 'paid' } },
          { session },
        );
      });
    } finally {
      await session.endSession();
    }
  });

Resumable S3 reprefix

migration
  .step('move-attachments')
  .from('2.0.0').to('2.1.0')
  .resumable()
  .up(async (ctx) => {
    const cursor = ctx.bucket!.createCursor('attachments/legacy/', { pageSize: 200 });
    const start = await ctx.checkpoint!.read<string>('cursor');
    if (start) cursor.setToken(start);
    while (cursor.hasMore()) {
      const { keys } = await cursor.next();
      for (const key of keys) {
        await ctx.bucket!.fastMove({
          sourcePath: key,
          destinationPath: 'attachments/' + key.slice('attachments/legacy/'.length),
          overwrite: true,
        });
      }
      await ctx.checkpoint!.write('cursor', cursor.getToken());
    }
  });

If the process crashes mid-migration, the next call to run() will resume from the last persisted cursor token. If the step completes successfully, smartmigration clears that step's checkpoint bag automatically.

Mongo + S3 in lockstep

migration
  .step('split-avatars-out-of-users')
  .from('2.1.0').to('2.5.0')
  .up(async (ctx) => {
    // 1. for every user with an inline avatarBytes field, write the bytes to S3
    const users = ctx.mongo!.collection('users');
    const cursor = users.find({ avatarBytes: { $exists: true } });
    for await (const user of cursor) {
      const buf = Buffer.from(user.avatarBytes.buffer);
      await ctx.bucket!.fastPut({
        path: `avatars/${user._id}.bin`,
        contents: buf,
        overwrite: true,
      });
      await users.updateOne(
        { _id: user._id },
        { $set: { avatarKey: `avatars/${user._id}.bin` }, $unset: { avatarBytes: '' } },
      );
    }
  });

Fresh-install fast path

const migration = new SmartMigration({
  targetVersion: '5.0.0',
  db,
  freshInstallVersion: '5.0.0', // brand-new DB jumps straight to 5.0.0
});
// ... register your full chain of steps ...
await migration.run(); // for a fresh DB, runs zero steps and stamps version 5.0.0

Dry run / planning

const planned = await migration.plan();
console.log(`would apply ${planned.stepsSkipped.length} steps:`,
  planned.stepsSkipped.map((s) => `${s.id}(${s.fromVersion}→${s.toVersion})`).join(' → '));
console.log(`predicted version after run: ${planned.currentVersionAfter}`);

// or — same thing, via dryRun option
const m = new SmartMigration({ targetVersion: '2.0.0', db, dryRun: true });
// ...
const result = await m.run(); // returns plan, doesn't write

plan() / dryRun resolve the same fresh-install shortcut that run() would use, so the returned currentVersionAfter is the predicted post-run version. For S3-backed ledgers, planning does not create the .smartmigration/<ledger>.json sidecar.

API reference

`new SmartMigration(options: ISmartMigrationOptions)`

Option	Type	Default	Description
`targetVersion`	`string`	— (required)	Semver representing the data version this code expects
`db`	`SmartdataDb`	undefined	Required if any step uses `ctx.db`/`ctx.mongo` or for the mongo ledger
`bucket`	`Bucket`	undefined	Required if any step uses `ctx.bucket`/`ctx.s3` or for the S3 ledger
`ledgerName`	`string`	`"smartmigration"`	Logical name; lets multiple migrations coexist on the same db/bucket
`ledgerBackend`	`'mongo'` \| `'s3'`	mongo if db, else s3	Where to persist the ledger
`freshInstallVersion`	`string`	undefined	When the resource is empty, jump straight to this version
`targetVersionStrategy`	`'strict'` \| `'bridge'`	`'strict'`	Require the chain to end exactly at targetVersion, or auto-bridge to it
`lockWaitMs`	`number`	`60_000`	How long to wait for a stale lock from another instance
`lockTtlMs`	`number`	`600_000`	How long this instance's own lock auto-expires after
`dryRun`	`boolean`	`false`	If true, `run()` returns the plan without executing or locking
`logger`	`Smartlog`	module logger	Custom logger; defaults to a Smartlog with a local destination

The constructor throws SmartMigrationError with one of these codes on bad input:

INVALID_OPTIONS — options object missing
MISSING_TARGET_VERSION — targetVersion not set
INVALID_VERSION — targetVersion is not a valid semver
NO_RESOURCES — neither db nor bucket provided
LEDGER_BACKEND_MISMATCH — explicit ledgerBackend doesn't match the resources you provided
INVALID_LEDGER_NAME — ledgerName is blank or not a string
INVALID_LOCK_WAIT_MS — lockWaitMs is not an integer >= 0
INVALID_LOCK_TTL_MS — lockTtlMs is not an integer >= 1
INVALID_TARGET_VERSION_STRATEGY — targetVersionStrategy is not strict or bridge

`migration.step(id: string).from(v).to(v).[description(t)].[resumable()].up(handler)`

The step builder. .from(), .to(), and .up() are required; .description() and .resumable() are optional. .up() is the terminal call: it commits the step to the parent runner and returns the runner so you can chain .step('next').

`migration.run(): Promise<IMigrationRunResult>`

The startup entry point. Acquires a lock, computes the plan, executes pending steps in order, and releases the lock. Returns:

interface IMigrationRunResult {
  currentVersionBefore: string | null;
  currentVersionAfter: string;
  targetVersion: string;
  wasUpToDate: boolean;        // true if no steps ran
  wasFreshInstall: boolean;    // true if startup took the fresh-install shortcut
  stepsApplied: IMigrationStepResult[];
  stepsSkipped: IMigrationStepResult[];
  totalDurationMs: number;
}

Throws SmartMigrationError with these codes:

LOCK_TIMEOUT — could not acquire lock within lockWaitMs
LOCK_LOST — the runner lost its lock while a migration was still in flight
STEP_FAILED — a step's handler threw; the failure is persisted to the ledger
CHAIN_*, DUPLICATE_STEP_ID, NON_INCREASING_STEP, TARGET_NOT_REACHABLE, DOWNGRADE_NOT_SUPPORTED — chain validation / planning errors

`migration.plan(): Promise<IMigrationRunResult>`

Same as run() but does not acquire the lock or execute anything. Useful for --dry-run style probes in CI. stepsSkipped are the steps that would run, and currentVersionAfter is the predicted post-run version after fresh-install resolution.

`migration.getCurrentVersion(): Promise<string | null>`

Returns the current data version from the ledger, or null if the ledger has never been initialized.

Troubleshooting

`LOCK_TIMEOUT` on every startup

Another instance crashed while holding the lock. Wait for lockTtlMs (default 10 minutes) for the lock to expire, or manually clear the lock field on the ledger document.

`LOCK_LOST` during a migration

The runner stopped renewing its lock or another instance replaced the lock document while a step was still running. Check for long-running steps, make sure lockTtlMs is comfortably larger than the expected renewal cadence, and investigate other processes touching the same ledger.

`CHAIN_GAP` at startup

Two adjacent steps have mismatched versions: step[N].to !== step[N+1].from. Steps must form a contiguous chain in registration order. Fix the version on the offending step.

`CHAIN_NOT_AT_CURRENT` (legacy)

Retained in the error vocabulary for backward compatibility with downstream consumers that previously branched on it, but no longer thrown by computePlan in normal operation. Prior versions of smartmigration required an exact fromVersion === currentVersion match when resolving the plan; the current planner supports skip-forward resume and handles intermediate-version ledger stamps transparently.

`TARGET_NOT_REACHABLE`

Either (a) a step in the plan upgrades to a version past targetVersion without any step ending exactly at targetVersion, or (b) the ledger's currentVersion is past the end of the registered chain but has not reached targetVersion. Case (a) means the chain has a mid-step that overshoots — add the missing final step or adjust targetVersion. Case (b) means the chain needs a new step extending it toward targetVersion.

S3-only deployments and concurrent instances

The S3 ledger's lock is best-effort. If you run multiple SaaS instances against the same S3 bucket, use external coordination (e.g. Redis lock, leader election) before calling run(). The mongo backend has no such limitation.

Best practices

Prefer idempotent migrations. Even with the lock, machines crash, networks partition, and migrations should be safe to re-run. Use $set over $inc, check existence before insert, etc.
Use .resumable() for any step that processes more than a few hundred records. Crashes happen; resumability avoids redoing work.
Use transactions for multi-collection mongo migrations via ctx.startSession() + session.withTransaction().
Don't mix S3 and Mongo writes inside a single conceptual operation — S3 has no transactions, so design migrations so each S3 write is observably idempotent on its own.
Keep the targetVersion linked to your app's package.json version via the autocreated 00_commitinfo_data.ts. That way every release automatically signals the data version it expects.
Freeze applied migrations in source control — don't edit a step that has been applied to production data, even if the new version is "more correct." Add a new step instead.

License and Legal Information

This repository contains open-source code that is licensed under the MIT License. A copy of the MIT License can be found in the license file within this repository.

Please note: The MIT License does not grant permission to use the trade names, trademarks, service marks, or product names of the project, except as required for reasonable and customary use in describing the origin of the work and reproducing the content of the NOTICE file.

Trademarks

This project is owned and maintained by Task Venture Capital GmbH. The names and logos associated with Task Venture Capital GmbH and any related products or services are trademarks of Task Venture Capital GmbH and are not included within the scope of the MIT license granted herein. Use of these trademarks must comply with Task Venture Capital GmbH's Trademark Guidelines, and any usage must be approved in writing by Task Venture Capital GmbH.

readme.md

@push.rocks/smartmigration

Installation

Issue Reporting and Security

What is smartmigration?

Key features

Quick start

Core concepts

The data version

Steps and the chain

Resume modes

Target version strategies

The ledger

The migration context

Common use cases

Mongo schema rename

Backfilling a derived field with a transaction

Resumable S3 reprefix

Mongo + S3 in lockstep

Fresh-install fast path

Dry run / planning

API reference

new SmartMigration(options: ISmartMigrationOptions)

migration.step(id: string).from(v).to(v).[description(t)].[resumable()].up(handler)

migration.run(): Promise<IMigrationRunResult>

migration.plan(): Promise<IMigrationRunResult>

migration.getCurrentVersion(): Promise<string | null>

Troubleshooting

LOCK_TIMEOUT on every startup

LOCK_LOST during a migration

CHAIN_GAP at startup

CHAIN_NOT_AT_CURRENT (legacy)

TARGET_NOT_REACHABLE

S3-only deployments and concurrent instances

Best practices

License and Legal Information

Trademarks

Company Information

`new SmartMigration(options: ISmartMigrationOptions)`

`migration.step(id: string).from(v).to(v).[description(t)].[resumable()].up(handler)`

`migration.run(): Promise<IMigrationRunResult>`

`migration.plan(): Promise<IMigrationRunResult>`

`migration.getCurrentVersion(): Promise<string | null>`

`LOCK_TIMEOUT` on every startup

`LOCK_LOST` during a migration

`CHAIN_GAP` at startup

`CHAIN_NOT_AT_CURRENT` (legacy)

`TARGET_NOT_REACHABLE`