# Improvement Plan

## Goal

Make this repository a solid product wrapper for `@push.rocks/smartstorage`: reliable Docker image, predictable runtime behavior, durable admin configuration, and a management surface that scales past toy datasets.

## Current Position

The repository already has a clear structure and decent coverage for auth, buckets, objects, policies, credentials, status/config, lifecycle, and S3 compatibility. The main gaps are product hardening and dependency boundaries:

- the Docker build still references `npmextra.json` even though the repo migrated to `.smartconfig.json`
- some admin/runtime behavior is still in-memory only
- status and bucket summaries are computed by rescanning storage through S3 calls
- the product surface does not yet expose real cluster and drive health
- the UI error model is still mostly `console.error()` + stale state retention

## Repo-Owned Work

### 1. Fix Packaging Regressions First

- Update `Dockerfile` to stop copying `npmextra.json` and use the current build config layout.
- Add a Docker smoke test that proves the image builds and the container starts with both ports exposed.

Reason:

- This repo is explicitly about shipping a usable container. A broken or stale Docker build path is a product failure, not a nice-to-have cleanup.

### 2. Make Admin Configuration Durable

- Decide which runtime-managed settings must survive restart.
- Credentials are the biggest gap today: add/remove currently mutates runtime config only.
- Align credential durability with the existing named-policy persistence model.
- Add restart-persistence tests.

Reason:

- A product container should not pretend to manage credentials if those changes disappear on restart.

### 3. Tighten Management API Behavior

- Fix bucket/policy cleanup ordering so deleting a bucket does not trigger follow-up work against a missing bucket.
- Return cleaner typed errors from handlers where failure is expected.
- Surface actionable errors in the UI instead of only logging and keeping stale state.

Reason:

- The current tests pass, but post-test output already shows avoidable `NoSuchBucket` noise during policy cleanup.

### 4. Reduce Large-Object and Large-Dataset Pain

- Keep inline editing and preview for small objects.
- Add explicit size thresholds for inline/base64 flows.
- Prefer direct object URLs or streaming paths for large-object download/preview paths.
- Stop treating expensive full-dataset scans as normal refresh behavior.

Reason:

- The product should stay usable once users have real buckets and non-trivial object counts.

### 5. Add Product Acceptance Coverage

- Docker build/start smoke test.
- Restart-persistence test for credentials and policies.
- Cluster-config smoke test.
- One browser-level smoke test for login, bucket browse, and config view.

Reason:

- The repo is productization code. Acceptance coverage matters more here than unit-test density.

## Dependency-Owned Work Completed

The dependency boundary work landed in `@push.rocks/smartstorage` `v6.4.0` and is now consumed here through the Deno import map.

- Runtime storage stats and bucket summaries come from `smartstorage` instead of product-side S3 scans.
- Runtime credential listing/replacement uses supported `smartstorage` APIs instead of mutating engine internals.
- Cluster and drive health are exposed through `smartstorage` and surfaced through the management API/UI.

## Suggested Execution Order

1. Fix Dockerfile and add a container smoke test.
2. Make credential changes durable and test restart behavior.
3. Clean up bucket/policy deletion behavior and UI error reporting.
4. Keep product code on supported `smartstorage` runtime APIs for stats, credentials, and cluster health.
5. Add browser-level smoke coverage for login, bucket browse, and config views.

## Success Criteria

- Docker image builds from a clean checkout.
- Runtime-managed credentials survive restart or are explicitly documented as ephemeral until fixed.
- Status and bucket views no longer require full object scans for routine refreshes.
- Bucket deletion and policy cleanup complete without noisy missing-bucket follow-up errors.
- Cluster mode exposes live health, not just configured values.

## Enterprise Readiness Plan

### Acceptance Criteria Implemented In-Repo

- Runtime health: expose unauthenticated `/livez`, `/readyz`, `/healthz`, and `/metrics` on the management server.
- Container health: Docker `HEALTHCHECK` must use `/readyz` and the image must run as a non-root user.
- Admin auditability: login, logout, bucket mutation, object mutation, credential mutation, and named-policy mutation must emit append-only audit events under `.objectstorage/audit.log`.
- Audit access: admins can query recent audit entries through a typed management API.
- Session security: JWT role is validated from verified claims, logout revokes the active token, and repeated failed logins are rate-limited.
- Secret persistence: admin metadata files are written with restrictive file mode where the OS supports it.
- Default-secret guardrail: persistent `/data` deployments refuse `admin/admin` defaults unless explicitly allowed for disposable development.
- Frontend session storage: admin identity is no longer stored in persistent browser state.

### Dependency Work Implemented In `@push.rocks/smartstorage`

- Cluster identity and topology snapshots persist under `.smartstorage/cluster/`.
- Cluster startup refuses unsafe seed-node fallback instead of silently forming a split-brain cluster.
- Heartbeats probe all known peers, including suspect/offline peers, using configured timeout.
- Operational S3-side endpoints expose `/-/live`, `/-/ready`, `/-/health`, and `/-/metrics`.
- Runtime credential listing returns metadata only; secrets are write-only.
- Multi-node tests cover topology convergence, remote drive routing, and restart/rejoin identity durability.

### Production Requirements Outside This Repo

- Run the management UI behind TLS or add native typedserver TLS configuration for the deployment.
- Configure real admin and S3 credentials through secrets management, not image defaults.
- Decide cluster transport policy: pinned CA or mTLS with operational certificate rotation. Do not operate QUIC insecure-dev transport on untrusted networks.
- Define backup/restore procedures for `/data`, `.objectstorage`, `.smartstorage/cluster`, bucket manifests, and policies.
- Add load, soak, and failure-injection runs to release qualification for large datasets and network partitions.
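
### Illustrative Sketch: Failed-Login Rate Limiting

The session-security criterion above says repeated failed logins are rate-limited but does not describe the mechanism. As a rough illustration only, one common approach is a sliding-window counter keyed by username or client address. The class name `FailedLoginLimiter`, its options, and its methods are hypothetical and are not taken from this repository or from `smartstorage`; the real implementation may differ.

```typescript
// Hypothetical sketch: none of these names come from the repository.
interface LimiterOptions {
  maxFailures: number; // failures tolerated inside the window before blocking
  windowMs: number; // sliding window length in milliseconds
}

class FailedLoginLimiter {
  private readonly options: LimiterOptions;
  // Maps a key (username, client address, ...) to failure timestamps in ms.
  private readonly failures = new Map<string, number[]>();

  constructor(options: LimiterOptions) {
    this.options = options;
  }

  /** True when the key has used up its failure budget for the current window. */
  isBlocked(key: string, now: number = Date.now()): boolean {
    return this.recent(key, now).length >= this.options.maxFailures;
  }

  /** Record one failed login attempt for the key. */
  recordFailure(key: string, now: number = Date.now()): void {
    const recent = this.recent(key, now);
    recent.push(now);
    this.failures.set(key, recent);
  }

  /** A successful login clears the failure history for the key. */
  recordSuccess(key: string): void {
    this.failures.delete(key);
  }

  // Drop timestamps that fell out of the window and return what remains.
  private recent(key: string, now: number): number[] {
    const cutoff = now - this.options.windowMs;
    const kept = (this.failures.get(key) ?? []).filter((t) => t > cutoff);
    if (kept.length === 0) {
      this.failures.delete(key);
    } else {
      this.failures.set(key, kept);
    }
    return kept;
  }
}
```

A login handler would call `isBlocked` before checking the password, `recordFailure` on a bad password, and `recordSuccess` on a good one. State here is in-memory only, so a restart resets the counters; whether that is acceptable is a deployment decision.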