Improvement Plan
Goal
Make this repository a solid product wrapper for @push.rocks/smartstorage: reliable Docker image, predictable runtime behavior, durable admin configuration, and a management surface that scales past toy datasets.
Current Position
The repository already has a clear structure and decent coverage for auth, buckets, objects, policies, credentials, status/config, lifecycle, and S3 compatibility.
The main gaps are product hardening and dependency boundaries:
- the Docker build still references `npmextra.json` even though the repo migrated to `.smartconfig.json`
- some admin/runtime behavior is still in-memory only
- status and bucket summaries are computed by rescanning storage through S3 calls
- the product surface does not yet expose real cluster and drive health
- the UI error model is still mostly `console.error()` + stale state retention
Repo-Owned Work
1. Fix Packaging Regressions First
- Update `Dockerfile` to stop copying `npmextra.json` and use the current build config layout.
- Add a Docker smoke test that proves the image builds and the container starts with both ports exposed.
Reason:
- This repo is explicitly about shipping a usable container. A broken or stale Docker build path is a product failure, not a nice-to-have cleanup.
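One piece of the smoke test could be a check that the built image declares both ports. The sketch below parses the JSON printed by `docker inspect <image>` and reports any required port that is missing. The function name and the port numbers `9000` and `3000` are placeholders for illustration, not values taken from this repository, and a real smoke test would also need to start the container and probe it.

```typescript
// Only the fields this check reads from `docker inspect <image>` output.
interface InspectEntry {
  Config?: { ExposedPorts?: Record<string, unknown> };
}

// Returns the required ports that the image does NOT declare as exposed.
// Ports use Docker's "<port>/<protocol>" key format, e.g. "9000/tcp".
function missingExposedPorts(inspectJson: string, requiredPorts: string[]): string[] {
  const entries = JSON.parse(inspectJson) as InspectEntry[];
  const exposed = Object.keys(entries[0]?.Config?.ExposedPorts ?? {});
  return requiredPorts.filter((port) => !exposed.includes(port));
}

// Hypothetical inspect output for an image that exposes two ports.
const sampleInspect = JSON.stringify([
  { Config: { ExposedPorts: { "9000/tcp": {}, "3000/tcp": {} } } },
]);

// Placeholder port numbers; substitute the S3 and management ports the image really uses.
const missingPorts = missingExposedPorts(sampleInspect, ["9000/tcp", "3000/tcp"]);
console.log(missingPorts.length === 0 ? "both ports exposed" : `missing: ${missingPorts.join(", ")}`);
```

Keeping the check as a pure function over the inspect output means it can be unit-tested without a Docker daemon, while the build and start steps remain in the container-level test.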
2. Make Admin Configuration Durable
- Decide which runtime-managed settings must survive restart.
- Credentials are the biggest gap today: add/remove currently mutates runtime config only.
- Align credential durability with the existing named-policy persistence model.
- Add restart-persistence tests.
Reason:
- A product container should not pretend to manage credentials if those changes disappear on restart.
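A minimal sketch of write-through credential persistence, assuming a small storage boundary that the repo does not necessarily have today. `TextStore`, `MemoryTextStore`, and `CredentialManager` are hypothetical names; the in-memory store only stands in for a file on the data volume so that the restart can be simulated.

```typescript
interface Credential {
  accessKeyId: string;
  secretAccessKey: string;
}

// Hypothetical persistence boundary; a real implementation would write a file.
interface TextStore {
  read(): string | null;
  write(contents: string): void;
}

// In-memory stand-in that outlives a manager instance, used to simulate a restart.
class MemoryTextStore implements TextStore {
  private contents: string | null = null;
  read(): string | null {
    return this.contents;
  }
  write(contents: string): void {
    this.contents = contents;
  }
}

class CredentialManager {
  private store: TextStore;
  private credentials: Credential[];

  constructor(store: TextStore) {
    this.store = store;
    // Restore whatever the previous process persisted.
    const raw = store.read();
    this.credentials = raw === null ? [] : (JSON.parse(raw) as Credential[]);
  }

  listAccessKeyIds(): string[] {
    return this.credentials.map((c) => c.accessKeyId);
  }

  // Every mutation is persisted before returning, so it survives a restart.
  add(credential: Credential): void {
    const others = this.credentials.filter((c) => c.accessKeyId !== credential.accessKeyId);
    this.credentials = [...others, credential];
    this.persist();
  }

  remove(accessKeyId: string): void {
    this.credentials = this.credentials.filter((c) => c.accessKeyId !== accessKeyId);
    this.persist();
  }

  private persist(): void {
    this.store.write(JSON.stringify(this.credentials));
  }
}

// Simulated restart: a second manager over the same store sees the earlier changes.
const sharedStore = new MemoryTextStore();
const beforeRestart = new CredentialManager(sharedStore);
beforeRestart.add({ accessKeyId: "app-key", secretAccessKey: "app-secret" });
beforeRestart.add({ accessKeyId: "tmp-key", secretAccessKey: "tmp-secret" });
beforeRestart.remove("tmp-key");
const afterRestart = new CredentialManager(sharedStore);
const restoredKeys = afterRestart.listAccessKeyIds();
```

The same shape doubles as the restart-persistence test: mutate through one instance, construct a second over the same store, and compare the listings.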
3. Tighten Management API Behavior
- Fix bucket/policy cleanup ordering so deleting a bucket does not trigger follow-up work against a missing bucket.
- Return cleaner typed errors from handlers where failure is expected.
- Surface actionable errors in the UI instead of only logging and keeping stale state.
Reason:
- The current tests pass, but post-test output already shows avoidable `NoSuchBucket` noise during policy cleanup.
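The ordering problem can be illustrated with an in-memory stand-in. This is a sketch under assumptions: `FakeBackend` and its methods are hypothetical and do not mirror the real handler or `smartstorage` API. It only shows that cleanup which runs while the bucket still exists produces no missing-bucket errors, while the reversed order does.

```typescript
// Hypothetical in-memory backend that records the errors it would log.
class FakeBackend {
  buckets = new Set<string>();
  attachments = new Map<string, string[]>(); // bucket name -> attached policy names
  errors: string[] = [];

  detachPolicies(bucket: string): void {
    if (!this.buckets.has(bucket)) {
      this.errors.push(`NoSuchBucket: ${bucket}`);
      return;
    }
    this.attachments.delete(bucket);
  }

  deleteBucket(bucket: string): void {
    if (!this.buckets.has(bucket)) {
      this.errors.push(`NoSuchBucket: ${bucket}`);
      return;
    }
    this.buckets.delete(bucket);
  }
}

// Policy cleanup first, while the bucket still exists; the bucket is removed last.
function deleteBucketWithCleanup(backend: FakeBackend, bucket: string): void {
  backend.detachPolicies(bucket);
  backend.deleteBucket(bucket);
}

// The problematic order: cleanup runs against a bucket that is already gone.
function deleteBucketThenCleanup(backend: FakeBackend, bucket: string): void {
  backend.deleteBucket(bucket);
  backend.detachPolicies(bucket);
}

function makeBackend(): FakeBackend {
  const backend = new FakeBackend();
  backend.buckets.add("logs");
  backend.attachments.set("logs", ["read-only"]);
  return backend;
}

const ordered = makeBackend();
deleteBucketWithCleanup(ordered, "logs");

const reversed = makeBackend();
deleteBucketThenCleanup(reversed, "logs");

console.log(ordered.errors.length, reversed.errors.length);
```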
4. Reduce Large-Object and Large-Dataset Pain
- Keep inline editing and preview for small objects.
- Add explicit size thresholds for inline/base64 flows.
- Prefer direct object URLs or streaming paths for large-object download/preview paths.
- Stop treating expensive full-dataset scans as normal refresh behavior.
Reason:
- The product should stay usable once users have real buckets and non-trivial object counts.
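An explicit threshold might look like the sketch below. The 1 MiB limit and the names `INLINE_MAX_BYTES`, `chooseDeliveryMode`, and `base64Length` are assumptions for illustration, not values from this repository. The base64 helper is included because padded base64 encodes every 3 bytes as 4 characters, so inline payloads grow by roughly a third before they reach the browser.

```typescript
type DeliveryMode = "inline" | "stream";

// Hypothetical limit; the real threshold should be chosen from measurements.
const INLINE_MAX_BYTES = 1024 * 1024;

// Small objects keep the inline/base64 path; everything else is streamed
// or served through a direct object URL.
function chooseDeliveryMode(sizeBytes: number, inlineMaxBytes: number = INLINE_MAX_BYTES): DeliveryMode {
  return sizeBytes <= inlineMaxBytes ? "inline" : "stream";
}

// Length of the padded base64 encoding of `sizeBytes` bytes.
function base64Length(sizeBytes: number): number {
  return Math.ceil(sizeBytes / 3) * 4;
}

const smallMode = chooseDeliveryMode(200 * 1024);
const largeMode = chooseDeliveryMode(50 * 1024 * 1024);
console.log(smallMode, largeMode, base64Length(INLINE_MAX_BYTES));
```

Deciding from the object's size metadata, before any bytes are fetched, lets the UI pick the streaming path without first loading a large object into memory.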
5. Add Product Acceptance Coverage
- Docker build/start smoke test.
- Restart-persistence test for credentials and policies.
- Cluster-config smoke test.
- One browser-level smoke test for login, bucket browse, and config view.
Reason:
- The repo is productization code. Acceptance coverage matters more here than unit-test density.
Dependency-Owned Work Completed
The dependency boundary work landed in @push.rocks/smartstorage v6.4.0 and is now consumed here through the Deno import map.
- Runtime storage stats and bucket summaries come from `smartstorage` instead of product-side S3 scans.
- Runtime credential listing/replacement uses supported `smartstorage` APIs instead of mutating engine internals.
- Cluster and drive health are exposed through `smartstorage` and surfaced through the management API/UI.
Suggested Execution Order
- Fix Dockerfile and add a container smoke test.
- Make credential changes durable and test restart behavior.
- Clean up bucket/policy deletion behavior and UI error reporting.
- Keep product code on supported `smartstorage` runtime APIs for stats, credentials, and cluster health.
- Add browser-level smoke coverage for login, bucket browse, and config views.
Success Criteria
- Docker image builds from a clean checkout.
- Runtime-managed credentials survive restart or are explicitly documented as ephemeral until fixed.
- Status and bucket views no longer require full object scans for routine refreshes.
- Bucket deletion and policy cleanup complete without noisy missing-bucket follow-up errors.
- Cluster mode exposes live health, not just configured values.
Enterprise Readiness Plan
Acceptance Criteria Implemented In-Repo
- Runtime health: expose unauthenticated `/livez`, `/readyz`, `/healthz`, and `/metrics` on the management server.
- Container health: Docker `HEALTHCHECK` must use `/readyz` and the image must run as a non-root user.
- Admin auditability: login, logout, bucket mutation, object mutation, credential mutation, and named-policy mutation must emit append-only audit events under `.objectstorage/audit.log`.
- Audit access: admins can query recent audit entries through a typed management API.
- Session security: JWT role is validated from verified claims, logout revokes the active token, and repeated failed logins are rate-limited.
- Secret persistence: admin metadata files are written with restrictive file mode where the OS supports it.
- Default-secret guardrail: persistent `/data` deployments refuse `admin/admin` defaults unless explicitly allowed for disposable development.
- Frontend session storage: admin identity is no longer stored in persistent browser state.
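The failed-login rate limit from the session-security item can be sketched as a sliding window over recent failures. This illustrates the technique only and is not the repository's implementation: `LoginRateLimiter`, the limit of 3 failures, and the 60-second window are assumed values. The clock is passed in explicitly so the behavior can be tested without waiting.

```typescript
class LoginRateLimiter {
  private failures = new Map<string, number[]>(); // key -> failure timestamps in ms
  private maxFailures: number;
  private windowMs: number;

  constructor(maxFailures: number, windowMs: number) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
  }

  // True while the key (for example a username or client address) may still attempt a login.
  isAllowed(key: string, nowMs: number): boolean {
    return this.recentFailures(key, nowMs).length < this.maxFailures;
  }

  recordFailure(key: string, nowMs: number): void {
    const recent = this.recentFailures(key, nowMs);
    recent.push(nowMs);
    this.failures.set(key, recent);
  }

  // A successful login clears the failure history for that key.
  recordSuccess(key: string): void {
    this.failures.delete(key);
  }

  // Failures older than the window no longer count.
  private recentFailures(key: string, nowMs: number): number[] {
    return (this.failures.get(key) ?? []).filter((t) => nowMs - t < this.windowMs);
  }
}

// Assumed policy: at most 3 failures per 60 seconds.
const limiter = new LoginRateLimiter(3, 60_000);
limiter.recordFailure("admin", 0);
limiter.recordFailure("admin", 1_000);
limiter.recordFailure("admin", 2_000);
const allowedDuringWindow = limiter.isAllowed("admin", 3_000);
const allowedAfterWindow = limiter.isAllowed("admin", 70_000);
console.log(allowedDuringWindow, allowedAfterWindow);
```

This version keeps state in process memory, so limits reset on restart and are not shared between instances; whether that is acceptable depends on the deployment.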
Dependency Work Implemented In @push.rocks/smartstorage
- Cluster identity and topology snapshots persist under `.smartstorage/cluster/`.
- Cluster startup refuses unsafe seed-node fallback instead of silently forming a split-brain cluster.
- Heartbeats probe all known peers, including suspect/offline peers, using configured timeout.
- Operational S3-side endpoints expose `/-/live`, `/-/ready`, `/-/health`, and `/-/metrics`.
- Runtime credential listing returns metadata only; secrets are write-only.
- Multi-node tests cover topology convergence, remote drive routing, and restart/rejoin identity durability.
Production Requirements Outside This Repo
- Run the management UI behind TLS or add native typedserver TLS configuration for the deployment.
- Configure real admin and S3 credentials through secrets management, not image defaults.
- Decide cluster transport policy: pinned CA or mTLS with operational certificate rotation. Do not operate QUIC insecure-dev transport on untrusted networks.
- Define backup/restore procedures for `/data`, `.objectstorage`, `.smartstorage/cluster`, bucket manifests, and policies.
- Add load, soak, and failure-injection runs to release qualification for large datasets and network partitions.