Files
smartstorage/readme.md

599 lines
22 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# @push.rocks/smartstorage
A high-performance, S3-compatible storage server powered by a **Rust core** with a clean TypeScript API. Runs standalone for dev/test — or scales out as a **distributed, erasure-coded cluster** with QUIC-based inter-node communication. No cloud, no Docker. Just `npm install` and go. 🚀
## Issue Reporting and Security
For reporting bugs, issues, or security vulnerabilities, please visit [community.foss.global/](https://community.foss.global/). This is the central community hub for all issue reporting. Developers who sign and comply with our contribution agreement and go through identification can also get a [code.foss.global/](https://code.foss.global/) account to submit Pull Requests directly.
## Why smartstorage?
| Feature | smartstorage | MinIO | s3rver |
|---------|-------------|-------|--------|
| Install | `pnpm add` | Docker / binary | `npm install` |
| Startup time | ~20ms | seconds | ~200ms |
| Large file uploads | Streaming, zero-copy | Yes | OOM risk |
| Range requests | Seek-based | Yes | Full read |
| Language | Rust + TypeScript | Go | JavaScript |
| Multipart uploads | ✅ Full support | Yes | No |
| Auth | AWS SigV4 (full verification) | Full IAM | Basic |
| Bucket policies | IAM-style evaluation | Yes | No |
| Clustering | ✅ Erasure-coded, QUIC | Yes | No |
| Multi-drive awareness | ✅ Per-drive health | Yes | No |
### Core Features
- 🦀 **Rust-powered HTTP server** — hyper 1.x with streaming I/O, zero-copy, backpressure
- 📦 **Full S3-compatible API** — works with AWS SDK v3, SmartBucket, any S3 client
- 💾 **Filesystem-backed storage** — buckets map to directories, objects to files
- 📤 **Streaming multipart uploads** — large files without memory pressure
- 📐 **Byte-range requests**`seek()` directly to the requested byte offset
- 🔐 **AWS SigV4 authentication** — full signature verification with constant-time comparison
- 📋 **Bucket policies** — IAM-style JSON policies with Allow/Deny evaluation and wildcard matching
- 🌐 **CORS middleware** — configurable cross-origin support
- 🧹 **Clean slate mode** — wipe storage on startup for test isolation
-**Test-first design** — start/stop in milliseconds, no port conflicts
### Clustering Features
- 🔗 **Erasure coding** — Reed-Solomon (configurable k data + m parity shards) for storage efficiency and fault tolerance
- 🚄 **QUIC transport** — multiplexed, encrypted inter-node communication via `quinn` with zero head-of-line blocking
- 💽 **Multi-drive awareness** — each node manages multiple independent storage paths with health monitoring
- 🤝 **Cluster membership** — static seed config + runtime join, heartbeat-based failure detection
- ✍️ **Quorum writes** — data is only acknowledged after k+1 shards are persisted
- 📖 **Quorum reads** — reconstruct from any k available shards, local-first fast path
- 🩹 **Self-healing** — background scanner detects and reconstructs missing/corrupt shards
## Installation
```bash
pnpm add @push.rocks/smartstorage -D
```
> **Note:** The package ships with precompiled Rust binaries for `linux_amd64` and `linux_arm64`. No Rust toolchain needed on your machine.
## Quick Start
### Standalone Mode (Dev & Test)
```typescript
import { SmartStorage } from '@push.rocks/smartstorage';
// Start a local S3-compatible storage server
const storage = await SmartStorage.createAndStart({
server: { port: 3000 },
storage: { cleanSlate: true },
});
// Create a bucket
await storage.createBucket('my-bucket');
// Get connection details for any S3 client
const descriptor = await storage.getStorageDescriptor();
// → { endpoint: 'localhost', port: 3000, accessKey: 'STORAGE', accessSecret: 'STORAGE', useSsl: false }
// When done
await storage.stop();
```
### Cluster Mode (Distributed)
```typescript
import { SmartStorage } from '@push.rocks/smartstorage';
const storage = await SmartStorage.createAndStart({
server: { port: 3000 },
cluster: {
enabled: true,
nodeId: 'node-1',
quicPort: 4000,
seedNodes: ['192.168.1.11:4000', '192.168.1.12:4000'],
erasure: {
dataShards: 4, // k: minimum shards to reconstruct data
parityShards: 2, // m: fault tolerance (can lose up to m shards)
},
drives: {
paths: ['/mnt/disk1', '/mnt/disk2', '/mnt/disk3'],
},
},
});
```
Objects are automatically split into chunks (default 4 MB), erasure-coded into 6 shards (4 data + 2 parity), and distributed across drives/nodes. Any 4 of 6 shards can reconstruct the original data.
## Configuration
All config fields are optional — sensible defaults are applied automatically.
```typescript
import { SmartStorage, ISmartStorageConfig } from '@push.rocks/smartstorage';
const config: ISmartStorageConfig = {
server: {
port: 3000, // Default: 3000
address: '0.0.0.0', // Default: '0.0.0.0'
silent: false, // Default: false
region: 'us-east-1', // Default: 'us-east-1' — used for SigV4 signing
},
storage: {
directory: './my-data', // Default: .nogit/bucketsDir
cleanSlate: false, // Default: false — set true to wipe on start
},
auth: {
enabled: false, // Default: false
credentials: [{
accessKeyId: 'MY_KEY',
secretAccessKey: 'MY_SECRET',
}],
},
cors: {
enabled: false, // Default: false
allowedOrigins: ['*'],
allowedMethods: ['GET', 'POST', 'PUT', 'DELETE', 'HEAD', 'OPTIONS'],
allowedHeaders: ['*'],
exposedHeaders: ['ETag', 'x-amz-request-id', 'x-amz-version-id'],
maxAge: 86400,
allowCredentials: false,
},
logging: {
level: 'info', // 'error' | 'warn' | 'info' | 'debug'
format: 'text', // 'text' | 'json'
enabled: true,
},
limits: {
maxObjectSize: 5 * 1024 * 1024 * 1024, // 5 GB
maxMetadataSize: 2048,
requestTimeout: 300000, // 5 minutes
},
multipart: {
expirationDays: 7,
cleanupIntervalMinutes: 60,
},
cluster: { // Optional — omit for standalone mode
enabled: true,
nodeId: 'node-1', // Auto-generated UUID if omitted
quicPort: 4000, // Default: 4000
seedNodes: [], // Addresses of existing cluster members
erasure: {
dataShards: 4, // Default: 4
parityShards: 2, // Default: 2
chunkSizeBytes: 4194304, // Default: 4 MB
},
drives: {
paths: ['/mnt/disk1', '/mnt/disk2'],
},
heartbeatIntervalMs: 5000, // Default: 5000
heartbeatTimeoutMs: 30000, // Default: 30000
},
};
const storage = await SmartStorage.createAndStart(config);
```
### Common Configurations
**CI/CD testing** — silent, clean, fast:
```typescript
const storage = await SmartStorage.createAndStart({
server: { port: 9999, silent: true },
storage: { cleanSlate: true },
});
```
**Auth enabled:**
```typescript
const storage = await SmartStorage.createAndStart({
auth: {
enabled: true,
credentials: [{ accessKeyId: 'test', secretAccessKey: 'test123' }],
},
});
```
**CORS for local web dev:**
```typescript
const storage = await SmartStorage.createAndStart({
cors: {
enabled: true,
allowedOrigins: ['http://localhost:5173'],
allowCredentials: true,
},
});
```
## Usage with AWS SDK v3
```typescript
import { S3Client, PutObjectCommand, GetObjectCommand, DeleteObjectCommand } from '@aws-sdk/client-s3';
const descriptor = await storage.getStorageDescriptor();
const client = new S3Client({
endpoint: `http://${descriptor.endpoint}:${descriptor.port}`,
region: 'us-east-1',
credentials: {
accessKeyId: descriptor.accessKey,
secretAccessKey: descriptor.accessSecret,
},
forcePathStyle: true, // Required for path-style access
});
// Upload
await client.send(new PutObjectCommand({
Bucket: 'my-bucket',
Key: 'hello.txt',
Body: 'Hello, Storage!',
ContentType: 'text/plain',
}));
// Download
const { Body } = await client.send(new GetObjectCommand({
Bucket: 'my-bucket',
Key: 'hello.txt',
}));
const content = await Body.transformToString(); // "Hello, Storage!"
// Delete
await client.send(new DeleteObjectCommand({
Bucket: 'my-bucket',
Key: 'hello.txt',
}));
```
## Usage with SmartBucket
```typescript
import { SmartBucket } from '@push.rocks/smartbucket';
const smartbucket = new SmartBucket(await storage.getStorageDescriptor());
const bucket = await smartbucket.createBucket('my-bucket');
const dir = await bucket.getBaseDirectory();
// Upload
await dir.fastPut({ path: 'docs/readme.txt', contents: 'Hello!' });
// Download
const content = await dir.fastGet('docs/readme.txt');
// List
const files = await dir.listFiles();
```
## Multipart Uploads
For files larger than 5 MB, use multipart uploads. smartstorage handles them with **streaming I/O** — parts are written directly to disk, never buffered in memory. In cluster mode, each part is independently erasure-coded and distributed.
```typescript
import {
CreateMultipartUploadCommand,
UploadPartCommand,
CompleteMultipartUploadCommand,
} from '@aws-sdk/client-s3';
// 1. Initiate
const { UploadId } = await client.send(new CreateMultipartUploadCommand({
Bucket: 'my-bucket',
Key: 'large-file.bin',
}));
// 2. Upload parts
const parts = [];
for (let i = 0; i < chunks.length; i++) {
const { ETag } = await client.send(new UploadPartCommand({
Bucket: 'my-bucket',
Key: 'large-file.bin',
UploadId,
PartNumber: i + 1,
Body: chunks[i],
}));
parts.push({ PartNumber: i + 1, ETag });
}
// 3. Complete
await client.send(new CompleteMultipartUploadCommand({
Bucket: 'my-bucket',
Key: 'large-file.bin',
UploadId,
MultipartUpload: { Parts: parts },
}));
```
## Bucket Policies
smartstorage supports AWS-style bucket policies for fine-grained access control. Policies use the same IAM JSON format as real S3 — so you can develop and test your policy logic locally before deploying.
When `auth.enabled` is `true`, the auth pipeline works as follows:
1. **Authenticate** — verify the AWS SigV4 signature (anonymous requests skip this step)
2. **Authorize** — evaluate bucket policies against the request action, resource, and caller identity
3. **Default** — authenticated users get full access; anonymous requests are denied unless a policy explicitly allows them
### Setting a Bucket Policy
```typescript
import { PutBucketPolicyCommand } from '@aws-sdk/client-s3';
// Allow anonymous read access to all objects in a bucket
await client.send(new PutBucketPolicyCommand({
Bucket: 'public-assets',
Policy: JSON.stringify({
Version: '2012-10-17',
Statement: [{
Sid: 'PublicRead',
Effect: 'Allow',
Principal: '*',
Action: ['s3:GetObject'],
Resource: ['arn:aws:s3:::public-assets/*'],
}],
}),
}));
```
### Policy Features
- **Effect**: `Allow` and `Deny` (explicit Deny always wins)
- **Principal**: `"*"` (everyone) or `{ "AWS": ["arn:..."] }` for specific identities
- **Action**: IAM-style actions like `s3:GetObject`, `s3:PutObject`, `s3:*`, or prefix wildcards like `s3:Get*`
- **Resource**: ARN patterns with `*` and `?` wildcards (e.g. `arn:aws:s3:::my-bucket/*`)
- **Persistence**: Policies survive server restarts — stored as JSON on disk alongside your data
### Policy CRUD Operations
| Operation | AWS SDK Command | HTTP |
|-----------|----------------|------|
| Get policy | `GetBucketPolicyCommand` | `GET /{bucket}?policy` |
| Set policy | `PutBucketPolicyCommand` | `PUT /{bucket}?policy` |
| Delete policy | `DeleteBucketPolicyCommand` | `DELETE /{bucket}?policy` |
Deleting a bucket automatically removes its associated policy.
## Clustering Deep Dive 🔗
smartstorage can run as a distributed storage cluster where multiple nodes cooperate to store and retrieve data with built-in redundancy.
### How It Works
```
Client ──HTTP PUT──▶ Node A (coordinator)
├─ Split object into 4 MB chunks
├─ Erasure-code each chunk (4 data + 2 parity = 6 shards)
├──QUIC──▶ Node B (shard writes)
├──QUIC──▶ Node C (shard writes)
└─ Local disk (shard writes)
```
1. **Any node can coordinate** — the client connects to any cluster member
2. **Objects are chunked** — large objects split into fixed-size pieces (default 4 MB)
3. **Each chunk is erasure-coded** — Reed-Solomon produces k data + m parity shards
4. **Shards are distributed** — placed across different nodes and drives for fault isolation
5. **Quorum guarantees consistency** — writes need k+1 acks, reads need k shards
### Erasure Coding
With the default `4+2` configuration:
- Storage overhead: **33%** (vs. 200% for 3x replication)
- Fault tolerance: **any 2 drives/nodes can fail** simultaneously
- Read efficiency: only **4 of 6 shards** needed to reconstruct data
| Config | Total Shards | Overhead | Tolerance | Min Nodes |
|--------|-------------|----------|-----------|-----------|
| 4+2 | 6 | 33% | 2 failures | 3 |
| 6+3 | 9 | 50% | 3 failures | 5 |
| 2+1 | 3 | 50% | 1 failure | 2 |
### QUIC Transport
Inter-node communication uses [QUIC](https://en.wikipedia.org/wiki/QUIC) via the `quinn` library:
- 🔒 **Built-in TLS** — self-signed certs auto-generated at cluster init
- 🔀 **Multiplexed streams** — concurrent shard transfers without head-of-line blocking
-**Connection pooling** — persistent connections to peer nodes
- 🌊 **Natural backpressure** — QUIC flow control prevents overloading slow peers
### Cluster Membership
- **Static seed nodes** — initial cluster defined in config
- **Runtime join** — new nodes can join a running cluster
- **Heartbeat monitoring** — every 5s (configurable), with suspect/offline detection
- **Split-brain prevention** — nodes only mark peers offline when they have majority
### Self-Healing
A background scanner periodically (default: every 24h):
1. Checks shard checksums (CRC32C) for bit-rot detection
2. Identifies shards on offline nodes
3. Reconstructs missing shards from remaining data using Reed-Solomon
4. Places healed shards on healthy drives
Healing runs at low priority to avoid impacting foreground I/O.
### Erasure Set Formation
Drives are organized into fixed **erasure sets** at cluster initialization:
```
3 nodes × 4 drives each = 12 drives total
With 6-shard erasure sets → 2 erasure sets
Set 0: Node1-Disk0, Node2-Disk0, Node3-Disk0, Node1-Disk1, Node2-Disk1, Node3-Disk1
Set 1: Node1-Disk2, Node2-Disk2, Node3-Disk2, Node1-Disk3, Node2-Disk3, Node3-Disk3
```
Drives are interleaved across nodes for maximum fault isolation. New nodes form new erasure sets — existing data is never rebalanced.
## Testing Integration
```typescript
import { SmartStorage } from '@push.rocks/smartstorage';
import { tap, expect } from '@git.zone/tstest/tapbundle';
let storage: SmartStorage;
tap.test('setup', async () => {
storage = await SmartStorage.createAndStart({
server: { port: 4567, silent: true },
storage: { cleanSlate: true },
});
});
tap.test('should store and retrieve objects', async () => {
await storage.createBucket('test');
// ... your test logic using AWS SDK or SmartBucket
});
tap.test('teardown', async () => {
await storage.stop();
});
export default tap.start();
```
## API Reference
### `SmartStorage` Class
#### `static createAndStart(config?: ISmartStorageConfig): Promise<SmartStorage>`
Create and start a server in one call.
#### `start(): Promise<void>`
Spawn the Rust binary and start the HTTP server.
#### `stop(): Promise<void>`
Gracefully stop the server and kill the Rust process.
#### `createBucket(name: string): Promise<{ name: string }>`
Create a storage bucket.
#### `getStorageDescriptor(options?): Promise<IS3Descriptor>`
Get connection details for S3-compatible clients. Returns:
| Field | Type | Description |
|-------|------|-------------|
| `endpoint` | `string` | Server hostname (`localhost` by default) |
| `port` | `number` | Server port |
| `accessKey` | `string` | Access key from first configured credential |
| `accessSecret` | `string` | Secret key from first configured credential |
| `useSsl` | `boolean` | Always `false` (plain HTTP) |
## Architecture
smartstorage uses a **hybrid Rust + TypeScript** architecture:
```
┌──────────────────────────────────────────────┐
│ Your Code (AWS SDK, SmartBucket, etc.) │
│ ↕ HTTP (localhost:3000) │
├──────────────────────────────────────────────┤
│ ruststorage binary (Rust) │
│ ├─ hyper 1.x HTTP server │
│ ├─ S3 path-style routing │
│ ├─ StorageBackend (Standalone or Clustered) │
│ │ ├─ FileStore (single-node mode) │
│ │ └─ DistributedStore (cluster mode) │
│ │ ├─ ErasureCoder (Reed-Solomon) │
│ │ ├─ ShardStore (per-drive storage) │
│ │ ├─ QuicTransport (quinn) │
│ │ ├─ ClusterState & Membership │
│ │ └─ HealingService │
│ ├─ SigV4 auth + policy engine │
│ ├─ CORS middleware │
│ └─ S3 XML response builder │
├──────────────────────────────────────────────┤
│ TypeScript (thin IPC wrapper) │
│ ├─ SmartStorage class │
│ ├─ RustBridge (stdin/stdout JSON IPC) │
│ └─ Config & S3 descriptor │
└──────────────────────────────────────────────┘
```
**Why Rust?** The original TypeScript implementation had critical perf issues: OOM on multipart uploads (parts buffered in memory), double stream copying, file descriptor leaks on HEAD requests, full-file reads for range requests, and no backpressure. The Rust binary solves all of these with streaming I/O, zero-copy, and direct `seek()` for range requests.
**IPC Protocol:** TypeScript spawns the `ruststorage` binary with `--management` and communicates via newline-delimited JSON over stdin/stdout. Commands: `start`, `stop`, `createBucket`, `clusterStatus`.
### S3-Compatible Operations
| Operation | Method | Path |
|-----------|--------|------|
| ListBuckets | `GET /` | |
| CreateBucket | `PUT /{bucket}` | |
| DeleteBucket | `DELETE /{bucket}` | |
| HeadBucket | `HEAD /{bucket}` | |
| ListObjects (v1/v2) | `GET /{bucket}` | `?list-type=2` for v2 |
| PutObject | `PUT /{bucket}/{key}` | |
| GetObject | `GET /{bucket}/{key}` | Supports `Range` header |
| HeadObject | `HEAD /{bucket}/{key}` | |
| DeleteObject | `DELETE /{bucket}/{key}` | |
| CopyObject | `PUT /{bucket}/{key}` | `x-amz-copy-source` header |
| InitiateMultipartUpload | `POST /{bucket}/{key}?uploads` | |
| UploadPart | `PUT /{bucket}/{key}?partNumber&uploadId` | |
| CompleteMultipartUpload | `POST /{bucket}/{key}?uploadId` | |
| AbortMultipartUpload | `DELETE /{bucket}/{key}?uploadId` | |
| ListMultipartUploads | `GET /{bucket}?uploads` | |
| GetBucketPolicy | `GET /{bucket}?policy` | |
| PutBucketPolicy | `PUT /{bucket}?policy` | |
| DeleteBucketPolicy | `DELETE /{bucket}?policy` | |
### On-Disk Format
**Standalone mode:**
```
{storage.directory}/
{bucket}/
{key}._storage_object # Object data
{key}._storage_object.metadata.json # Metadata (content-type, x-amz-meta-*, etc.)
{key}._storage_object.md5 # Cached MD5 hash
.multipart/
{upload-id}/
metadata.json # Upload metadata
part-1, part-2, ... # Part data files
.policies/
{bucket}.policy.json # Bucket policy (IAM JSON format)
```
**Cluster mode:**
```
{drive_path}/.smartstorage/
format.json # Drive metadata (cluster ID, erasure set)
data/{bucket}/{key_hash}/{key}/
chunk-{N}/shard-{M}.dat # Erasure-coded shard data
chunk-{N}/shard-{M}.meta # Shard metadata (checksum, size)
{storage.directory}/
.manifests/{bucket}/
{key}.manifest.json # Object manifest (shard placements, checksums)
.buckets/{bucket}/ # Bucket metadata
.policies/{bucket}.policy.json # Bucket policies
```
## Related Packages
- [`@push.rocks/smartbucket`](https://code.foss.global/push.rocks/smartbucket) — High-level S3-compatible abstraction layer
- [`@push.rocks/smartrust`](https://code.foss.global/push.rocks/smartrust) — TypeScript ↔ Rust IPC bridge
- [`@git.zone/tsrust`](https://code.foss.global/git.zone/tsrust) — Rust cross-compilation for npm packages
## License and Legal Information
This repository contains open-source code licensed under the MIT License. A copy of the license can be found in the [LICENSE](./LICENSE) file.
**Please note:** The MIT License does not grant permission to use the trade names, trademarks, service marks, or product names of the project, except as required for reasonable and customary use in describing the origin of the work and reproducing the content of the NOTICE file.
### Trademarks
This project is owned and maintained by Task Venture Capital GmbH. The names and logos associated with Task Venture Capital GmbH and any related products or services are trademarks of Task Venture Capital GmbH or third parties, and are not included within the scope of the MIT license granted herein.
Use of these trademarks must comply with Task Venture Capital GmbH's Trademark Guidelines or the guidelines of the respective third-party owners, and any usage must be approved in writing. Third-party trademarks used herein are the property of their respective owners and used only in a descriptive manner, e.g. for an implementation of an API or similar.
### Company Information
Task Venture Capital GmbH
Registered at District Court Bremen HRB 35230 HB, Germany
For any legal inquiries or further information, please contact us via email at hello@task.vc.
By using this repository, you acknowledge that you have read this section, agree to comply with its terms, and understand that the licensing of the code does not imply endorsement by Task Venture Capital GmbH of any derivative works.