feat(watcher): add polling-based BucketWatcher to detect add/modify/delete events and expose RxJS Observable and EventEmitter APIs

2026-01-25 18:09:38 +00:00
parent 575cff4d09
commit 7bb994e1cb
9 changed files with 1004 additions and 76 deletions

297
readme.md

@@ -1,11 +1,16 @@
# @push.rocks/smartbucket 🪣
> A powerful, cloud-agnostic TypeScript library for object storage that makes S3 feel like a modern filesystem. Built for developers who demand simplicity, type-safety, and advanced features like metadata management, file locking, intelligent trash handling, and memory-efficient streaming.
A powerful, cloud-agnostic TypeScript library for object storage that makes S3 feel like a modern filesystem. Built for developers who demand simplicity, type-safety, and advanced features like real-time bucket watching, metadata management, file locking, intelligent trash handling, and memory-efficient streaming.
## Issue Reporting and Security
For reporting bugs, issues, or security vulnerabilities, please visit [community.foss.global/](https://community.foss.global/). This is the central community hub for all issue reporting. Developers who sign and comply with our contribution agreement and go through identification can also get a [code.foss.global/](https://code.foss.global/) account to submit Pull Requests directly.
## Why SmartBucket? 🎯
- **🌍 Cloud Agnostic** - Write once, run on AWS S3, MinIO, DigitalOcean Spaces, Backblaze B2, Wasabi, or any S3-compatible storage
- **🌍 Cloud Agnostic** - Write once, run on AWS S3, MinIO, DigitalOcean Spaces, Backblaze B2, Wasabi, Cloudflare R2, or any S3-compatible storage
- **🚀 Modern TypeScript** - First-class TypeScript support with complete type definitions and async/await patterns
- **👀 Real-Time Watching** - Monitor bucket changes with polling-based watcher supporting RxJS and EventEmitter patterns
- **💾 Memory Efficient** - Handle millions of files with async generators, RxJS observables, and cursor pagination
- **🗑️ Smart Trash System** - Recover accidentally deleted files with built-in trash and restore functionality
- **🔒 File Locking** - Prevent concurrent modifications with built-in locking mechanisms
@@ -45,6 +50,13 @@ console.log('📄', JSON.parse(data.toString()));
for await (const key of bucket.listAllObjects('users/')) {
console.log('🔍 Found:', key);
}
// Watch for changes in real-time
const watcher = bucket.createWatcher({ prefix: 'uploads/', pollIntervalMs: 3000 });
watcher.changeSubject.subscribe((change) => {
console.log('🔔 Change detected:', change.type, change.key);
});
await watcher.start();
```
## Install 📦
@@ -65,13 +77,14 @@ npm install @push.rocks/smartbucket --save
2. [🗂️ Working with Buckets](#-working-with-buckets)
3. [📁 File Operations](#-file-operations)
4. [📋 Memory-Efficient Listing](#-memory-efficient-listing)
5. [📂 Directory Management](#-directory-management)
6. [🌊 Streaming Operations](#-streaming-operations)
7. [🔒 File Locking](#-file-locking)
8. [🏷️ Metadata Management](#-metadata-management)
9. [🗑 Trash & Recovery](#-trash--recovery)
10. [⚡ Advanced Features](#-advanced-features)
11. [☁️ Cloud Provider Support](#-cloud-provider-support)
5. [👀 Bucket Watching](#-bucket-watching)
6. [📂 Directory Management](#-directory-management)
7. [🌊 Streaming Operations](#-streaming-operations)
8. [🔒 File Locking](#-file-locking)
9. [🏷 Metadata Management](#-metadata-management)
10. [🗑️ Trash & Recovery](#-trash--recovery)
11. [⚡ Advanced Features](#-advanced-features)
12. [☁️ Cloud Provider Support](#-cloud-provider-support)
### 🏁 Getting Started
@@ -145,7 +158,7 @@ const file = await bucket.fastPut({
path: 'documents/report.pdf',
contents: Buffer.from('Your file content here')
});
console.log('✅ Uploaded:', file.path);
console.log('✅ Uploaded:', file.getBasePath());
// Upload with string content
await bucket.fastPut({
@@ -224,6 +237,14 @@ await bucket.fastMove({
destinationPath: 'final/document.txt'
});
console.log('📦 File moved');
// Copy to different bucket
const targetBucket = await smartBucket.getBucketByName('backup-bucket');
await bucket.fastCopy({
sourcePath: 'important/data.json',
destinationPath: 'archived/data.json',
targetBucket: targetBucket
});
```
### 📋 Memory-Efficient Listing
@@ -355,7 +376,7 @@ const token = cursor.getToken();
// ... later, in a different request ...
const newCursor = bucket.createCursor('uploads/', { pageSize: 100 });
newCursor.setToken(token); // Resume from saved position!
const nextPage = await cursor.next();
const nextPage = await newCursor.next();
// Reset cursor to start over
cursor.reset();
@@ -392,6 +413,109 @@ if (smallList.length < 100) {
| **Cursor** | O(pageSize) | UI pagination, resumable ops | ✅ Yes |
| **Array** | O(n) - grows with results | Small datasets (<10k items) | ❌ No |
### 👀 Bucket Watching
Monitor your S3 bucket for changes in real-time with the powerful `BucketWatcher`:
```typescript
// Create a watcher for a specific prefix
const watcher = bucket.createWatcher({
prefix: 'uploads/', // Watch files with this prefix
pollIntervalMs: 3000, // Check every 3 seconds
includeInitial: false, // Don't emit existing files on start
});
// RxJS Observable pattern (recommended for reactive apps)
watcher.changeSubject.subscribe((change) => {
if (change.type === 'add') {
console.log('📥 New file:', change.key);
} else if (change.type === 'modify') {
console.log('✏️ Modified:', change.key);
} else if (change.type === 'delete') {
console.log('🗑️ Deleted:', change.key);
}
});
// EventEmitter pattern (classic Node.js style)
watcher.on('change', (change) => {
console.log(`🔔 ${change.type}: ${change.key}`);
});
watcher.on('error', (err) => {
console.error('❌ Watcher error:', err);
});
// Start watching
await watcher.start();
// Wait until watcher is ready (initial state built)
await watcher.readyDeferred.promise;
console.log('👀 Watcher is now monitoring the bucket');
// ... your application runs ...
// Stop watching when done
await watcher.stop();
// Or use the alias:
await watcher.close();
```
#### Watcher Options
```typescript
interface IBucketWatcherOptions {
prefix?: string; // Filter objects by prefix (default: '' = all)
pollIntervalMs?: number; // Polling interval in ms (default: 5000)
bufferTimeMs?: number; // Buffer events before emitting (for batching)
includeInitial?: boolean; // Emit existing files as 'add' on start (default: false)
pageSize?: number; // Objects per page when listing (default: 1000)
}
```
#### Buffered Events
For high-frequency change environments, buffer events to reduce processing overhead:
```typescript
const watcher = bucket.createWatcher({
prefix: 'high-traffic/',
pollIntervalMs: 1000,
bufferTimeMs: 2000, // Collect events for 2 seconds before emitting
});
// Receive batched events as arrays
watcher.changeSubject.subscribe((changes) => {
if (Array.isArray(changes)) {
console.log(`📦 Batch of ${changes.length} changes:`);
changes.forEach(c => console.log(` - ${c.type}: ${c.key}`));
}
});
await watcher.start();
```
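Because a subscriber may receive either a single change or a batch depending on `bufferTimeMs`, it can help to normalize the payload before processing it. The following is a minimal sketch, not part of the SmartBucket API: `ChangeLike`, `toChangeArray`, and `countByType` are hypothetical helper names, and the sketch assumes only the `type` and `key` fields shown in this README.

```typescript
// Hypothetical helpers (not part of SmartBucket): treat single events and
// buffered batches uniformly inside one subscriber.
interface ChangeLike {
  type: 'add' | 'modify' | 'delete';
  key: string;
}

// Wrap a single event in an array; pass batches through unchanged.
function toChangeArray(payload: ChangeLike | ChangeLike[]): ChangeLike[] {
  return Array.isArray(payload) ? payload : [payload];
}

// Summarize a batch by event type, e.g. for logging or metrics.
function countByType(changes: ChangeLike[]): Record<ChangeLike['type'], number> {
  const counts: Record<ChangeLike['type'], number> = { add: 0, modify: 0, delete: 0 };
  for (const change of changes) {
    counts[change.type] += 1;
  }
  return counts;
}
```

Inside a subscriber, this would be used as `const changes = toChangeArray(payload);` so that the same handler works with and without buffering.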
#### Change Event Structure
```typescript
interface IS3ChangeEvent {
type: 'add' | 'modify' | 'delete';
key: string; // Object key (path)
bucket: string; // Bucket name
size?: number; // File size (not present for deletes)
etag?: string; // ETag hash (not present for deletes)
lastModified?: Date; // Last modified date (not present for deletes)
}
```
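To illustrate how a polling-based watcher can derive these three event types, here is a minimal sketch that compares two listing snapshots. It is an assumption about the general technique, not the actual `BucketWatcher` implementation: `ObjectState`, `SketchChangeEvent`, and `diffSnapshots` are hypothetical names, and detecting modifications via ETag or size is one plausible choice among several.

```typescript
// Hypothetical sketch (not SmartBucket's internals): diff two snapshots of a
// bucket listing to produce add / modify / delete events.
interface ObjectState {
  etag: string;
  size: number;
  lastModified: Date;
}

interface SketchChangeEvent {
  type: 'add' | 'modify' | 'delete';
  key: string;
  bucket: string;
  size?: number;
  etag?: string;
  lastModified?: Date;
}

function diffSnapshots(
  bucket: string,
  previous: Map<string, ObjectState>,
  current: Map<string, ObjectState>,
): SketchChangeEvent[] {
  const events: SketchChangeEvent[] = [];

  // Keys in the current snapshot are either new or possibly modified.
  current.forEach((state, key) => {
    const before = previous.get(key);
    if (before === undefined) {
      events.push({ type: 'add', key, bucket, ...state });
    } else if (before.etag !== state.etag || before.size !== state.size) {
      events.push({ type: 'modify', key, bucket, ...state });
    }
  });

  // Keys that disappeared since the previous snapshot were deleted.
  previous.forEach((_state, key) => {
    if (!current.has(key)) {
      events.push({ type: 'delete', key, bucket });
    }
  });

  return events;
}
```

A watcher built this way would store the current snapshot after each poll and use it as `previous` on the next one. Note that delete events carry no `size`, `etag`, or `lastModified`, which matches the optional fields in the interface above.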
#### Watch Use Cases
- 🔄 **Sync systems** - Detect changes to trigger synchronization
- 📊 **Analytics** - Track file uploads/modifications in real-time
- 🔔 **Notifications** - Alert users when their files are ready
- 🔄 **Processing pipelines** - Trigger workflows on new file uploads
- 🗄️ **Backup systems** - Detect changes for incremental backups
- 📝 **Audit logs** - Track all bucket activity
### 📂 Directory Management
SmartBucket provides powerful directory-like operations for organizing your files:
@@ -425,6 +549,10 @@ console.log('📂 Base path:', subDir.getBasePath()); // "projects/2024/"
// Create empty file as placeholder
await subDir.createEmptyFile('placeholder.txt');
// Check existence
const fileExists = await subDir.fileExists({ path: 'report.pdf' });
const dirExists = await baseDir.directoryExists('projects');
```
### 🌊 Streaming Operations
@@ -470,13 +598,10 @@ import * as fs from 'node:fs';
const readStream = fs.createReadStream('big-data.csv');
await bucket.fastPutStream({
path: 'uploads/big-data.csv',
stream: readStream,
metadata: {
contentType: 'text/csv',
userMetadata: {
uploadedBy: 'data-team',
version: '2.0'
}
readableStream: readStream,
nativeMetadata: {
'content-type': 'text/csv',
'x-custom-header': 'my-value'
}
});
console.log('✅ Large file uploaded via stream');
@@ -487,8 +612,7 @@ console.log('✅ Large file uploaded via stream');
```typescript
// Get file as ReplaySubject for reactive programming
const replaySubject = await bucket.fastGetReplaySubject({
path: 'data/sensor-readings.json',
chunkSize: 1024
path: 'data/sensor-readings.json'
});
// Multiple subscribers can consume the same data
@@ -507,8 +631,8 @@ replaySubject.subscribe({
Prevent concurrent modifications with built-in file locking:
```typescript
const file = await bucket.getBaseDirectory()
.getFile({ path: 'important-config.json' });
const baseDir = await bucket.getBaseDirectory();
const file = await baseDir.getFile({ path: 'important-config.json' });
// Lock file for 10 minutes
await file.lock({ timeoutMillis: 600000 });
@@ -521,13 +645,21 @@ try {
console.log('❌ Cannot delete locked file');
}
// Check lock status
const isLocked = await file.isLocked();
// Check lock status via metadata
const metadata = await file.getMetaData();
const isLocked = await metadata.checkLocked();
console.log(`Lock status: ${isLocked ? '🔒 Locked' : '🔓 Unlocked'}`);
// Get lock info
const lockInfo = await metadata.getLockInfo();
console.log(`Lock expires: ${new Date(lockInfo.expires)}`);
// Unlock when done
await file.unlock();
console.log('🔓 File unlocked');
// Force unlock (even if locked by another process)
await file.unlock({ force: true });
```
**Lock use cases:**
@@ -541,40 +673,50 @@ console.log('🔓 File unlocked');
Attach and manage rich metadata for your files:
```typescript
const file = await bucket.getBaseDirectory()
.getFile({ path: 'document.pdf' });
const baseDir = await bucket.getBaseDirectory();
const file = await baseDir.getFile({ path: 'document.pdf' });
// Get metadata handler
const metadata = await file.getMetaData();
// Set custom metadata
await metadata.setCustomMetaData({
// Store custom metadata (can be any JSON-serializable value)
await metadata.storeCustomMetaData({
key: 'author',
value: 'John Doe'
});
await metadata.setCustomMetaData({
key: 'department',
value: 'Engineering'
await metadata.storeCustomMetaData({
key: 'tags',
value: ['important', 'quarterly-report', '2024']
});
await metadata.setCustomMetaData({
key: 'version',
value: '1.0.0'
await metadata.storeCustomMetaData({
key: 'workflow',
value: { status: 'approved', approvedBy: 'jane@company.com' }
});
// Retrieve metadata
const author = await metadata.getCustomMetaData({ key: 'author' });
console.log(`📝 Author: ${author}`);
// Get all metadata
const allMeta = await metadata.getAllCustomMetaData();
console.log('📋 All metadata:', allMeta);
// { author: 'John Doe', department: 'Engineering', version: '1.0.0' }
// Delete metadata
await metadata.deleteCustomMetaData({ key: 'workflow' });
// Check if metadata exists
const hasMetadata = await metadata.hasMetaData();
// Check if file has any metadata
const hasMetadata = await file.hasMetaData();
console.log(`Has metadata: ${hasMetadata ? '✅' : '❌'}`);
// Get file type detection
const fileType = await metadata.getFileType({ useFileExtension: true });
console.log(`📄 MIME type: ${fileType?.mime}`);
// Get file type from magic bytes (more accurate)
const detectedType = await metadata.getFileType({ useMagicBytes: true });
console.log(`🔮 Detected type: ${detectedType?.mime}`);
// Get file size
const size = await metadata.getSizeInBytes();
console.log(`📊 Size: ${size} bytes`);
```
**Metadata use cases:**
@@ -589,8 +731,8 @@ console.log(`Has metadata: ${hasMetadata ? '✅' : '❌'}`);
SmartBucket includes an intelligent trash system for safe file deletion and recovery:
```typescript
const file = await bucket.getBaseDirectory()
.getFile({ path: 'important-data.xlsx' });
const baseDir = await bucket.getBaseDirectory();
const file = await baseDir.getFile({ path: 'important-data.xlsx' });
// Move to trash instead of permanent deletion
await file.delete({ mode: 'trash' });
@@ -607,25 +749,19 @@ const trashedFiles = await trashDir.listFiles();
console.log(`📦 ${trashedFiles.length} files in trash`);
// Restore from trash
const trashedFile = await bucket.getBaseDirectory()
.getFile({
path: 'important-data.xlsx',
getFromTrash: true
});
const trashedFile = await baseDir.getFile({
path: 'important-data.xlsx',
getFromTrash: true
});
await trashedFile.restore({ useOriginalPath: true });
console.log('♻️ File restored to original location');
// Or restore to a different location
await trashedFile.restore({
useOriginalPath: false,
restorePath: 'recovered/important-data.xlsx'
toPath: 'recovered/important-data.xlsx'
});
console.log('♻️ File restored to new location');
// Empty trash permanently
await trash.emptyTrash();
console.log('🧹 Trash emptied');
```
**Trash features:**
@@ -633,7 +769,6 @@ console.log('🧹 Trash emptied');
- 🏷️ Preserves original path in metadata
- ⏰ Tracks deletion timestamp
- 🔍 List and inspect trashed files
- 🧹 Bulk empty trash operation
### ⚡ Advanced Features
@@ -642,10 +777,10 @@ console.log('🧹 Trash emptied');
```typescript
// Get detailed file statistics
const stats = await bucket.fastStat({ path: 'document.pdf' });
console.log(`📊 Size: ${stats.size} bytes`);
console.log(`📅 Last modified: ${stats.lastModified}`);
console.log(`🏷️ ETag: ${stats.etag}`);
console.log(`🗂️ Storage class: ${stats.storageClass}`);
console.log(`📊 Size: ${stats.ContentLength} bytes`);
console.log(`📅 Last modified: ${stats.LastModified}`);
console.log(`🏷️ ETag: ${stats.ETag}`);
console.log(`🗂️ Content type: ${stats.ContentType}`);
```
#### Magic Bytes Detection
@@ -661,8 +796,8 @@ const magicBytes = await bucket.getMagicBytes({
console.log(`🔮 Magic bytes: ${magicBytes.toString('hex')}`);
// Or from a File object
const file = await bucket.getBaseDirectory()
.getFile({ path: 'image.jpg' });
const baseDir = await bucket.getBaseDirectory();
const file = await baseDir.getFile({ path: 'image.jpg' });
const magic = await file.getMagicBytes({ length: 4 });
// Check file signatures
@@ -676,8 +811,8 @@ if (magic[0] === 0xFF && magic[1] === 0xD8) {
#### JSON Data Operations
```typescript
const file = await bucket.getBaseDirectory()
.getFile({ path: 'config.json' });
const baseDir = await bucket.getBaseDirectory();
const file = await baseDir.getFile({ path: 'config.json' });
// Read JSON data
const config = await file.getJsonData();
@@ -768,6 +903,15 @@ const b2Storage = new SmartBucket({
region: 'us-west-002',
useSsl: true
});
// Cloudflare R2
const r2Storage = new SmartBucket({
accessKey: process.env.R2_ACCESS_KEY_ID,
accessSecret: process.env.R2_SECRET_ACCESS_KEY,
endpoint: `${process.env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com`,
region: 'auto',
useSsl: true
});
```
### 🔧 Advanced Configuration
@@ -779,25 +923,25 @@ import { Qenv } from '@push.rocks/qenv';
const qenv = new Qenv('./', './.nogit/');
const smartBucket = new SmartBucket({
accessKey: await qenv.getEnvVarOnDemandStrict('S3_ACCESS_KEY'),
accessSecret: await qenv.getEnvVarOnDemandStrict('S3_SECRET'),
endpoint: await qenv.getEnvVarOnDemandStrict('S3_ENDPOINT'),
port: parseInt(await qenv.getEnvVarOnDemandStrict('S3_PORT')),
useSsl: await qenv.getEnvVarOnDemandStrict('S3_USE_SSL') === 'true',
region: await qenv.getEnvVarOnDemandStrict('S3_REGION')
accessKey: await qenv.getEnvVarOnDemand('S3_ACCESS_KEY'),
accessSecret: await qenv.getEnvVarOnDemand('S3_SECRET'),
endpoint: await qenv.getEnvVarOnDemand('S3_ENDPOINT'),
port: parseInt(await qenv.getEnvVarOnDemand('S3_PORT')),
useSsl: await qenv.getEnvVarOnDemand('S3_USE_SSL') === 'true',
region: await qenv.getEnvVarOnDemand('S3_REGION')
});
```
### 🧪 Testing
SmartBucket is thoroughly tested with 82 comprehensive tests covering all features:
SmartBucket is thoroughly tested with 97 comprehensive tests covering all features:
```bash
# Run all tests
pnpm test
# Run specific test file
pnpm tstest test/test.listing.node+deno.ts --verbose
pnpm tstest test/test.watcher.node.ts --verbose
# Run tests with log file
pnpm test --logfile
@@ -849,7 +993,7 @@ const content = await bucket.fastGet({ path: 'file.txt' }); // May throw!
7. **Lock files** during critical operations to prevent race conditions
8. **Use async generators** for listing large buckets to avoid memory issues
9. **Set explicit overwrite flags** to prevent accidental file overwrites
10. **Clean up resources** properly when done
10. **Use the watcher** for real-time synchronization and event-driven architectures
### 📊 Performance Tips
@@ -859,22 +1003,25 @@ const content = await bucket.fastGet({ path: 'file.txt' }); // May throw!
- **Metadata**: Cache metadata when reading frequently
- **Locking**: Keep lock durations as short as possible
- **Glob patterns**: Be specific to reduce objects scanned
- **Watching**: Use appropriate `pollIntervalMs` based on your change frequency
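
When choosing `pollIntervalMs`, a rough estimate of listing traffic can be useful. The sketch below assumes that every poll re-lists the whole watched prefix in pages of `pageSize` objects. That is an assumption about how a polling watcher typically behaves, not a documented guarantee of `BucketWatcher`, and `estimateListRequests` is a hypothetical helper.

```typescript
// Hypothetical estimate (assumes one full paginated listing per poll).
function estimateListRequests(
  objectCount: number,
  pollIntervalMs: number,
  pageSize: number = 1000,
): { requestsPerPoll: number; requestsPerHour: number } {
  // At least one list request is needed even for an empty prefix.
  const requestsPerPoll = Math.max(1, Math.ceil(objectCount / pageSize));
  const pollsPerHour = 3600000 / pollIntervalMs;
  return {
    requestsPerPoll,
    requestsPerHour: Math.round(requestsPerPoll * pollsPerHour),
  };
}

// 25,000 objects polled every 5 seconds with the default page size:
const estimate = estimateListRequests(25000, 5000);
console.log(estimate.requestsPerPoll); // 25
console.log(estimate.requestsPerHour); // 18000
```

Under these assumptions, narrowing the `prefix` or raising `pollIntervalMs` reduces request volume roughly linearly.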
## License and Legal Information
This repository contains open-source code that is licensed under the MIT License. A copy of the MIT License can be found in the [license](license) file within this repository.
This repository contains open-source code licensed under the MIT License. A copy of the license can be found in the [LICENSE](./LICENSE) file.
**Please note:** The MIT License does not grant permission to use the trade names, trademarks, service marks, or product names of the project, except as required for reasonable and customary use in describing the origin of the work and reproducing the content of the NOTICE file.
### Trademarks
This project is owned and maintained by Task Venture Capital GmbH. The names and logos associated with Task Venture Capital GmbH and any related products or services are trademarks of Task Venture Capital GmbH and are not included within the scope of the MIT license granted herein. Use of these trademarks must comply with Task Venture Capital GmbH's Trademark Guidelines, and any usage must be approved in writing by Task Venture Capital GmbH.
This project is owned and maintained by Task Venture Capital GmbH. The names and logos associated with Task Venture Capital GmbH and any related products or services are trademarks of Task Venture Capital GmbH or third parties, and are not included within the scope of the MIT license granted herein.
Use of these trademarks must comply with Task Venture Capital GmbH's Trademark Guidelines or the guidelines of the respective third-party owners, and any usage must be approved in writing. Third-party trademarks used herein are the property of their respective owners and used only in a descriptive manner, e.g. for an implementation of an API or similar.
### Company Information
Task Venture Capital GmbH
Registered at District court Bremen HRB 35230 HB, Germany
Registered at District Court Bremen HRB 35230 HB, Germany
For any legal inquiries or if you require further information, please contact us via email at hello@task.vc.
For any legal inquiries or further information, please contact us via email at hello@task.vc.
By using this repository, you acknowledge that you have read this section, agree to comply with its terms, and understand that the licensing of the code does not imply endorsement by Task Venture Capital GmbH of any derivative works.