fix(plugins): Migrate filesystem usage to Node fs/fsPromises and upgrade smartfile to v13; add listFileTree helper and update tests

2025-11-25 11:59:11 +00:00
parent 7ccd210c45
commit 634c204a00
13 changed files with 7423 additions and 248 deletions

readme.md

@@ -1,29 +1,32 @@
# @push.rocks/smartarchive 📦
Powerful archive manipulation for modern Node.js applications.
`@push.rocks/smartarchive` is a versatile library for handling archive files with a focus on developer experience. Work with **zip**, **tar**, **gzip**, and **bzip2** formats through a unified, streaming-optimized API.
## Issue Reporting and Security
For reporting bugs, issues, or security vulnerabilities, please visit [community.foss.global/](https://community.foss.global/). This is the central community hub for all issue reporting. Developers who sign and comply with our contribution agreement and go through identification can also get a [code.foss.global/](https://code.foss.global/) account to submit Pull Requests directly.
## Features 🚀
- 📁 **Multi-format support** - Handle `.zip`, `.tar`, `.tar.gz`, `.tgz`, and `.bz2` archives
- 🌊 **Streaming-first architecture** - Process large archives without memory constraints
- 🔄 **Unified API** - Consistent interface across different archive formats
- 🎯 **Smart detection** - Automatically identifies archive types via magic bytes
- ⚡ **High performance** - Built on `tar-stream` and `fflate` for speed
- 🔧 **Flexible I/O** - Work with files, URLs, and streams seamlessly
- 🛠️ **Modern TypeScript** - Full type safety and excellent IDE support
## Installation 📥
```bash
# Using pnpm (recommended)
pnpm add @push.rocks/smartarchive
# Using npm
npm install @push.rocks/smartarchive
# Using yarn
yarn add @push.rocks/smartarchive
```
@@ -37,7 +40,7 @@ import { SmartArchive } from '@push.rocks/smartarchive';
// Extract a .tar.gz archive from a URL directly to the filesystem
const archive = await SmartArchive.fromArchiveUrl(
'https://registry.npmjs.org/some-package/-/some-package-1.0.0.tgz'
);
await archive.exportToFs('./extracted');
```
@@ -52,10 +55,15 @@ const archive = await SmartArchive.fromArchiveFile('./large-archive.zip');
const streamOfFiles = await archive.exportToStreamOfStreamFiles();
// Process each file in the archive
streamOfFiles.on('data', async (streamFile) => {
console.log(`Processing ${streamFile.relativeFilePath}`);
const readStream = await streamFile.createReadStream();
// Handle individual file stream
});
streamOfFiles.on('end', () => {
console.log('Extraction complete');
});
```
## Core Concepts 💡
@@ -64,17 +72,18 @@ streamOfFiles.on('data', (fileStream) => {
`SmartArchive` accepts archives from three sources:
| Source | Method | Use Case |
|--------|--------|----------|
| **URL** | `SmartArchive.fromArchiveUrl(url)` | Download and process archives from the web |
| **File** | `SmartArchive.fromArchiveFile(path)` | Load archives from the local filesystem |
| **Stream** | `SmartArchive.fromArchiveStream(stream)` | Process archives from any Node.js stream |
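All three factory methods return a ready-to-use `SmartArchive` instance. A minimal sketch (the paths and URL here are placeholders):
```typescript
import { createReadStream } from 'fs';
import { SmartArchive } from '@push.rocks/smartarchive';

// From a URL (downloaded on demand)
const fromUrl = await SmartArchive.fromArchiveUrl('https://example.com/release.tar.gz');

// From a local file
const fromFile = await SmartArchive.fromArchiveFile('./backup.zip');

// From any readable stream
const fromStream = await SmartArchive.fromArchiveStream(createReadStream('./data.tar.gz'));
```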
### Export Destinations
Extract archives to multiple destinations:
| Destination | Method | Use Case |
|-------------|--------|----------|
| **Filesystem** | `exportToFs(targetDir, fileName?)` | Extract directly to a directory |
| **Stream of files** | `exportToStreamOfStreamFiles()` | Process files individually as `StreamFile` objects |
## Usage Examples 🔨
@@ -89,10 +98,11 @@ await zipArchive.exportToFs('./output');
// Stream ZIP contents for processing
const fileStream = await zipArchive.exportToStreamOfStreamFiles();
fileStream.on('data', async (streamFile) => {
if (streamFile.relativeFilePath.endsWith('.json')) {
const readStream = await streamFile.createReadStream();
// Process JSON files from the archive
}
});
```
@@ -106,10 +116,38 @@ import { SmartArchive, TarTools } from '@push.rocks/smartarchive';
const tarGzArchive = await SmartArchive.fromArchiveFile('./archive.tar.gz');
await tarGzArchive.exportToFs('./extracted');
// Create a TAR archive using TarTools directly
const tarTools = new TarTools();
const pack = await tarTools.getPackStream();
// Add files to the pack
await tarTools.addFileToPack(pack, {
fileName: 'hello.txt',
content: 'Hello, World!'
});
await tarTools.addFileToPack(pack, {
fileName: 'data.json',
content: Buffer.from(JSON.stringify({ foo: 'bar' }))
});
// Finalize and pipe to destination
pack.finalize();
pack.pipe(createWriteStream('./output.tar'));
```
### Pack a directory into TAR
```typescript
import { TarTools } from '@push.rocks/smartarchive';
import { createWriteStream } from 'fs';
const tarTools = new TarTools();
// Pack an entire directory
const pack = await tarTools.packDirectory('./src');
pack.finalize();
pack.pipe(createWriteStream('./source.tar'));
```
### Extracting from URLs
@@ -117,47 +155,36 @@ packStream.pipe(createWriteStream('./output.tar'));
```typescript
import { SmartArchive } from '@push.rocks/smartarchive';
// Download and extract npm packages
const npmPackage = await SmartArchive.fromArchiveUrl(
'https://registry.npmjs.org/@push.rocks/smartfile/-/smartfile-11.2.7.tgz'
);
await npmPackage.exportToFs('./node_modules/@push.rocks/smartfile');
// Or process as stream for memory efficiency
const stream = await npmPackage.exportToStreamOfStreamFiles();
stream.on('data', async (file) => {
console.log(`Extracted: ${file.relativeFilePath}`);
});
```
### Working with GZIP files
```typescript
import { SmartArchive, GzipTools } from '@push.rocks/smartarchive';
import { createReadStream, createWriteStream } from 'fs';
// Decompress a .gz file - provide filename since gzip doesn't store it
const gzipArchive = await SmartArchive.fromArchiveFile('./data.json.gz');
await gzipArchive.exportToFs('./decompressed', 'data.json');
// Use GzipTools directly for streaming decompression
const gzipTools = new GzipTools();
const decompressStream = gzipTools.getDecompressionStream();
createReadStream('./compressed.gz')
.pipe(decompressStream)
.pipe(createWriteStream('./decompressed.txt'));
```
### Working with BZIP2 files
@@ -172,115 +199,175 @@ const bzipArchive = await SmartArchive.fromArchiveUrl(
await bzipArchive.exportToFs('./extracted', 'data.txt');
```
### In-memory processing (no filesystem)
```typescript
import { SmartArchive } from '@push.rocks/smartarchive';
import { Readable } from 'stream';
// Process archives entirely in memory
const compressedBuffer = await fetchCompressedData();
const memoryStream = Readable.from(compressedBuffer);
const archive = await SmartArchive.fromArchiveStream(memoryStream);
const streamFiles = await archive.exportToStreamOfStreamFiles();
const extractedFiles: Array<{ name: string; content: Buffer }> = [];
streamFiles.on('data', async (streamFile) => {
const chunks: Buffer[] = [];
const readStream = await streamFile.createReadStream();
for await (const chunk of readStream) {
chunks.push(chunk);
}
extractedFiles.push({
name: streamFile.relativeFilePath,
content: Buffer.concat(chunks)
});
});
await new Promise((resolve) => streamFiles.on('end', resolve));
console.log(`Extracted ${extractedFiles.length} files in memory`);
```
### Nested archive handling (e.g., .tar.gz)
The library automatically handles nested compression. A `.tar.gz` file is:
1. First decompressed from gzip
2. Then unpacked from tar
This happens transparently:
```typescript
import { SmartArchive } from '@push.rocks/smartarchive';
// Automatically handles gzip → tar extraction chain
const tgzArchive = await SmartArchive.fromArchiveFile('./package.tar.gz');
await tgzArchive.exportToFs('./extracted');
```
## API Reference 📚
### SmartArchive Class
The main entry point for archive operations.
#### Static Factory Methods
```typescript
// Create from URL - downloads and processes archive
SmartArchive.fromArchiveUrl(url: string): Promise<SmartArchive>
// Create from local file path
SmartArchive.fromArchiveFile(path: string): Promise<SmartArchive>
// Create from any Node.js readable stream
SmartArchive.fromArchiveStream(stream: Readable | Duplex | Transform): Promise<SmartArchive>
```
#### Instance Methods
```typescript
// Extract all files to a directory
// fileName is optional - used for single-file archives (like .gz) that don't store filename
exportToFs(targetDir: string, fileName?: string): Promise<void>
// Get a stream that emits StreamFile objects for each file in the archive
exportToStreamOfStreamFiles(): Promise<StreamIntake<StreamFile>>
// Get the raw archive stream (useful for piping)
getArchiveStream(): Promise<Readable>
```
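For example, `getArchiveStream()` lets you persist or forward a downloaded archive without unpacking it (a minimal sketch; the URL and output path are placeholders):
```typescript
import { createWriteStream } from 'fs';
import { SmartArchive } from '@push.rocks/smartarchive';

const archive = await SmartArchive.fromArchiveUrl('https://example.com/release.tar.gz');
const rawStream = await archive.getArchiveStream();

// Save the raw archive bytes as-is
rawStream.pipe(createWriteStream('./release.tar.gz'));
```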
#### Instance Properties
Each tool class provides format-specific operations:
```typescript
archive.tarTools // TarTools instance for TAR-specific operations
archive.zipTools // ZipTools instance for ZIP-specific operations
archive.gzipTools // GzipTools instance for GZIP-specific operations
archive.bzip2Tools // Bzip2Tools instance for BZIP2-specific operations
archive.archiveAnalyzer // ArchiveAnalyzer for inspecting archive type
```
### TarTools Class
TAR-specific operations for creating and extracting TAR archives.
```typescript
import { TarTools } from '@push.rocks/smartarchive';
const tarTools = new TarTools();
// Get a tar pack stream for creating archives
const pack = await tarTools.getPackStream();
// Add files to a pack stream
await tarTools.addFileToPack(pack, {
fileName: 'file.txt', // Name in archive
content: 'Hello World', // String, Buffer, Readable, SmartFile, or StreamFile
byteLength?: number, // Optional: specify size for streams
filePath?: string // Optional: path to file on disk
});
// Pack an entire directory
const pack = await tarTools.packDirectory('./src');
// Get extraction stream
const extract = tarTools.getDecompressionStream();
```
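The pack stream composes with the gzip compression stream below, so a `.tar.gz` can plausibly be produced by chaining the two. This is a sketch assembled from the APIs shown in this README, not an officially documented recipe:
```typescript
import { createWriteStream } from 'fs';
import { TarTools, GzipTools } from '@push.rocks/smartarchive';

const tarTools = new TarTools();
const gzipTools = new GzipTools();

// Pack a directory into a tar stream, then gzip it on the way to disk
const pack = await tarTools.packDirectory('./src');
pack.finalize();
pack
  .pipe(gzipTools.getCompressionStream())
  .pipe(createWriteStream('./source.tar.gz'));
```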
### ZipTools Class
ZIP-specific operations.
```typescript
import { ZipTools } from '@push.rocks/smartarchive';
const zipTools = new ZipTools();
// Get compression stream (for creating ZIP)
const compressor = zipTools.getCompressionStream();
// Get decompression stream (for extracting ZIP)
const decompressor = zipTools.getDecompressionStream();
```
### GzipTools Class
GZIP compression/decompression streams.
```typescript
import { GzipTools } from '@push.rocks/smartarchive';
const gzipTools = new GzipTools();
// Get compression stream
const compressor = gzipTools.getCompressionStream();
// Get decompression stream
const decompressor = gzipTools.getDecompressionStream();
```
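The compression direction mirrors the decompression example shown earlier (file names are placeholders):
```typescript
import { createReadStream, createWriteStream } from 'fs';
import { GzipTools } from '@push.rocks/smartarchive';

const gzipTools = new GzipTools();

// Compress a file to .gz
createReadStream('./report.txt')
  .pipe(gzipTools.getCompressionStream())
  .pipe(createWriteStream('./report.txt.gz'));
```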
## Supported Formats 📋
| Format | Extension(s) | Extract | Create |
|--------|--------------|---------|--------|
| TAR | `.tar` | ✅ | ✅ |
| TAR.GZ / TGZ | `.tar.gz`, `.tgz` | ✅ | ⚠️ |
| ZIP | `.zip` | ✅ | ⚠️ |
| GZIP | `.gz` | ✅ | ✅ |
| BZIP2 | `.bz2` | ✅ | ❌ |
✅ Full support | ⚠️ Partial/basic support | ❌ Not supported
## Performance Tips 🏎️
1. **Use streaming for large files** - Avoid loading entire archives into memory with `exportToStreamOfStreamFiles()`
2. **Provide byte lengths when known** - When adding streams to TAR, pass `byteLength` for better performance (see the sketch after this list)
3. **Process files as they stream** - Don't collect all files into an array unless necessary
4. **Choose the right format** - TAR.GZ for Unix/compression, ZIP for cross-platform compatibility
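A sketch of tip 2, using the `addFileToPack` options from the TarTools reference above (the file name and path are placeholders):
```typescript
import { createReadStream, statSync } from 'fs';
import { TarTools } from '@push.rocks/smartarchive';

const tarTools = new TarTools();
const pack = await tarTools.getPackStream();

// Knowing the size up front means the tar header can be written
// without buffering the whole stream first
const { size } = statSync('./big-file.bin');
await tarTools.addFileToPack(pack, {
  fileName: 'big-file.bin',
  content: createReadStream('./big-file.bin'),
  byteLength: size
});
pack.finalize();
```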
## Error Handling 🛡️
@@ -295,6 +382,8 @@ try {
console.error('Archive file not found');
} else if (error.code === 'EACCES') {
console.error('Permission denied');
} else if (error.message.includes('fetch')) {
console.error('Network error downloading archive');
} else {
console.error('Archive extraction failed:', error.message);
}
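For reference, the full pattern wrapped around an extraction call (the URL is a placeholder):
```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

try {
  const archive = await SmartArchive.fromArchiveUrl('https://example.com/data.tar.gz');
  await archive.exportToFs('./extracted');
} catch (error) {
  if (error.code === 'ENOENT') {
    console.error('Archive file not found');
  } else if (error.code === 'EACCES') {
    console.error('Permission denied');
  } else if (error.message.includes('fetch')) {
    console.error('Network error downloading archive');
  } else {
    console.error('Archive extraction failed:', error.message);
  }
}
```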
@@ -303,35 +392,57 @@ try {
## Real-World Use Cases 🌍
### CI/CD: Download & Extract Build Artifacts
```typescript
// Download and extract build artifacts
const artifacts = await SmartArchive.fromArchiveUrl(
`${CI_SERVER}/artifacts/build-${BUILD_ID}.zip`
);
await artifacts.exportToFs('./dist');
```
### Backup System: Restore from Archive
```typescript
const backup = await SmartArchive.fromArchiveFile('./backup-2024.tar.gz');
await backup.exportToFs('/restore/location');
```
### NPM Package Inspection
```typescript
const pkg = await SmartArchive.fromArchiveUrl(
'https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz'
);
const files = await pkg.exportToStreamOfStreamFiles();
files.on('data', async (file) => {
if (file.relativeFilePath.includes('package.json')) {
const stream = await file.createReadStream();
// Read and analyze package.json
}
});
```
### Data Pipeline: Process Compressed Datasets
```typescript
const dataset = await SmartArchive.fromArchiveUrl(
'https://data.source/dataset.tar.gz'
);
const files = await dataset.exportToStreamOfStreamFiles();
// Process each file in the dataset
files.on('data', async (file) => {
if (file.relativeFilePath.endsWith('.csv')) {
const stream = await file.createReadStream();
// Stream CSV processing
}
});
```
## License and Legal Information
This repository contains open-source code that is licensed under the MIT License. A copy of the MIT License can be found in the [license](license) file within this repository.
**Please note:** The MIT License does not grant permission to use the trade names, trademarks, service marks, or product names of the project, except as required for reasonable and customary use in describing the origin of the work and reproducing the content of the NOTICE file.
@@ -341,9 +452,9 @@ This project is owned and maintained by Task Venture Capital GmbH. The names and
### Company Information
Task Venture Capital GmbH
Registered at District court Bremen HRB 35230 HB, Germany
For any legal inquiries or if you require further information, please contact us via email at hello@task.vc.
By using this repository, you acknowledge that you have read this section, agree to comply with its terms, and understand that the licensing of the code does not imply endorsement by Task Venture Capital GmbH of any derivative works.