# @push.rocks/smartarchive 📦

**Powerful archive manipulation for modern Node.js applications**

`@push.rocks/smartarchive` is a versatile library for handling archive files with a focus on developer experience. Work with **zip**, **tar**, **gzip**, and **bzip2** formats through a unified, streaming-optimized API.

## Features 🚀

- 📁 **Multi-format support** - Handle `.zip`, `.tar`, `.tar.gz`, `.tgz`, and `.bz2` archives
- 🌊 **Streaming-first architecture** - Process large archives without memory constraints
- 🔄 **Unified API** - Consistent interface across different archive formats (see the example after this list)
- 🎯 **Smart detection** - Automatically identifies archive types
- ⚡ **High performance** - Optimized for speed with parallel processing where possible
- 🔧 **Flexible I/O** - Work with files, URLs, and streams seamlessly
- 📊 **Archive analysis** - Inspect contents without extraction
- 🛠️ **Modern TypeScript** - Full type safety and excellent IDE support
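
Because the interface is the same for every supported container format, switching formats does not change the calling code. A small illustration using only the documented `fromArchiveFile` and `exportToFs` calls (file names and target directories are placeholders):

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// The same two calls work for .zip, .tar.gz, and .bz2 sources alike
for (const source of ['./backup.zip', './logs.tar.gz', './dump.bz2']) {
  const archive = await SmartArchive.fromArchiveFile(source);
  await archive.exportToFs(`./extracted/${source.replace(/[^a-z0-9]+/gi, '-')}`);
}
```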

## Installation 📥

```bash
# Using npm
npm install @push.rocks/smartarchive

# Using pnpm (recommended)
pnpm add @push.rocks/smartarchive

# Using yarn
yarn add @push.rocks/smartarchive
```

## Quick Start 🎯

### Extract an archive from a URL

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Extract a .tar.gz archive from a URL directly to the filesystem
const archive = await SmartArchive.fromArchiveUrl(
  'https://github.com/some/repo/archive/main.tar.gz'
);
await archive.exportToFs('./extracted');
```

### Process archive as a stream

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Stream-based processing for memory efficiency
const archive = await SmartArchive.fromArchiveFile('./large-archive.zip');
const streamOfFiles = await archive.exportToStreamOfStreamFiles();

// Process each file in the archive
streamOfFiles.on('data', (fileStream) => {
  console.log(`Processing ${fileStream.path}`);
  // Handle individual file stream
});
```

## Core Concepts 💡

### Archive Sources

`SmartArchive` accepts archives from three sources:

1. **URL** - Download and process archives from the web
2. **File** - Load archives from the local filesystem
3. **Stream** - Process archives from any Node.js stream (see the sketch below)
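
Working from a stream is useful when the archive never touches disk, for example when it arrives over HTTP or from another pipeline stage. A minimal sketch, assuming `fromArchiveStream` accepts any Node.js readable stream (the file path and target directory are placeholders):

```typescript
import { createReadStream } from 'fs';
import { SmartArchive } from '@push.rocks/smartarchive';

// Wrap an existing readable stream (here a local file, but any source works)
const archive = await SmartArchive.fromArchiveStream(
  createReadStream('./incoming-upload.tar.gz')
);
await archive.exportToFs('./unpacked-upload');
```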

### Export Destinations

Extract archives to multiple destinations (sketched below):

1. **Filesystem** - Extract directly to a directory
2. **Stream of files** - Process files individually as streams
3. **Archive stream** - Re-stream as a different format
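
These destinations correspond to `exportToFs`, `exportToStreamOfStreamFiles`, and `exportToTarGzStream` (see the API reference below). A minimal sketch, using a fresh instance per export since an export may consume the underlying source stream:

```typescript
import { createWriteStream } from 'fs';
import { SmartArchive } from '@push.rocks/smartarchive';

// 1. Filesystem: unpack into a directory
const toDisk = await SmartArchive.fromArchiveFile('./archive.zip');
await toDisk.exportToFs('./unpacked');

// 2. Stream of files: handle each entry as its own stream
const toEntries = await SmartArchive.fromArchiveFile('./archive.zip');
const entries = await toEntries.exportToStreamOfStreamFiles();
entries.on('data', (entry) => entry.resume()); // replace .resume() with real handling

// 3. Archive stream: re-pack the contents as .tar.gz
const toTarGz = await SmartArchive.fromArchiveFile('./archive.zip');
const tarGzStream = await toTarGz.exportToTarGzStream();
tarGzStream.pipe(createWriteStream('./repacked.tar.gz'));
```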

## Usage Examples 🔨

### Working with ZIP files

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Extract a ZIP file
const zipArchive = await SmartArchive.fromArchiveFile('./archive.zip');
await zipArchive.exportToFs('./output');

// Stream ZIP contents for processing
const fileStream = await zipArchive.exportToStreamOfStreamFiles();
fileStream.on('data', (file) => {
  if (file.path.endsWith('.json')) {
    // Process JSON files from the archive
    file.pipe(jsonProcessor);
  }
});
```

### Working with TAR archives

```typescript
import { createWriteStream } from 'fs';
import { SmartArchive, TarTools } from '@push.rocks/smartarchive';

// Extract a .tar.gz file
const tarGzArchive = await SmartArchive.fromArchiveFile('./archive.tar.gz');
await tarGzArchive.exportToFs('./extracted');

// Create a TAR archive (using TarTools directly)
const tarTools = new TarTools();
const packStream = await tarTools.packDirectory('./source-directory');
packStream.pipe(createWriteStream('./output.tar'));
```

### Extracting from URLs

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Download and extract in one operation
const remoteArchive = await SmartArchive.fromArchiveUrl(
  'https://example.com/data.tar.gz'
);

// Extract to filesystem
await remoteArchive.exportToFs('./local-dir');

// Or process as stream
const stream = await remoteArchive.exportToStreamOfStreamFiles();
```

### Analyzing archive contents

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Analyze without extracting
const archive = await SmartArchive.fromArchiveFile('./archive.zip');
const analyzer = archive.archiveAnalyzer;

// Use the analyzer to inspect contents
// (exact implementation depends on analyzer methods)
```

### Working with GZIP files

```typescript
import { createReadStream, createWriteStream } from 'fs';
import { SmartArchive, GzipTools } from '@push.rocks/smartarchive';

// Decompress a .gz file
const gzipArchive = await SmartArchive.fromArchiveFile('./data.json.gz');
await gzipArchive.exportToFs('./decompressed', 'data.json');

// Use GzipTools directly for streaming
const gzipTools = new GzipTools();
const decompressStream = gzipTools.getDecompressionStream();

createReadStream('./compressed.gz')
  .pipe(decompressStream)
  .pipe(createWriteStream('./decompressed'));
```

### Working with BZIP2 files

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Handle .bz2 files
const bzipArchive = await SmartArchive.fromArchiveUrl(
  'https://example.com/data.bz2'
);
await bzipArchive.exportToFs('./extracted', 'data.txt');
```

### Advanced streaming operations

```typescript
import { createWriteStream } from 'fs';
import { pipeline } from 'stream/promises';
import { SmartArchive } from '@push.rocks/smartarchive';

// Chain operations with streams
const archive = await SmartArchive.fromArchiveFile('./archive.tar.gz');
const exportStream = await archive.exportToStreamOfStreamFiles();

// Process each file in the archive
await pipeline(
  exportStream,
  async function* (source) {
    for await (const file of source) {
      if (file.path.endsWith('.log')) {
        // Process log files
        yield processLogFile(file);
      }
    }
  },
  createWriteStream('./processed-logs.txt')
);
```

### Creating archives (advanced)

```typescript
import { createWriteStream } from 'fs';
import { SmartArchive } from '@push.rocks/smartarchive';

// Using SmartArchive to create an archive
const archive = new SmartArchive();

// Add content to the archive
archive.addedDirectories.push('./src');
archive.addedFiles.push('./readme.md');
archive.addedFiles.push('./package.json');

// Export as TAR.GZ
const tarGzStream = await archive.exportToTarGzStream();
tarGzStream.pipe(createWriteStream('./output.tar.gz'));
```

### Extract and transform

```typescript
import { createWriteStream } from 'fs';
import { SmartArchive } from '@push.rocks/smartarchive';

// Extract and transform files in one pipeline
const archive = await SmartArchive.fromArchiveUrl(
  'https://example.com/source-code.tar.gz'
);

const extractStream = await archive.exportToStreamOfStreamFiles();

// Transform TypeScript to JavaScript during extraction
extractStream.on('data', (fileStream) => {
  if (fileStream.path.endsWith('.ts')) {
    fileStream
      .pipe(typescriptTranspiler())
      .pipe(createWriteStream(fileStream.path.replace('.ts', '.js')));
  } else {
    fileStream.pipe(createWriteStream(fileStream.path));
  }
});
```

## API Reference 📚

### SmartArchive Class

#### Static Methods

- `SmartArchive.fromArchiveUrl(url: string)` - Create from URL
- `SmartArchive.fromArchiveFile(path: string)` - Create from file
- `SmartArchive.fromArchiveStream(stream: NodeJS.ReadableStream)` - Create from stream

#### Instance Methods

- `exportToFs(targetDir: string, fileName?: string)` - Extract to filesystem
- `exportToStreamOfStreamFiles()` - Get a stream of file streams
- `exportToTarGzStream()` - Export as TAR.GZ stream
- `getArchiveStream()` - Get the raw archive stream (see the sketch below)
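
`getArchiveStream()` is the one instance method not demonstrated elsewhere in this README. A minimal sketch, assuming it returns (or resolves to) a readable stream of the raw, untouched archive bytes; the `await` also covers the case where the method is async:

```typescript
import { createWriteStream } from 'fs';
import { SmartArchive } from '@push.rocks/smartarchive';

// Keep an unmodified local copy of a remote archive
const archive = await SmartArchive.fromArchiveUrl('https://example.com/data.tar.gz');
const rawStream = await archive.getArchiveStream();
rawStream.pipe(createWriteStream('./data.tar.gz'));
```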

#### Properties

- `archiveAnalyzer` - Analyze archive contents
- `tarTools` - TAR-specific operations
- `zipTools` - ZIP-specific operations
- `gzipTools` - GZIP-specific operations
- `bzip2Tools` - BZIP2-specific operations

### Specialized Tools

Each tool class provides format-specific operations (a combined sketch follows this list):

- **TarTools** - Pack/unpack TAR archives
- **ZipTools** - Handle ZIP compression
- **GzipTools** - GZIP compression/decompression
- **Bzip2Tools** - BZIP2 operations
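
The tools can be instantiated standalone or reached through the matching properties of a `SmartArchive` instance. A minimal sketch, assuming the instance properties expose the same methods as the standalone classes shown earlier (`packDirectory`, `getDecompressionStream`); all paths are placeholders:

```typescript
import { createReadStream, createWriteStream } from 'fs';
import { SmartArchive, TarTools } from '@push.rocks/smartarchive';

// Standalone: pack a directory into a TAR stream
const tarTools = new TarTools();
const tarStream = await tarTools.packDirectory('./source-directory');
tarStream.pipe(createWriteStream('./source.tar'));

// Via an instance: reuse the gzip tools attached to an existing archive
const archive = await SmartArchive.fromArchiveFile('./bundle.tar.gz');
createReadStream('./other-file.gz')
  .pipe(archive.gzipTools.getDecompressionStream())
  .pipe(createWriteStream('./other-file'));
```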

## Performance Tips 🏎️

1. **Use streaming for large files** - Avoid loading entire archives into memory
2. **Process files in parallel** - Utilize stream operations for concurrent processing (see the sketch below)
3. **Choose the right format** - TAR.GZ for Unix systems, ZIP for cross-platform compatibility
4. **Enable compression wisely** - Balance between file size and CPU usage
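
One way to process entries concurrently while keeping memory bounded is to iterate the exported file stream and cap the number of in-flight handlers. A minimal sketch, assuming each emitted file is an async-iterable readable stream with a `path` property and that `handleFile` stands in for your own per-file work:

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

async function handleFile(file: NodeJS.ReadableStream & { path?: string }) {
  // Placeholder: parse, upload, or write the file somewhere useful
  for await (const chunk of file) {
    void chunk; // drain the stream
  }
}

const archive = await SmartArchive.fromArchiveFile('./large-archive.tar.gz');
const files = await archive.exportToStreamOfStreamFiles();

const inFlight = new Set<Promise<void>>();
const concurrency = 4;

for await (const file of files) {
  const task: Promise<void> = handleFile(file).finally(() => inFlight.delete(task));
  inFlight.add(task);
  if (inFlight.size >= concurrency) {
    await Promise.race(inFlight); // wait until at least one slot frees up
  }
}
await Promise.all(inFlight); // wait for the remaining handlers
```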

## Error Handling 🛡️

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

try {
  const archive = await SmartArchive.fromArchiveUrl('https://example.com/file.zip');
  await archive.exportToFs('./output');
} catch (error) {
  if (error.code === 'ENOENT') {
    console.error('Archive file not found');
  } else if (error.code === 'EACCES') {
    console.error('Permission denied');
  } else {
    console.error('Archive extraction failed:', error.message);
  }
}
```

## Real-World Use Cases 🌍

### Backup System

```typescript
// Automated backup extraction
const backup = await SmartArchive.fromArchiveFile('./backup.tar.gz');
await backup.exportToFs('/restore/location');
```

### CI/CD Pipeline

```typescript
// Download and extract build artifacts
const artifacts = await SmartArchive.fromArchiveUrl(
  `${CI_SERVER}/artifacts/build-${BUILD_ID}.zip`
);
await artifacts.exportToFs('./dist');
```

### Data Processing

```typescript
// Process compressed datasets
const dataset = await SmartArchive.fromArchiveUrl(
  'https://data.source/dataset.tar.bz2'
);
const files = await dataset.exportToStreamOfStreamFiles();
// Process each file in the dataset
```

## License and Legal Information

This repository contains open-source code that is licensed under the MIT License. A copy of the MIT License can be found in the [license](license) file within this repository.

**Please note:** The MIT License does not grant permission to use the trade names, trademarks, service marks, or product names of the project, except as required for reasonable and customary use in describing the origin of the work and reproducing the content of the NOTICE file.

### Trademarks

This project is owned and maintained by Task Venture Capital GmbH. The names and logos associated with Task Venture Capital GmbH and any related products or services are trademarks of Task Venture Capital GmbH and are not included within the scope of the MIT license granted herein. Use of these trademarks must comply with Task Venture Capital GmbH's Trademark Guidelines, and any usage must be approved in writing by Task Venture Capital GmbH.

### Company Information

Task Venture Capital GmbH
Registered at District court Bremen HRB 35230 HB, Germany

For any legal inquiries or if you require further information, please contact us via email at hello@task.vc.

By using this repository, you acknowledge that you have read this section, agree to comply with its terms, and understand that the licensing of the code does not imply endorsement by Task Venture Capital GmbH of any derivative works.