feat(classes.smartarchive): Support URL streams, recursive archive unpacking and filesystem export; improve ZIP/GZIP/BZIP2 robustness; CI and package metadata updates
# @push.rocks/smartarchive 📦

**Powerful archive manipulation for modern Node.js applications**

`@push.rocks/smartarchive` is a versatile library for handling archive files with a focus on developer experience. Work with **zip**, **tar**, **gzip**, and **bzip2** formats through a unified, streaming-optimized API.

## Features 🚀
- 📁 **Multi-format support** - Handle `.zip`, `.tar`, `.tar.gz`, `.tgz`, and `.bz2` archives
- 🌊 **Streaming-first architecture** - Process large archives without memory constraints
- 🔄 **Unified API** - Consistent interface across different archive formats
- 🎯 **Smart detection** - Automatically identifies archive types
- ⚡ **High performance** - Optimized for speed with parallel processing where possible
- 🔧 **Flexible I/O** - Work with files, URLs, and streams seamlessly
- 📊 **Archive analysis** - Inspect contents without extraction
- 🛠️ **Modern TypeScript** - Full type safety and excellent IDE support

## Installation 📥

```bash
# Using npm
npm install @push.rocks/smartarchive

# Using pnpm (recommended)
pnpm add @push.rocks/smartarchive

# Using yarn
yarn add @push.rocks/smartarchive
```

## Quick Start 🎯
### Extract an archive from URL

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Extract a .tar.gz archive from a URL directly to the filesystem
const archive = await SmartArchive.fromArchiveUrl(
  'https://github.com/some/repo/archive/main.tar.gz'
);
await archive.exportToFs('./extracted');
```

### Process archive as a stream

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Stream-based processing for memory efficiency
const archive = await SmartArchive.fromArchiveFile('./large-archive.zip');
const streamOfFiles = await archive.exportToStreamOfStreamFiles();

// Process each file in the archive
streamOfFiles.on('data', (fileStream) => {
  console.log(`Processing ${fileStream.path}`);
  // Handle individual file stream
});
```

## Core Concepts 💡
### Archive Sources

`SmartArchive` accepts archives from three sources:

1. **URL** - Download and process archives from the web
2. **File** - Load archives from the local filesystem
3. **Stream** - Process archives from any Node.js stream

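Whichever source you use, the bytes ultimately arrive as a stream, which is what makes the automatic format detection mentioned in Features possible: the library can sniff the leading magic bytes. A simplified, hypothetical sketch of that idea — `sniffArchiveType` is illustrative and not part of smartarchive's API:

```typescript
// Illustrative only -- not smartarchive's actual implementation.
// Common archive formats announce themselves in their first bytes;
// real detectors also check TAR's "ustar" marker at byte offset 257.
function sniffArchiveType(header: Buffer): 'zip' | 'gzip' | 'bzip2' | 'unknown' {
  if (
    header.length >= 4 &&
    header[0] === 0x50 && header[1] === 0x4b && // "PK"
    header[2] === 0x03 && header[3] === 0x04
  ) {
    return 'zip'; // ZIP local file header
  }
  if (header.length >= 2 && header[0] === 0x1f && header[1] === 0x8b) {
    return 'gzip'; // gzip magic bytes
  }
  if (header.length >= 3 && header.toString('latin1', 0, 3) === 'BZh') {
    return 'bzip2'; // bzip2 stream header
  }
  return 'unknown';
}

console.log(sniffArchiveType(Buffer.from([0x1f, 0x8b, 0x08, 0x00]))); // gzip
```

Because every source reduces to a byte stream, the same sniffing applies whether the bytes came from a URL, a file, or a stream.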
### Export Destinations

Extract archives to multiple destinations:

1. **Filesystem** - Extract directly to a directory
2. **Stream of files** - Process files individually as streams
3. **Archive stream** - Re-stream as different format

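Destination 2 is an object-mode stream whose chunks are themselves readable file streams. A stdlib-only mock of that shape (the real stream comes from `exportToStreamOfStreamFiles()`; `makeFileStream` here is a stand-in for illustration):

```typescript
import { Readable } from 'node:stream';

// Build a mock "stream of file streams": each chunk is itself a Readable
// carrying a path property, mirroring the shape of exportToStreamOfStreamFiles().
function makeFileStream(path: string, content: string): Readable & { path?: string } {
  const stream = Readable.from([content]) as Readable & { path?: string };
  stream.path = path;
  return stream;
}

const streamOfFiles = Readable.from([
  makeFileStream('a.txt', 'alpha'),
  makeFileStream('b.txt', 'beta'),
]);

// Consume each inner stream in turn.
const collected: Record<string, string> = {};
for await (const file of streamOfFiles) {
  let body = '';
  for await (const chunk of file) body += chunk;
  collected[file.path] = body;
}
console.log(collected); // { 'a.txt': 'alpha', 'b.txt': 'beta' }
```

The same consumption pattern applies to the real export stream, with each inner stream carrying the file's path and contents.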
## Usage Examples 🔨
### Working with ZIP files

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Extract a ZIP file
const zipArchive = await SmartArchive.fromArchiveFile('./archive.zip');
await zipArchive.exportToFs('./output');

// Stream ZIP contents for processing
const fileStream = await zipArchive.exportToStreamOfStreamFiles();
fileStream.on('data', (file) => {
  if (file.path.endsWith('.json')) {
    // `jsonProcessor` stands in for your own writable stream
    file.pipe(jsonProcessor);
  }
});
```

### Working with TAR archives

```typescript
import { SmartArchive, TarTools } from '@push.rocks/smartarchive';
import { createWriteStream } from 'fs';

// Extract a .tar.gz file
const tarGzArchive = await SmartArchive.fromArchiveFile('./archive.tar.gz');
await tarGzArchive.exportToFs('./extracted');

// Create a TAR archive (using TarTools directly)
const tarTools = new TarTools();
const packStream = await tarTools.packDirectory('./source-directory');
packStream.pipe(createWriteStream('./output.tar'));
```

### Extracting from URLs

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Download and extract in one operation
const remoteArchive = await SmartArchive.fromArchiveUrl(
  'https://example.com/data.tar.gz'
);

// Extract to filesystem
await remoteArchive.exportToFs('./local-dir');

// Or process as stream
const stream = await remoteArchive.exportToStreamOfStreamFiles();
```

### Analyzing archive contents

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Analyze without extracting
const archive = await SmartArchive.fromArchiveFile('./archive.zip');
const analyzer = archive.archiveAnalyzer;

// Use the analyzer to inspect contents
// (exact implementation depends on analyzer methods)
```

### Working with GZIP files

```typescript
import { SmartArchive, GzipTools } from '@push.rocks/smartarchive';
import { createReadStream, createWriteStream } from 'fs';

// Decompress a .gz file
const gzipArchive = await SmartArchive.fromArchiveFile('./data.json.gz');
await gzipArchive.exportToFs('./decompressed', 'data.json');

// Use GzipTools directly for streaming
const gzipTools = new GzipTools();
const decompressStream = gzipTools.getDecompressionStream();

createReadStream('./compressed.gz')
  .pipe(decompressStream)
  .pipe(createWriteStream('./decompressed'));
```

### Working with BZIP2 files

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

// Handle .bz2 files
const bzipArchive = await SmartArchive.fromArchiveUrl(
  'https://example.com/data.bz2'
);
await bzipArchive.exportToFs('./extracted', 'data.txt');
```

### Advanced streaming operations

```typescript
import { createWriteStream } from 'fs';
import { pipeline } from 'stream/promises';
import { SmartArchive } from '@push.rocks/smartarchive';

// Chain operations with streams
const archive = await SmartArchive.fromArchiveFile('./archive.tar.gz');
const exportStream = await archive.exportToStreamOfStreamFiles();

// Process each file in the archive
await pipeline(
  exportStream,
  async function* (source) {
    for await (const file of source) {
      if (file.path.endsWith('.log')) {
        // `processLogFile` stands in for your own transform
        yield processLogFile(file);
      }
    }
  },
  createWriteStream('./processed-logs.txt')
);
```

### Creating archives (advanced)

```typescript
import { createWriteStream } from 'fs';
import { SmartArchive } from '@push.rocks/smartarchive';

// Using SmartArchive to create an archive
const archive = new SmartArchive();

// Add content to the archive
archive.addedDirectories.push('./src');
archive.addedFiles.push('./readme.md');
archive.addedFiles.push('./package.json');

// Export as TAR.GZ
const tarGzStream = await archive.exportToTarGzStream();
tarGzStream.pipe(createWriteStream('./output.tar.gz'));
```

### Extract and transform

```typescript
import { createWriteStream } from 'fs';
import { SmartArchive } from '@push.rocks/smartarchive';

// Extract and transform files in one pipeline
const archive = await SmartArchive.fromArchiveUrl(
  'https://example.com/source-code.tar.gz'
);

const extractStream = await archive.exportToStreamOfStreamFiles();

// Transform TypeScript to JavaScript during extraction
// (`typescriptTranspiler` stands in for your own transform stream)
extractStream.on('data', (fileStream) => {
  if (fileStream.path.endsWith('.ts')) {
    fileStream
      .pipe(typescriptTranspiler())
      .pipe(createWriteStream(fileStream.path.replace('.ts', '.js')));
  } else {
    fileStream.pipe(createWriteStream(fileStream.path));
  }
});
```

## API Reference 📚

### SmartArchive Class

#### Static Methods

- `SmartArchive.fromArchiveUrl(url: string)` - Create from URL
- `SmartArchive.fromArchiveFile(path: string)` - Create from file
- `SmartArchive.fromArchiveStream(stream: NodeJS.ReadableStream)` - Create from stream

#### Instance Methods

- `exportToFs(targetDir: string, fileName?: string)` - Extract to filesystem
- `exportToStreamOfStreamFiles()` - Get a stream of file streams
- `exportToTarGzStream()` - Export as TAR.GZ stream
- `getArchiveStream()` - Get the raw archive stream

#### Properties

- `archiveAnalyzer` - Analyze archive contents
- `tarTools` - TAR-specific operations
- `zipTools` - ZIP-specific operations
- `gzipTools` - GZIP-specific operations
- `bzip2Tools` - BZIP2-specific operations

### Specialized Tools

Each tool class provides format-specific operations:

- **TarTools** - Pack/unpack TAR archives
- **ZipTools** - Handle ZIP compression
- **GzipTools** - GZIP compression/decompression
- **Bzip2Tools** - BZIP2 operations

For more information and API references, check the official [`@push.rocks/smartarchive` repository](https://code.foss.global/push.rocks/smartarchive).

## Performance Tips 🏎️

1. **Use streaming for large files** - Avoid loading entire archives into memory
2. **Process files in parallel** - Utilize stream operations for concurrent processing
3. **Choose the right format** - TAR.GZ for Unix systems, ZIP for cross-platform compatibility
4. **Enable compression wisely** - Balance between file size and CPU usage

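Tip 1 in practice: with `stream.pipeline`, only one chunk is in flight at a time, so peak memory stays at roughly one chunk no matter how large the payload. A stdlib sketch, with a generator standing in for a large archive:

```typescript
import { Readable, Writable, Transform } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// A generator stands in for a large archive: 64 chunks of 1 KiB each.
function* chunkSource(chunkCount: number, chunkSize: number) {
  for (let i = 0; i < chunkCount; i++) {
    yield Buffer.alloc(chunkSize, i % 256);
  }
}

// Count bytes as they pass through; only the current chunk is in memory.
let totalBytes = 0;
const byteCounter = new Transform({
  transform(chunk: Buffer, _encoding, callback) {
    totalBytes += chunk.length;
    callback(null, chunk);
  },
});

// A do-nothing sink; a real pipeline would write to disk or decompress here.
const devNull = new Writable({
  write(_chunk, _encoding, callback) {
    callback();
  },
});

await pipeline(Readable.from(chunkSource(64, 1024)), byteCounter, devNull);
console.log(totalBytes); // 65536
```

Swap the generator for an archive stream and the sink for a write stream and the memory profile stays the same.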
## Error Handling 🛡️

```typescript
import { SmartArchive } from '@push.rocks/smartarchive';

try {
  const archive = await SmartArchive.fromArchiveUrl('https://example.com/file.zip');
  await archive.exportToFs('./output');
} catch (error) {
  if (error.code === 'ENOENT') {
    console.error('Archive file not found');
  } else if (error.code === 'EACCES') {
    console.error('Permission denied');
  } else {
    console.error('Archive extraction failed:', error.message);
  }
}
```

## Real-World Use Cases 🌍

### Backup System

```typescript
// Automated backup extraction
const backup = await SmartArchive.fromArchiveFile('./backup.tar.gz');
await backup.exportToFs('/restore/location');
```

### CI/CD Pipeline

```typescript
// Download and extract build artifacts
const artifacts = await SmartArchive.fromArchiveUrl(
  `${CI_SERVER}/artifacts/build-${BUILD_ID}.zip`
);
await artifacts.exportToFs('./dist');
```

### Data Processing

```typescript
// Process compressed datasets
const dataset = await SmartArchive.fromArchiveUrl(
  'https://data.source/dataset.tar.bz2'
);
const files = await dataset.exportToStreamOfStreamFiles();
// Process each file in the dataset
```

## License and Legal Information

Registered at District court Bremen HRB 35230 HB, Germany

For any legal inquiries or if you require further information, please contact us via email at hello@task.vc.

By using this repository, you acknowledge that you have read this section, agree to comply with its terms, and understand that the licensing of the code does not imply endorsement by Task Venture Capital GmbH of any derivative works.