4 Commits

13 changed files with 9564 additions and 1675 deletions


@@ -1,5 +1,23 @@
 # Changelog
+## 2025-11-25 - 4.2.4 - fix(plugins)
+Migrate filesystem usage to Node fs/fsPromises and upgrade smartfile to v13; add listFileTree helper and update tests
+- Bumped dependency @push.rocks/smartfile to ^13.0.0 and removed unused dependency `through`
+- Replaced usages of smartfile.fs and smartfile.fsStream with Node native fs and fs/promises (createReadStream/createWriteStream, mkdir({recursive:true}), stat, readFile)
+- Added plugins.listFileTree helper (recursive directory lister) and used it in TarTools.packDirectory and tests
+- Updated SmartArchive.exportToFs to use plugins.fs and plugins.fsPromises for directory creation and file writes
+- Updated TarTools to use plugins.fs.createReadStream and plugins.fsPromises.stat when packing directories
+- Converted/updated tests to a Node/Deno-friendly test file (test.node+deno.ts) and switched test helpers to use fsPromises
+- Added readme.hints.md with migration notes for Smartfile v13 and architecture/dependency notes
+## 2025-11-25 - 4.2.3 - fix(build)
+Upgrade dev tooling: bump @git.zone/tsbuild, @git.zone/tsrun and @git.zone/tstest versions
+- Bump @git.zone/tsbuild from ^2.6.6 to ^3.1.0
+- Bump @git.zone/tsrun from ^1.3.3 to ^2.0.0
+- Bump @git.zone/tstest from ^2.3.4 to ^3.1.3
 ## 2025-08-18 - 4.2.2 - fix(smartarchive)
 Improve tar entry streaming handling and add in-memory gzip/tgz tests

6945
deno.lock generated Normal file

File diff suppressed because it is too large


@@ -1,6 +1,6 @@
 {
   "name": "@push.rocks/smartarchive",
-  "version": "4.2.2",
+  "version": "4.2.4",
   "description": "A library for working with archive files, providing utilities for compressing and decompressing data.",
   "main": "dist_ts/index.js",
   "typings": "dist_ts/index.d.ts",
@@ -22,7 +22,7 @@
   "homepage": "https://code.foss.global/push.rocks/smartarchive#readme",
   "dependencies": {
     "@push.rocks/smartdelay": "^3.0.5",
-    "@push.rocks/smartfile": "^11.2.7",
+    "@push.rocks/smartfile": "^13.0.0",
     "@push.rocks/smartpath": "^6.0.0",
     "@push.rocks/smartpromise": "^4.2.3",
     "@push.rocks/smartrequest": "^4.2.2",
@@ -33,13 +33,12 @@
     "@types/tar-stream": "^3.1.4",
     "fflate": "^0.8.2",
     "file-type": "^21.0.0",
-    "tar-stream": "^3.1.7",
-    "through": "^2.3.8"
+    "tar-stream": "^3.1.7"
   },
   "devDependencies": {
-    "@git.zone/tsbuild": "^2.6.6",
-    "@git.zone/tsrun": "^1.3.3",
-    "@git.zone/tstest": "^2.3.4"
+    "@git.zone/tsbuild": "^3.1.0",
+    "@git.zone/tsrun": "^2.0.0",
+    "@git.zone/tstest": "^3.1.3"
   },
   "private": false,
   "files": [

3586
pnpm-lock.yaml generated

File diff suppressed because it is too large


@@ -1 +1,38 @@
# Smartarchive Development Hints
## Dependency Upgrades (2025-11-25)
### Completed Upgrades
- **@git.zone/tsbuild**: ^2.6.6 → ^3.1.0
- **@git.zone/tsrun**: ^1.3.3 → ^2.0.0
- **@git.zone/tstest**: ^2.3.4 → ^3.1.3
- **@push.rocks/smartfile**: ^11.2.7 → ^13.0.0
### Migration Notes
#### Smartfile v13 Migration
Smartfile v13 removed filesystem operations (`fs`, `memory`, `fsStream` namespaces). These were replaced with Node.js native `fs` and `fs/promises`:
**Replacements made:**
- `smartfile.fs.ensureDir(path)` → `fsPromises.mkdir(path, { recursive: true })`
- `smartfile.fs.stat(path)` → `fsPromises.stat(path)`
- `smartfile.fs.toReadStream(path)` → `fs.createReadStream(path)`
- `smartfile.fs.toStringSync(path)` → `fsPromises.readFile(path, 'utf8')`
- `smartfile.fs.listFileTree(dir, pattern)` → custom `listFileTree()` helper
- `smartfile.fsStream.createReadStream(path)` → `fs.createReadStream(path)`
- `smartfile.fsStream.createWriteStream(path)` → `fs.createWriteStream(path)`
- `smartfile.memory.toFs(content, path)` → `fsPromises.writeFile(path, content)`
**Still using from smartfile v13:**
- `SmartFile` class (in-memory file representation)
- `StreamFile` class (streaming file handling)
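
The replacements listed above can be exercised with Node built-ins alone. The following is a minimal sketch, not code from this repository: `writeAndReadBack` and `copyViaStreams` are hypothetical helper names chosen for illustration, and each call is annotated with the smartfile API it stands in for.

```typescript
import * as fs from 'node:fs';
import * as fsPromises from 'node:fs/promises';
import * as path from 'node:path';

// Hypothetical helper: write `content` to `targetDir/fileName`, then read it back.
export async function writeAndReadBack(
  targetDir: string,
  fileName: string,
  content: string,
): Promise<{ text: string; size: number }> {
  // was: smartfile.fs.ensureDir(targetDir)
  await fsPromises.mkdir(targetDir, { recursive: true });

  const filePath = path.join(targetDir, fileName);
  // was: smartfile.memory.toFs(content, filePath)
  await fsPromises.writeFile(filePath, content);

  // was: smartfile.fs.stat(filePath)
  const stats = await fsPromises.stat(filePath);

  // was: smartfile.fs.toStringSync(filePath)
  const text = await fsPromises.readFile(filePath, 'utf8');
  return { text, size: stats.size };
}

// Hypothetical helper: copy a file through streams.
// was: smartfile.fsStream.createReadStream / createWriteStream
export function copyViaStreams(source: string, destination: string): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const readStream = fs.createReadStream(source);
    const writeStream = fs.createWriteStream(destination);
    readStream.on('error', reject);
    writeStream.on('error', reject);
    writeStream.on('finish', () => resolve());
    readStream.pipe(writeStream);
  });
}
```

Note that `fsPromises.writeFile(path, content)` takes the path first, whereas `smartfile.memory.toFs(content, path)` took the content first, so the argument order flips during migration.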
### Removed Dependencies
- `through@2.3.8` - was unused in the codebase
## Architecture Notes
- Uses `fflate` for ZIP/GZIP compression (pure JS, works in browser)
- Uses `tar-stream` for TAR archive handling
- Uses `file-type` for MIME type detection
- Custom BZIP2 implementation in `ts/bzip2/` directory

411
readme.md

@@ -1,29 +1,32 @@
# @push.rocks/smartarchive 📦 # @push.rocks/smartarchive 📦
**Powerful archive manipulation for modern Node.js applications** Powerful archive manipulation for modern Node.js applications.
`@push.rocks/smartarchive` is a versatile library for handling archive files with a focus on developer experience. Work with **zip**, **tar**, **gzip**, and **bzip2** formats through a unified, streaming-optimized API. `@push.rocks/smartarchive` is a versatile library for handling archive files with a focus on developer experience. Work with **zip**, **tar**, **gzip**, and **bzip2** formats through a unified, streaming-optimized API.
## Issue Reporting and Security
For reporting bugs, issues, or security vulnerabilities, please visit [community.foss.global/](https://community.foss.global/). This is the central community hub for all issue reporting. Developers who sign and comply with our contribution agreement and go through identification can also get a [code.foss.global/](https://code.foss.global/) account to submit Pull Requests directly.
## Features 🚀 ## Features 🚀
- 📁 **Multi-format support** - Handle `.zip`, `.tar`, `.tar.gz`, `.tgz`, and `.bz2` archives - 📁 **Multi-format support** Handle `.zip`, `.tar`, `.tar.gz`, `.tgz`, and `.bz2` archives
- 🌊 **Streaming-first architecture** - Process large archives without memory constraints - 🌊 **Streaming-first architecture** Process large archives without memory constraints
- 🔄 **Unified API** - Consistent interface across different archive formats - 🔄 **Unified API** Consistent interface across different archive formats
- 🎯 **Smart detection** - Automatically identifies archive types - 🎯 **Smart detection** Automatically identifies archive types via magic bytes
- **High performance** - Optimized for speed with parallel processing where possible - **High performance** Built on `tar-stream` and `fflate` for speed
- 🔧 **Flexible I/O** - Work with files, URLs, and streams seamlessly - 🔧 **Flexible I/O** Work with files, URLs, and streams seamlessly
- 📊 **Archive analysis** - Inspect contents without extraction - 🛠️ **Modern TypeScript** Full type safety and excellent IDE support
- 🛠️ **Modern TypeScript** - Full type safety and excellent IDE support
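
To illustrate what "smart detection via magic bytes" means in practice, here is a minimal sketch that inspects the first bytes of a buffer. It is not the library's implementation (the hints file notes that `file-type` is used for detection); `sniffArchiveKind` and `ArchiveKind` are hypothetical names, and only the well-known gzip, ZIP, and bzip2 signatures are covered.

```typescript
type ArchiveKind = 'gzip' | 'zip' | 'bzip2' | 'unknown';

// Leading signature bytes for each format.
const SIGNATURES: Array<{ kind: Exclude<ArchiveKind, 'unknown'>; bytes: number[] }> = [
  { kind: 'gzip', bytes: [0x1f, 0x8b] },
  { kind: 'zip', bytes: [0x50, 0x4b, 0x03, 0x04] }, // "PK\x03\x04" (local file header)
  { kind: 'bzip2', bytes: [0x42, 0x5a, 0x68] }, // "BZh"
];

// Hypothetical helper: classify a buffer by its leading bytes.
export function sniffArchiveKind(head: Uint8Array): ArchiveKind {
  for (const { kind, bytes } of SIGNATURES) {
    const matches = head.length >= bytes.length && bytes.every((b, i) => head[i] === b);
    if (matches) {
      return kind;
    }
  }
  return 'unknown';
}
```

An empty ZIP archive starts with a different record (`PK\x05\x06`), so a production detector needs more cases than this sketch handles.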
## Installation 📥 ## Installation 📥
```bash ```bash
# Using npm
npm install @push.rocks/smartarchive
# Using pnpm (recommended) # Using pnpm (recommended)
pnpm add @push.rocks/smartarchive pnpm add @push.rocks/smartarchive
# Using npm
npm install @push.rocks/smartarchive
# Using yarn # Using yarn
yarn add @push.rocks/smartarchive yarn add @push.rocks/smartarchive
``` ```
@@ -37,7 +40,7 @@ import { SmartArchive } from '@push.rocks/smartarchive';
 // Extract a .tar.gz archive from a URL directly to the filesystem
 const archive = await SmartArchive.fromArchiveUrl(
-  'https://github.com/some/repo/archive/main.tar.gz'
+  'https://registry.npmjs.org/some-package/-/some-package-1.0.0.tgz'
 );
 await archive.exportToFs('./extracted');
 ```
@@ -52,10 +55,15 @@ const archive = await SmartArchive.fromArchiveFile('./large-archive.zip');
 const streamOfFiles = await archive.exportToStreamOfStreamFiles();
 // Process each file in the archive
-streamOfFiles.on('data', (fileStream) => {
-  console.log(`Processing ${fileStream.path}`);
+streamOfFiles.on('data', async (streamFile) => {
+  console.log(`Processing ${streamFile.relativeFilePath}`);
+  const readStream = await streamFile.createReadStream();
   // Handle individual file stream
 });
+streamOfFiles.on('end', () => {
+  console.log('Extraction complete');
+});
 ```
 ## Core Concepts 💡
@@ -64,17 +72,18 @@ streamOfFiles.on('data', (fileStream) => {
 `SmartArchive` accepts archives from three sources:
-1. **URL** - Download and process archives from the web
-2. **File** - Load archives from the local filesystem
-3. **Stream** - Process archives from any Node.js stream
+| Source | Method | Use Case |
+|--------|--------|----------|
+| **URL** | `SmartArchive.fromArchiveUrl(url)` | Download and process archives from the web |
+| **File** | `SmartArchive.fromArchiveFile(path)` | Load archives from the local filesystem |
+| **Stream** | `SmartArchive.fromArchiveStream(stream)` | Process archives from any Node.js stream |
 ### Export Destinations
-Extract archives to multiple destinations:
-1. **Filesystem** - Extract directly to a directory
-2. **Stream of files** - Process files individually as streams
-3. **Archive stream** - Re-stream as different format
+| Destination | Method | Use Case |
+|-------------|--------|----------|
+| **Filesystem** | `exportToFs(targetDir, fileName?)` | Extract directly to a directory |
+| **Stream of files** | `exportToStreamOfStreamFiles()` | Process files individually as `StreamFile` objects |
 ## Usage Examples 🔨
@@ -89,10 +98,11 @@ await zipArchive.exportToFs('./output');
 // Stream ZIP contents for processing
 const fileStream = await zipArchive.exportToStreamOfStreamFiles();
-fileStream.on('data', (file) => {
-  if (file.path.endsWith('.json')) {
+fileStream.on('data', async (streamFile) => {
+  if (streamFile.relativeFilePath.endsWith('.json')) {
+    const readStream = await streamFile.createReadStream();
     // Process JSON files from the archive
-    file.pipe(jsonProcessor);
   }
 });
 ```
@@ -106,10 +116,38 @@ import { SmartArchive, TarTools } from '@push.rocks/smartarchive';
 const tarGzArchive = await SmartArchive.fromArchiveFile('./archive.tar.gz');
 await tarGzArchive.exportToFs('./extracted');
-// Create a TAR archive (using TarTools directly)
+// Create a TAR archive using TarTools directly
 const tarTools = new TarTools();
-const packStream = await tarTools.packDirectory('./source-directory');
-packStream.pipe(createWriteStream('./output.tar'));
+const pack = await tarTools.getPackStream();
+// Add files to the pack
+await tarTools.addFileToPack(pack, {
+  fileName: 'hello.txt',
+  content: 'Hello, World!'
+});
+await tarTools.addFileToPack(pack, {
+  fileName: 'data.json',
+  content: Buffer.from(JSON.stringify({ foo: 'bar' }))
+});
+// Finalize and pipe to destination
+pack.finalize();
+pack.pipe(createWriteStream('./output.tar'));
+```
+### Pack a directory into TAR
+```typescript
+import { TarTools } from '@push.rocks/smartarchive';
+import { createWriteStream } from 'fs';
+const tarTools = new TarTools();
+// Pack an entire directory
+const pack = await tarTools.packDirectory('./src');
+pack.finalize();
+pack.pipe(createWriteStream('./source.tar'));
 ```
 ### Extracting from URLs
@@ -117,47 +155,36 @@ packStream.pipe(createWriteStream('./output.tar'));
 ```typescript
 import { SmartArchive } from '@push.rocks/smartarchive';
-// Download and extract in one operation
-const remoteArchive = await SmartArchive.fromArchiveUrl(
-  'https://example.com/data.tar.gz'
+// Download and extract npm packages
+const npmPackage = await SmartArchive.fromArchiveUrl(
+  'https://registry.npmjs.org/@push.rocks/smartfile/-/smartfile-11.2.7.tgz'
 );
-// Extract to filesystem
-await remoteArchive.exportToFs('./local-dir');
-// Or process as stream
-const stream = await remoteArchive.exportToStreamOfStreamFiles();
-```
-### Analyzing archive contents
-```typescript
-import { SmartArchive } from '@push.rocks/smartarchive';
-// Analyze without extracting
-const archive = await SmartArchive.fromArchiveFile('./archive.zip');
-const analyzer = archive.archiveAnalyzer;
-// Use the analyzer to inspect contents
-// (exact implementation depends on analyzer methods)
+await npmPackage.exportToFs('./node_modules/@push.rocks/smartfile');
+// Or process as stream for memory efficiency
+const stream = await npmPackage.exportToStreamOfStreamFiles();
+stream.on('data', async (file) => {
+  console.log(`Extracted: ${file.relativeFilePath}`);
+});
 ```
 ### Working with GZIP files
 ```typescript
 import { SmartArchive, GzipTools } from '@push.rocks/smartarchive';
+import { createReadStream, createWriteStream } from 'fs';
-// Decompress a .gz file
+// Decompress a .gz file - provide filename since gzip doesn't store it
 const gzipArchive = await SmartArchive.fromArchiveFile('./data.json.gz');
 await gzipArchive.exportToFs('./decompressed', 'data.json');
-// Use GzipTools directly for streaming
+// Use GzipTools directly for streaming decompression
 const gzipTools = new GzipTools();
 const decompressStream = gzipTools.getDecompressionStream();
 createReadStream('./compressed.gz')
   .pipe(decompressStream)
-  .pipe(createWriteStream('./decompressed'));
+  .pipe(createWriteStream('./decompressed.txt'));
 ```
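
Strictly speaking, the gzip format (RFC 1952) has an optional original-filename field, signalled by the FNAME flag, but many producers leave it out, which is why a file name has to be passed to `exportToFs`. The sketch below shows how the field can be read from a header; `readGzipOriginalName` is a hypothetical helper for illustration, not part of the library's API.

```typescript
import { gzipSync } from 'node:zlib';

// Hypothetical helper: return the original file name stored in a gzip header,
// or undefined when the FNAME flag is not set.
export function readGzipOriginalName(data: Uint8Array): string | undefined {
  // Fixed header: magic (2), method (1), flags (1), mtime (4), xfl (1), os (1)
  if (data.length < 10 || data[0] !== 0x1f || data[1] !== 0x8b) {
    return undefined;
  }
  const flags = data[3];
  let offset = 10;
  if (flags & 0x04) {
    // FEXTRA: 2-byte little-endian length followed by that many bytes
    if (offset + 2 > data.length) return undefined;
    const extraLength = data[offset] | (data[offset + 1] << 8);
    offset += 2 + extraLength;
  }
  if (!(flags & 0x08)) {
    return undefined; // FNAME not set: no name stored
  }
  const end = data.indexOf(0, offset); // name is zero-terminated Latin-1
  if (end === -1) return undefined;
  return Buffer.from(data.subarray(offset, end)).toString('latin1');
}

// Node's zlib does not write a file name into the header.
export const nameFromNodeGzip = readGzipOriginalName(gzipSync(Buffer.from('hello')));
```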
### Working with BZIP2 files ### Working with BZIP2 files
@@ -172,115 +199,175 @@ const bzipArchive = await SmartArchive.fromArchiveUrl(
await bzipArchive.exportToFs('./extracted', 'data.txt'); await bzipArchive.exportToFs('./extracted', 'data.txt');
``` ```
### Advanced streaming operations ### In-memory processing (no filesystem)
```typescript ```typescript
import { SmartArchive } from '@push.rocks/smartarchive'; import { SmartArchive } from '@push.rocks/smartarchive';
import { pipeline } from 'stream/promises'; import { Readable } from 'stream';
// Chain operations with streams // Process archives entirely in memory
const archive = await SmartArchive.fromArchiveFile('./archive.tar.gz'); const compressedBuffer = await fetchCompressedData();
const exportStream = await archive.exportToStreamOfStreamFiles(); const memoryStream = Readable.from(compressedBuffer);
// Process each file in the archive const archive = await SmartArchive.fromArchiveStream(memoryStream);
await pipeline( const streamFiles = await archive.exportToStreamOfStreamFiles();
exportStream,
async function* (source) {
for await (const file of source) {
if (file.path.endsWith('.log')) {
// Process log files
yield processLogFile(file);
}
}
},
createWriteStream('./processed-logs.txt')
);
```
### Creating archives (advanced) const extractedFiles: Array<{ name: string; content: Buffer }> = [];
```typescript streamFiles.on('data', async (streamFile) => {
import { SmartArchive } from '@push.rocks/smartarchive'; const chunks: Buffer[] = [];
import { TarTools } from '@push.rocks/smartarchive'; const readStream = await streamFile.createReadStream();
// Using SmartArchive to create an archive for await (const chunk of readStream) {
const archive = new SmartArchive(); chunks.push(chunk);
// Add content to the archive
archive.addedDirectories.push('./src');
archive.addedFiles.push('./readme.md');
archive.addedFiles.push('./package.json');
// Export as TAR.GZ
const tarGzStream = await archive.exportToTarGzStream();
tarGzStream.pipe(createWriteStream('./output.tar.gz'));
```
### Extract and transform
```typescript
import { SmartArchive } from '@push.rocks/smartarchive';
import { Transform } from 'stream';
// Extract and transform files in one pipeline
const archive = await SmartArchive.fromArchiveUrl(
'https://example.com/source-code.tar.gz'
);
const extractStream = await archive.exportToStreamOfStreamFiles();
// Transform TypeScript to JavaScript during extraction
extractStream.on('data', (fileStream) => {
if (fileStream.path.endsWith('.ts')) {
fileStream
.pipe(typescriptTranspiler())
.pipe(createWriteStream(fileStream.path.replace('.ts', '.js')));
} else {
fileStream.pipe(createWriteStream(fileStream.path));
} }
extractedFiles.push({
name: streamFile.relativeFilePath,
content: Buffer.concat(chunks)
});
}); });
await new Promise((resolve) => streamFiles.on('end', resolve));
console.log(`Extracted ${extractedFiles.length} files in memory`);
```
### Nested archive handling (e.g., .tar.gz)
The library automatically handles nested compression. A `.tar.gz` file is:
1. First decompressed from gzip
2. Then unpacked from tar
This happens transparently:
```typescript
import { SmartArchive } from '@push.rocks/smartarchive';
// Automatically handles gzip → tar extraction chain
const tgzArchive = await SmartArchive.fromArchiveFile('./package.tar.gz');
await tgzArchive.exportToFs('./extracted');
``` ```
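
The gzip → tar chain described above can be pictured as peeling off one layer at a time. This is a sketch under stated assumptions, not the library's code: `describeLayers` is a hypothetical name, it uses Node's `zlib` rather than `fflate`, and it recognises tar only by the POSIX `ustar` marker at byte offset 257.

```typescript
import { gunzipSync, gzipSync } from 'node:zlib';

const isGzip = (data: Uint8Array): boolean =>
  data.length >= 2 && data[0] === 0x1f && data[1] === 0x8b;

// POSIX tar headers carry "ustar" at offset 257 of the first 512-byte block.
const isTar = (data: Uint8Array): boolean =>
  data.length >= 262 &&
  Buffer.from(data.subarray(257, 262)).toString('latin1') === 'ustar';

// Hypothetical helper: list the layers found, outermost first.
export function describeLayers(data: Uint8Array): string[] {
  const layers: string[] = [];
  let current: Uint8Array = data;
  while (isGzip(current)) {
    layers.push('gzip');
    current = gunzipSync(current); // strip one compression layer
  }
  layers.push(isTar(current) ? 'tar' : 'raw');
  return layers;
}

// Usage: a fake 512-byte tar header block, gzip-compressed.
const fakeTarBlock = Buffer.alloc(512);
fakeTarBlock.write('ustar', 257, 'latin1');
export const demoLayers = describeLayers(gzipSync(fakeTarBlock));
```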
## API Reference 📚 ## API Reference 📚
### SmartArchive Class ### SmartArchive Class
#### Static Methods The main entry point for archive operations.
- `SmartArchive.fromArchiveUrl(url: string)` - Create from URL #### Static Factory Methods
- `SmartArchive.fromArchiveFile(path: string)` - Create from file
- `SmartArchive.fromArchiveStream(stream: NodeJS.ReadableStream)` - Create from stream ```typescript
// Create from URL - downloads and processes archive
SmartArchive.fromArchiveUrl(url: string): Promise<SmartArchive>
// Create from local file path
SmartArchive.fromArchiveFile(path: string): Promise<SmartArchive>
// Create from any Node.js readable stream
SmartArchive.fromArchiveStream(stream: Readable | Duplex | Transform): Promise<SmartArchive>
```
#### Instance Methods #### Instance Methods
- `exportToFs(targetDir: string, fileName?: string)` - Extract to filesystem ```typescript
- `exportToStreamOfStreamFiles()` - Get a stream of file streams // Extract all files to a directory
- `exportToTarGzStream()` - Export as TAR.GZ stream // fileName is optional - used for single-file archives (like .gz) that don't store filename
- `getArchiveStream()` - Get the raw archive stream exportToFs(targetDir: string, fileName?: string): Promise<void>
#### Properties // Get a stream that emits StreamFile objects for each file in the archive
exportToStreamOfStreamFiles(): Promise<StreamIntake<StreamFile>>
- `archiveAnalyzer` - Analyze archive contents // Get the raw archive stream (useful for piping)
- `tarTools` - TAR-specific operations getArchiveStream(): Promise<Readable>
- `zipTools` - ZIP-specific operations ```
- `gzipTools` - GZIP-specific operations
- `bzip2Tools` - BZIP2-specific operations
### Specialized Tools #### Instance Properties
Each tool class provides format-specific operations: ```typescript
archive.tarTools // TarTools instance for TAR-specific operations
archive.zipTools // ZipTools instance for ZIP-specific operations
archive.gzipTools // GzipTools instance for GZIP-specific operations
archive.bzip2Tools // Bzip2Tools instance for BZIP2-specific operations
archive.archiveAnalyzer // ArchiveAnalyzer for inspecting archive type
```
- **TarTools** - Pack/unpack TAR archives ### TarTools Class
- **ZipTools** - Handle ZIP compression
- **GzipTools** - GZIP compression/decompression TAR-specific operations for creating and extracting TAR archives.
- **Bzip2Tools** - BZIP2 operations
```typescript
import { TarTools } from '@push.rocks/smartarchive';
const tarTools = new TarTools();
// Get a tar pack stream for creating archives
const pack = await tarTools.getPackStream();
// Add files to a pack stream
await tarTools.addFileToPack(pack, {
fileName: 'file.txt', // Name in archive
content: 'Hello World', // String, Buffer, Readable, SmartFile, or StreamFile
// byteLength: number (optional) - specify size for streams
// filePath: string (optional) - path to file on disk
});
// Pack an entire directory
const dirPack = await tarTools.packDirectory('./src');
// Get extraction stream
const extract = tarTools.getDecompressionStream();
```
### ZipTools Class
ZIP-specific operations.
```typescript
import { ZipTools } from '@push.rocks/smartarchive';
const zipTools = new ZipTools();
// Get compression stream (for creating ZIP)
const compressor = zipTools.getCompressionStream();
// Get decompression stream (for extracting ZIP)
const decompressor = zipTools.getDecompressionStream();
```
### GzipTools Class
GZIP compression/decompression streams.
```typescript
import { GzipTools } from '@push.rocks/smartarchive';
const gzipTools = new GzipTools();
// Get compression stream
const compressor = gzipTools.getCompressionStream();
// Get decompression stream
const decompressor = gzipTools.getDecompressionStream();
```
## Supported Formats 📋
| Format | Extension(s) | Extract | Create |
|--------|--------------|---------|--------|
| TAR | `.tar` | ✅ | ✅ |
| TAR.GZ / TGZ | `.tar.gz`, `.tgz` | ✅ | ⚠️ |
| ZIP | `.zip` | ✅ | ⚠️ |
| GZIP | `.gz` | ✅ | ✅ |
| BZIP2 | `.bz2` | ✅ | ❌ |
✅ Full support | ⚠️ Partial/basic support | ❌ Not supported
## Performance Tips 🏎️ ## Performance Tips 🏎️
1. **Use streaming for large files** - Avoid loading entire archives into memory 1. **Use streaming for large files** Avoid loading entire archives into memory with `exportToStreamOfStreamFiles()`
2. **Process files in parallel** - Utilize stream operations for concurrent processing 2. **Provide byte lengths when known** When adding streams to TAR, provide `byteLength` for better performance
3. **Choose the right format** - TAR.GZ for Unix systems, ZIP for cross-platform compatibility 3. **Process files as they stream** Don't collect all files into an array unless necessary
4. **Enable compression wisely** - Balance between file size and CPU usage 4. **Choose the right format** TAR.GZ for Unix/compression, ZIP for cross-platform compatibility
## Error Handling 🛡️ ## Error Handling 🛡️
@@ -295,6 +382,8 @@ try {
   console.error('Archive file not found');
 } else if (error.code === 'EACCES') {
   console.error('Permission denied');
+} else if (error.message.includes('fetch')) {
+  console.error('Network error downloading archive');
 } else {
   console.error('Archive extraction failed:', error.message);
 }
@@ -303,35 +392,57 @@ try {
 ## Real-World Use Cases 🌍
-### Backup System
-```typescript
-// Automated backup extraction
-const backup = await SmartArchive.fromArchiveFile('./backup.tar.gz');
-await backup.exportToFs('/restore/location');
-```
-### CI/CD Pipeline
+### CI/CD: Download & Extract Build Artifacts
 ```typescript
-// Download and extract build artifacts
 const artifacts = await SmartArchive.fromArchiveUrl(
   `${CI_SERVER}/artifacts/build-${BUILD_ID}.zip`
 );
 await artifacts.exportToFs('./dist');
 ```
-### Data Processing
+### Backup System: Restore from Archive
 ```typescript
-// Process compressed datasets
-const dataset = await SmartArchive.fromArchiveUrl(
-  'https://data.source/dataset.tar.bz2'
+const backup = await SmartArchive.fromArchiveFile('./backup-2024.tar.gz');
+await backup.exportToFs('/restore/location');
+```
+### NPM Package Inspection
+```typescript
+const pkg = await SmartArchive.fromArchiveUrl(
+  'https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz'
 );
+const files = await pkg.exportToStreamOfStreamFiles();
+files.on('data', async (file) => {
+  if (file.relativeFilePath.includes('package.json')) {
+    const stream = await file.createReadStream();
+    // Read and analyze package.json
+  }
+});
+```
+### Data Pipeline: Process Compressed Datasets
+```typescript
+const dataset = await SmartArchive.fromArchiveUrl(
+  'https://data.source/dataset.tar.gz'
+);
 const files = await dataset.exportToStreamOfStreamFiles();
-// Process each file in the dataset
+files.on('data', async (file) => {
+  if (file.relativeFilePath.endsWith('.csv')) {
+    const stream = await file.createReadStream();
+    // Stream CSV processing
+  }
+});
 ```
 ## License and Legal Information
 This repository contains open-source code that is licensed under the MIT License. A copy of the MIT License can be found in the [license](license) file within this repository.
 **Please note:** The MIT License does not grant permission to use the trade names, trademarks, service marks, or product names of the project, except as required for reasonable and customary use in describing the origin of the work and reproducing the content of the NOTICE file.
@@ -341,9 +452,9 @@ This project is owned and maintained by Task Venture Capital GmbH. The names and
### Company Information ### Company Information
Task Venture Capital GmbH Task Venture Capital GmbH
Registered at District court Bremen HRB 35230 HB, Germany Registered at District court Bremen HRB 35230 HB, Germany
For any legal inquiries or if you require further information, please contact us via email at hello@task.vc. For any legal inquiries or if you require further information, please contact us via email at hello@task.vc.
By using this repository, you acknowledge that you have read this section, agree to comply with its terms, and understand that the licensing of the code does not imply endorsement by Task Venture Capital GmbH of any derivative works. By using this repository, you acknowledge that you have read this section, agree to comply with its terms, and understand that the licensing of the code does not imply endorsement by Task Venture Capital GmbH of any derivative works.


@@ -1,7 +1,33 @@
-import * as path from 'path';
+import * as path from 'node:path';
+import * as fs from 'node:fs';
+import * as fsPromises from 'node:fs/promises';
 import * as smartpath from '@push.rocks/smartpath';
 import * as smartfile from '@push.rocks/smartfile';
 import * as smartrequest from '@push.rocks/smartrequest';
 import * as smartstream from '@push.rocks/smartstream';
-export { path, smartpath, smartfile, smartrequest, smartstream };
+export { path, fs, fsPromises, smartpath, smartfile, smartrequest, smartstream };
+/**
+ * List files in a directory recursively, returning relative paths
+ */
+export async function listFileTree(dirPath: string, _pattern: string = '**/*'): Promise<string[]> {
+  const results: string[] = [];
+  async function walkDir(currentPath: string, relativePath: string = '') {
+    const entries = await fsPromises.readdir(currentPath, { withFileTypes: true });
+    for (const entry of entries) {
+      const entryRelPath = relativePath ? path.join(relativePath, entry.name) : entry.name;
+      const entryFullPath = path.join(currentPath, entry.name);
+      if (entry.isDirectory()) {
+        await walkDir(entryFullPath, entryRelPath);
+      } else if (entry.isFile()) {
+        results.push(entryRelPath);
+      }
+    }
+  }
+  await walkDir(dirPath);
+  return results;
+}
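
Note that `listFileTree` accepts a `_pattern` argument but, as written, never uses it: every regular file is returned. If filtering is needed, one option is a variant that takes a simple suffix instead of a glob. The sketch below is an illustration only; `listFilesWithSuffix` is a hypothetical name, and it joins relative paths with `/` (an assumption made here so results look the same on every platform), which differs from the `path.join` used above.

```typescript
import * as fsPromises from 'node:fs/promises';
import * as path from 'node:path';

// Hypothetical variant of listFileTree: recursive listing, filtered by file-name suffix.
export async function listFilesWithSuffix(dirPath: string, suffix: string): Promise<string[]> {
  const results: string[] = [];

  async function walk(currentPath: string, relativePath: string): Promise<void> {
    const entries = await fsPromises.readdir(currentPath, { withFileTypes: true });
    for (const entry of entries) {
      const entryRelPath = relativePath ? `${relativePath}/${entry.name}` : entry.name;
      if (entry.isDirectory()) {
        await walk(path.join(currentPath, entry.name), entryRelPath);
      } else if (entry.isFile() && entry.name.endsWith(suffix)) {
        results.push(entryRelPath);
      }
    }
  }

  await walk(dirPath, '');
  // readdir order is not guaranteed, so sort for deterministic output
  return results.sort();
}
```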


@@ -14,7 +14,7 @@ const testPaths = {
 };
 tap.preTask('should prepare test directories', async () => {
-  await plugins.smartfile.fs.ensureDir(testPaths.gzipTestDir);
+  await plugins.fsPromises.mkdir(testPaths.gzipTestDir, { recursive: true });
 });
 tap.test('should create and extract a gzip file', async () => {
@@ -24,23 +24,17 @@ tap.test('should create and extract a gzip file', async () => {
   const gzipFileName = 'test-file.txt.gz';
   // Write the original file
-  await plugins.smartfile.memory.toFs(
-    testContent,
-    plugins.path.join(testPaths.gzipTestDir, testFileName)
+  await plugins.fsPromises.writeFile(
+    plugins.path.join(testPaths.gzipTestDir, testFileName),
+    testContent
   );
-  // Compress the file using gzip
-  const originalFile = await plugins.smartfile.fs.fileTreeToObject(
-    testPaths.gzipTestDir,
-    testFileName
-  );
   // Create gzip compressed version using fflate directly
   const fflate = await import('fflate');
   const compressed = fflate.gzipSync(Buffer.from(testContent));
-  await plugins.smartfile.memory.toFs(
-    Buffer.from(compressed),
-    plugins.path.join(testPaths.gzipTestDir, gzipFileName)
+  await plugins.fsPromises.writeFile(
+    plugins.path.join(testPaths.gzipTestDir, gzipFileName),
+    Buffer.from(compressed)
   );
   // Now test extraction using SmartArchive
@@ -50,13 +44,14 @@ tap.test('should create and extract a gzip file', async () => {
   // Export to a new location
   const extractPath = plugins.path.join(testPaths.gzipTestDir, 'extracted');
-  await plugins.smartfile.fs.ensureDir(extractPath);
+  await plugins.fsPromises.mkdir(extractPath, { recursive: true });
   // Provide a filename since gzip doesn't contain filename metadata
   await gzipArchive.exportToFs(extractPath, 'test-file.txt');
   // Read the extracted file
-  const extractedContent = await plugins.smartfile.fs.toStringSync(
-    plugins.path.join(extractPath, 'test-file.txt')
+  const extractedContent = await plugins.fsPromises.readFile(
+    plugins.path.join(extractPath, 'test-file.txt'),
+    'utf8'
   );
   // Verify the content matches
@@ -71,13 +66,13 @@ tap.test('should handle gzip stream extraction', async () => {
   // Create gzip compressed version
   const fflate = await import('fflate');
   const compressed = fflate.gzipSync(Buffer.from(testContent));
-  await plugins.smartfile.memory.toFs(
-    Buffer.from(compressed),
-    plugins.path.join(testPaths.gzipTestDir, gzipFileName)
+  await plugins.fsPromises.writeFile(
+    plugins.path.join(testPaths.gzipTestDir, gzipFileName),
+    Buffer.from(compressed)
   );
   // Create a read stream for the gzip file
-  const gzipStream = plugins.smartfile.fsStream.createReadStream(
+  const gzipStream = plugins.fs.createReadStream(
     plugins.path.join(testPaths.gzipTestDir, gzipFileName)
   );
@@ -121,7 +116,7 @@ tap.test('should handle gzip files with original filename in header', async () =
const gzipFileName = 'compressed.gz'; const gzipFileName = 'compressed.gz';
// Create a proper gzip with filename header using Node's zlib // Create a proper gzip with filename header using Node's zlib
const zlib = await import('zlib'); const zlib = await import('node:zlib');
const gzipBuffer = await new Promise<Buffer>((resolve, reject) => { const gzipBuffer = await new Promise<Buffer>((resolve, reject) => {
zlib.gzip(Buffer.from(testContent), { zlib.gzip(Buffer.from(testContent), {
level: 9, level: 9,
@@ -133,29 +128,30 @@ tap.test('should handle gzip files with original filename in header', async () =
}); });
}); });
await plugins.smartfile.memory.toFs( await plugins.fsPromises.writeFile(
gzipBuffer, plugins.path.join(testPaths.gzipTestDir, gzipFileName),
plugins.path.join(testPaths.gzipTestDir, gzipFileName) gzipBuffer
); );
// Test extraction // Test extraction
const gzipArchive = await smartarchive.SmartArchive.fromArchiveFile( const gzipArchive = await smartarchive.SmartArchive.fromArchiveFile(
plugins.path.join(testPaths.gzipTestDir, gzipFileName) plugins.path.join(testPaths.gzipTestDir, gzipFileName)
); );
const extractPath = plugins.path.join(testPaths.gzipTestDir, 'header-test'); const extractPath = plugins.path.join(testPaths.gzipTestDir, 'header-test');
await plugins.smartfile.fs.ensureDir(extractPath); await plugins.fsPromises.mkdir(extractPath, { recursive: true });
// Provide a filename since gzip doesn't reliably contain filename metadata // Provide a filename since gzip doesn't reliably contain filename metadata
await gzipArchive.exportToFs(extractPath, 'compressed.txt'); await gzipArchive.exportToFs(extractPath, 'compressed.txt');
// Check if file was extracted (name might be derived from archive name) // Check if file was extracted (name might be derived from archive name)
const files = await plugins.smartfile.fs.listFileTree(extractPath, '**/*'); const files = await plugins.listFileTree(extractPath, '**/*');
expect(files.length).toBeGreaterThan(0); expect(files.length).toBeGreaterThan(0);
// Read and verify content // Read and verify content
const extractedFile = files[0]; const extractedFile = files[0];
const extractedContent = await plugins.smartfile.fs.toStringSync( const extractedContent = await plugins.fsPromises.readFile(
plugins.path.join(extractPath, extractedFile || 'compressed.txt') plugins.path.join(extractPath, extractedFile || 'compressed.txt'),
'utf8'
); );
expect(extractedContent).toEqual(testContent); expect(extractedContent).toEqual(testContent);
}); });
@@ -168,27 +164,28 @@ tap.test('should handle large gzip files', async () => {
// Compress the large file // Compress the large file
const fflate = await import('fflate'); const fflate = await import('fflate');
const compressed = fflate.gzipSync(Buffer.from(largeContent)); const compressed = fflate.gzipSync(Buffer.from(largeContent));
await plugins.smartfile.memory.toFs( await plugins.fsPromises.writeFile(
Buffer.from(compressed), plugins.path.join(testPaths.gzipTestDir, gzipFileName),
plugins.path.join(testPaths.gzipTestDir, gzipFileName) Buffer.from(compressed)
); );
// Test extraction // Test extraction
const gzipArchive = await smartarchive.SmartArchive.fromArchiveFile( const gzipArchive = await smartarchive.SmartArchive.fromArchiveFile(
plugins.path.join(testPaths.gzipTestDir, gzipFileName) plugins.path.join(testPaths.gzipTestDir, gzipFileName)
); );
const extractPath = plugins.path.join(testPaths.gzipTestDir, 'large-extracted'); const extractPath = plugins.path.join(testPaths.gzipTestDir, 'large-extracted');
await plugins.smartfile.fs.ensureDir(extractPath); await plugins.fsPromises.mkdir(extractPath, { recursive: true });
// Provide a filename since gzip doesn't contain filename metadata // Provide a filename since gzip doesn't contain filename metadata
await gzipArchive.exportToFs(extractPath, 'large-file.txt'); await gzipArchive.exportToFs(extractPath, 'large-file.txt');
// Verify the extracted content // Verify the extracted content
const files = await plugins.smartfile.fs.listFileTree(extractPath, '**/*'); const files = await plugins.listFileTree(extractPath, '**/*');
expect(files.length).toBeGreaterThan(0); expect(files.length).toBeGreaterThan(0);
const extractedContent = await plugins.smartfile.fs.toStringSync( const extractedContent = await plugins.fsPromises.readFile(
plugins.path.join(extractPath, files[0] || 'large-file.txt') plugins.path.join(extractPath, files[0] || 'large-file.txt'),
'utf8'
); );
expect(extractedContent.length).toEqual(largeContent.length); expect(extractedContent.length).toEqual(largeContent.length);
expect(extractedContent).toEqual(largeContent); expect(extractedContent).toEqual(largeContent);
@@ -200,60 +197,64 @@ tap.test('should handle real-world multi-chunk gzip from URL', async () => {
// Download and extract the archive // Download and extract the archive
const testArchive = await smartarchive.SmartArchive.fromArchiveUrl(testUrl); const testArchive = await smartarchive.SmartArchive.fromArchiveUrl(testUrl);
const extractPath = plugins.path.join(testPaths.gzipTestDir, 'real-world-test'); const extractPath = plugins.path.join(testPaths.gzipTestDir, 'real-world-test');
await plugins.smartfile.fs.ensureDir(extractPath); await plugins.fsPromises.mkdir(extractPath, { recursive: true });
// This will test multi-chunk decompression as the file is larger // This will test multi-chunk decompression as the file is larger
await testArchive.exportToFs(extractPath); await testArchive.exportToFs(extractPath);
// Verify extraction worked // Verify extraction worked
const files = await plugins.smartfile.fs.listFileTree(extractPath, '**/*'); const files = await plugins.listFileTree(extractPath, '**/*');
expect(files.length).toBeGreaterThan(0); expect(files.length).toBeGreaterThan(0);
// Check for expected package structure // Check for expected package structure
const hasPackageJson = files.some(f => f.includes('package.json')); const hasPackageJson = files.some(f => f.includes('package.json'));
expect(hasPackageJson).toBeTrue(); expect(hasPackageJson).toBeTrue();
// Read and verify package.json content // Read and verify package.json content
const packageJsonPath = files.find(f => f.includes('package.json')); const packageJsonPath = files.find(f => f.includes('package.json'));
if (packageJsonPath) { if (packageJsonPath) {
const packageJsonContent = await plugins.smartfile.fs.toStringSync( const packageJsonContent = await plugins.fsPromises.readFile(
plugins.path.join(extractPath, packageJsonPath) plugins.path.join(extractPath, packageJsonPath),
'utf8'
); );
const packageJson = JSON.parse(packageJsonContent); const packageJson = JSON.parse(packageJsonContent);
expect(packageJson.name).toEqual('@push.rocks/smartfile'); expect(packageJson.name).toEqual('@push.rocks/smartfile');
expect(packageJson.version).toEqual('11.2.7'); expect(packageJson.version).toEqual('11.2.7');
} }
// Read and verify a TypeScript file // Read and verify a TypeScript file
const tsFilePath = files.find(f => f.endsWith('.ts')); const tsFilePath = files.find(f => f.endsWith('.ts'));
if (tsFilePath) { if (tsFilePath) {
const tsFileContent = await plugins.smartfile.fs.toStringSync( const tsFileContent = await plugins.fsPromises.readFile(
plugins.path.join(extractPath, tsFilePath) plugins.path.join(extractPath, tsFilePath),
'utf8'
); );
// TypeScript files should have content // TypeScript files should have content
expect(tsFileContent.length).toBeGreaterThan(10); expect(tsFileContent.length).toBeGreaterThan(10);
console.log(` ✓ TypeScript file ${tsFilePath} has ${tsFileContent.length} bytes`); console.log(` ✓ TypeScript file ${tsFilePath} has ${tsFileContent.length} bytes`);
} }
// Read and verify license file // Read and verify license file
const licensePath = files.find(f => f.includes('license')); const licensePath = files.find(f => f.includes('license'));
if (licensePath) { if (licensePath) {
const licenseContent = await plugins.smartfile.fs.toStringSync( const licenseContent = await plugins.fsPromises.readFile(
plugins.path.join(extractPath, licensePath) plugins.path.join(extractPath, licensePath),
'utf8'
); );
expect(licenseContent).toContain('MIT'); expect(licenseContent).toContain('MIT');
} }
// Verify we can read multiple files without corruption // Verify we can read multiple files without corruption
const readableFiles = files.filter(f => const readableFiles = files.filter(f =>
f.endsWith('.json') || f.endsWith('.md') || f.endsWith('.ts') || f.endsWith('.js') f.endsWith('.json') || f.endsWith('.md') || f.endsWith('.ts') || f.endsWith('.js')
).slice(0, 5); // Test first 5 readable files ).slice(0, 5); // Test first 5 readable files
for (const file of readableFiles) { for (const file of readableFiles) {
const content = await plugins.smartfile.fs.toStringSync( const content = await plugins.fsPromises.readFile(
plugins.path.join(extractPath, file) plugins.path.join(extractPath, file),
'utf8'
); );
expect(content).toBeDefined(); expect(content).toBeDefined();
expect(content.length).toBeGreaterThan(0); expect(content.length).toBeGreaterThan(0);
@@ -270,7 +271,7 @@ tap.test('should handle gzip extraction fully in memory', async () => {
const compressed = fflate.gzipSync(Buffer.from(testContent)); const compressed = fflate.gzipSync(Buffer.from(testContent));
// Create a stream from the compressed data // Create a stream from the compressed data
const { Readable } = await import('stream'); const { Readable } = await import('node:stream');
const compressedStream = Readable.from(Buffer.from(compressed)); const compressedStream = Readable.from(Buffer.from(compressed));
// Process through SmartArchive without touching filesystem // Process through SmartArchive without touching filesystem
@@ -318,7 +319,7 @@ tap.test('should handle real tgz file fully in memory', async (tools) => {
console.log(` Downloaded ${tgzBuffer.length} bytes into memory`); console.log(` Downloaded ${tgzBuffer.length} bytes into memory`);
// Create stream from buffer // Create stream from buffer
const { Readable: Readable2 } = await import('stream'); const { Readable: Readable2 } = await import('node:stream');
const tgzStream = Readable2.from(tgzBuffer); const tgzStream = Readable2.from(tgzBuffer);
// Process through SmartArchive in memory // Process through SmartArchive in memory
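
The in-memory tests above wrap a compressed buffer in `Readable.from(...)` and decompress it without touching the filesystem. A minimal sketch of the same idea, using only Node's built-in `node:zlib` instead of SmartArchive, might look like the following. The helper names `makeGzip` and `gunzipInMemory` are hypothetical and do not appear in the repository.

```typescript
import { Readable } from 'node:stream';
import { createGunzip, gzipSync } from 'node:zlib';

// Hypothetical helper: gzip a string into a Buffer (stands in for fflate.gzipSync in the tests).
export function makeGzip(text: string): Buffer {
  return gzipSync(Buffer.from(text, 'utf8'));
}

// Hypothetical helper: decompress a gzip buffer through a stream, entirely in memory.
export async function gunzipInMemory(compressed: Uint8Array): Promise<string> {
  // Readable.from() emits a Buffer as one chunk instead of iterating over its bytes.
  const source = Readable.from(Buffer.from(compressed));
  const chunks: Buffer[] = [];
  // The gunzip transform is async-iterable; collect every decompressed chunk.
  for await (const chunk of source.pipe(createGunzip())) {
    chunks.push(chunk as Buffer);
  }
  return Buffer.concat(chunks).toString('utf8');
}
```

Collecting chunks into an array keeps the sketch short; for large payloads a streaming consumer would likely be preferable.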


@@ -16,7 +16,7 @@ const testPaths = {
import * as smartarchive from '../ts/index.js'; import * as smartarchive from '../ts/index.js';
tap.preTask('should prepare .nogit dir', async () => { tap.preTask('should prepare .nogit dir', async () => {
await plugins.smartfile.fs.ensureDir(testPaths.remoteDir); await plugins.fsPromises.mkdir(testPaths.remoteDir, { recursive: true });
}); });
tap.preTask('should prepare downloads', async (tools) => { tap.preTask('should prepare downloads', async (tools) => {
@@ -26,9 +26,9 @@ tap.preTask('should prepare downloads', async (tools) => {
) )
.get(); .get();
const downloadedFile: Buffer = Buffer.from(await response.arrayBuffer()); const downloadedFile: Buffer = Buffer.from(await response.arrayBuffer());
await plugins.smartfile.memory.toFs( await plugins.fsPromises.writeFile(
downloadedFile,
plugins.path.join(testPaths.nogitDir, 'test.tgz'), plugins.path.join(testPaths.nogitDir, 'test.tgz'),
downloadedFile,
); );
}); });


@@ -3,6 +3,6 @@
*/ */
export const commitinfo = { export const commitinfo = {
name: '@push.rocks/smartarchive', name: '@push.rocks/smartarchive',
version: '4.2.2', version: '4.2.4',
description: 'A library for working with archive files, providing utilities for compressing and decompressing data.' description: 'A library for working with archive files, providing utilities for compressing and decompressing data.'
} }


@@ -83,7 +83,7 @@ export class SmartArchive {
return urlStream; return urlStream;
} }
if (this.sourceFilePath) { if (this.sourceFilePath) {
const fileStream = plugins.smartfile.fs.toReadStream(this.sourceFilePath); const fileStream = plugins.fs.createReadStream(this.sourceFilePath);
return fileStream; return fileStream;
} }
} }
@@ -116,14 +116,13 @@ export class SmartArchive {
); );
const streamFile = streamFileArg; const streamFile = streamFileArg;
const readStream = await streamFile.createReadStream(); const readStream = await streamFile.createReadStream();
await plugins.smartfile.fs.ensureDir(targetDir); await plugins.fsPromises.mkdir(targetDir, { recursive: true });
const writePath = plugins.path.join( const writePath = plugins.path.join(
targetDir, targetDir,
streamFile.relativeFilePath || fileNameArg, streamFile.relativeFilePath || fileNameArg,
); );
await plugins.smartfile.fs.ensureDir(plugins.path.dirname(writePath)); await plugins.fsPromises.mkdir(plugins.path.dirname(writePath), { recursive: true });
const writeStream = const writeStream = plugins.fs.createWriteStream(writePath);
plugins.smartfile.fsStream.createWriteStream(writePath);
readStream.pipe(writeStream); readStream.pipe(writeStream);
writeStream.on('finish', () => { writeStream.on('finish', () => {
done.resolve(); done.resolve();
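
The hunk above creates the target directory with `mkdir(..., { recursive: true })`, pipes the read stream into `fs.createWriteStream`, and resolves on `finish`. As an alternative sketch (not the approach taken in this commit), `pipeline` from `node:stream/promises` does the same write while also rejecting if either stream errors. The names `writeStreamToFile` and `demo` are hypothetical.

```typescript
import { createWriteStream } from 'node:fs';
import { mkdir, mkdtemp, readFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import * as path from 'node:path';
import { Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Hypothetical helper: write a readable stream to targetDir/relativePath,
// creating any missing parent directories first.
export async function writeStreamToFile(
  source: Readable,
  targetDir: string,
  relativePath: string,
): Promise<string> {
  const writePath = path.join(targetDir, relativePath);
  await mkdir(path.dirname(writePath), { recursive: true });
  // pipeline() resolves once the write stream finishes and rejects on any stream error.
  await pipeline(source, createWriteStream(writePath));
  return writePath;
}

// Usage: write two string chunks into a nested path under a fresh temp dir, then read them back.
export async function demo(): Promise<string> {
  const dir = await mkdtemp(path.join(tmpdir(), 'export-sketch-'));
  const written = await writeStreamToFile(
    Readable.from(['hello ', 'world']),
    dir,
    path.join('nested', 'dir', 'file.txt'),
  );
  return readFile(written, 'utf8');
}
```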


@@ -55,7 +55,7 @@ export class TarTools {
'@push.rocks/smartarchive: When streaming, it is recommended to provide byteLength, if known.', '@push.rocks/smartarchive: When streaming, it is recommended to provide byteLength, if known.',
); );
} else if (optionsArg.filePath) { } else if (optionsArg.filePath) {
const fileStat = await plugins.smartfile.fs.stat(optionsArg.filePath); const fileStat = await plugins.fsPromises.stat(optionsArg.filePath);
contentByteLength = fileStat.size; contentByteLength = fileStat.size;
} }
@@ -109,19 +109,16 @@ export class TarTools {
* @param directoryPath * @param directoryPath
*/ */
public async packDirectory(directoryPath: string) { public async packDirectory(directoryPath: string) {
const fileTree = await plugins.smartfile.fs.listFileTree( const fileTree = await plugins.listFileTree(directoryPath, '**/*');
directoryPath,
'**/*',
);
const pack = await this.getPackStream(); const pack = await this.getPackStream();
for (const filePath of fileTree) { for (const filePath of fileTree) {
const absolutePath = plugins.path.join(directoryPath, filePath); const absolutePath = plugins.path.join(directoryPath, filePath);
const fileStat = await plugins.smartfile.fs.stat(absolutePath); const fileStat = await plugins.fsPromises.stat(absolutePath);
await this.addFileToPack(pack, { await this.addFileToPack(pack, {
byteLength: fileStat.size, byteLength: fileStat.size,
filePath: absolutePath, filePath: absolutePath,
fileName: filePath, fileName: filePath,
content: plugins.smartfile.fsStream.createReadStream(absolutePath), content: plugins.fs.createReadStream(absolutePath),
}); });
} }
return pack; return pack;


@@ -1,8 +1,34 @@
// node native scope // node native scope
import * as path from 'path'; import * as path from 'node:path';
import * as stream from 'stream'; import * as stream from 'node:stream';
import * as fs from 'node:fs';
import * as fsPromises from 'node:fs/promises';
export { path, stream }; export { path, stream, fs, fsPromises };
/**
* List files in a directory recursively, returning paths relative to dirPath.
* Note: the _pattern argument is currently unused; every regular file is returned.
*/
export async function listFileTree(dirPath: string, _pattern: string = '**/*'): Promise<string[]> {
const results: string[] = [];
async function walkDir(currentPath: string, relativePath: string = '') {
const entries = await fsPromises.readdir(currentPath, { withFileTypes: true });
for (const entry of entries) {
const entryRelPath = relativePath ? path.join(relativePath, entry.name) : entry.name;
const entryFullPath = path.join(currentPath, entry.name);
if (entry.isDirectory()) {
await walkDir(entryFullPath, entryRelPath);
} else if (entry.isFile()) {
results.push(entryRelPath);
}
}
}
await walkDir(dirPath);
return results;
}
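
The `listFileTree` helper above walks the directory with `readdir(..., { withFileTypes: true })` and returns every regular file, without applying its `_pattern` argument. If filtering were needed, one possible sketch is a walker that takes an optional file extension. This is an illustration under that assumption, not part of the commit; `listFiles` and `demo` are hypothetical names.

```typescript
import { mkdir, mkdtemp, readdir, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import * as path from 'node:path';

// Hypothetical variant: recursive listing of relative file paths,
// optionally keeping only files whose names end with the given extension.
export async function listFiles(dirPath: string, extension?: string): Promise<string[]> {
  const results: string[] = [];
  async function walk(current: string, relative: string): Promise<void> {
    const entries = await readdir(current, { withFileTypes: true });
    for (const entry of entries) {
      const rel = relative ? path.join(relative, entry.name) : entry.name;
      if (entry.isDirectory()) {
        await walk(path.join(current, entry.name), rel);
      } else if (entry.isFile() && (!extension || entry.name.endsWith(extension))) {
        results.push(rel);
      }
    }
  }
  await walk(dirPath, '');
  // Sort so the result does not depend on directory iteration order.
  return results.sort();
}

// Usage: build a small tree in a temp dir and list only the .txt files.
export async function demo(): Promise<string[]> {
  const dir = await mkdtemp(path.join(tmpdir(), 'list-sketch-'));
  await mkdir(path.join(dir, 'sub'), { recursive: true });
  await writeFile(path.join(dir, 'a.txt'), 'a');
  await writeFile(path.join(dir, 'sub', 'b.ts'), 'b');
  await writeFile(path.join(dir, 'sub', 'c.txt'), 'c');
  return listFiles(dir, '.txt');
}
```

As in the helper from the diff, symbolic links are skipped because their directory entries report neither `isFile()` nor `isDirectory()`.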
// @pushrocks scope // @pushrocks scope
import * as smartfile from '@push.rocks/smartfile'; import * as smartfile from '@push.rocks/smartfile';