fix(core): improve SmartPdf lifecycle management and update dependencies

0.5.6
feat(documentation): comprehensive documentation enhancement and test improvements
2025-08-01 18:25:46 +00:00 · 2025-07-26 16:17:11 +00:00 · 2025-07-25 18:00:23 +00:00 · 2025-05-13 18:39:58 +00:00 · 2025-05-13 18:39:57 +00:00 · 2025-04-03 21:46:40 +00:00
13 changed files with 3559 additions and 1397 deletions
--- a/changelog.md
+++ b/changelog.md
@@ -1,5 +1,67 @@
 # Changelog
 ## 2025-08-01 - 0.5.8 - fix(core)
 Fix SmartPdf lifecycle management and update dependencies
 - Moved SmartPdf instance management to the MultiModalModel base class for better resource sharing
 - Fixed memory leaks by properly implementing cleanup in the base class stop() method
 - Updated SmartAi class to properly stop all providers on shutdown
 - Updated @push.rocks/smartrequest from v2.1.0 to v4.2.1 with migration to new API
 - Enhanced readme with professional documentation and feature matrix
 ## 2025-07-26 - 0.5.7 - fix(provider.openai)
 Fix stream type mismatch in audio method
 - Fixed type error where OpenAI SDK returns a web ReadableStream but the audio method needs to return a Node.js ReadableStream
 - Added conversion using Node.js's built-in Readable.fromWeb() method
 ## 2025-07-25 - 0.5.5 - feat(documentation)
 Comprehensive documentation enhancement and test improvements
 - Completely rewrote readme.md with detailed provider comparisons, advanced usage examples, and performance tips
 - Added comprehensive examples for all supported providers (OpenAI, Anthropic, Perplexity, Groq, XAI, Ollama, Exo)
 - Included detailed sections on chat interactions, streaming, TTS, vision processing, and document analysis
 - Added verbose flag to test script for better debugging
 ## 2025-05-13 - 0.5.4 - fix(provider.openai)
 Update dependency versions, clean test imports, and adjust default OpenAI model configurations
 - Bump dependency versions in package.json (@git.zone/tsbuild, @push.rocks/tapbundle, openai, etc.)
 - Change default chatModel from 'gpt-4o' to 'o4-mini' and visionModel from 'gpt-4o' to 'o4-mini' in provider.openai.ts
 - Remove unused 'expectAsync' import from test file
 ## 2025-04-03 - 0.5.3 - fix(package.json)
 Add explicit packageManager field to package.json
 - Include the packageManager property to specify the pnpm version and checksum.
 - Align package metadata with current standards.
 ## 2025-04-03 - 0.5.2 - fix(readme)
 Remove redundant conclusion section from README to streamline documentation.
 - Eliminated the conclusion block describing SmartAi's capabilities and documentation pointers.
 ## 2025-02-25 - 0.5.1 - fix(OpenAiProvider)
 Corrected audio model ID in OpenAiProvider
 - Fixed audio model identifier from 'o3-mini' to 'tts-1-hd' in the OpenAiProvider's audio method.
 - Addressed minor code formatting issues in test suite for better readability.
 - Corrected spelling errors in test documentation and comments.
 ## 2025-02-25 - 0.5.0 - feat(documentation and configuration)
 Enhanced package and README documentation
 - Expanded the package description to better reflect the library's capabilities.
 - Improved README with detailed usage examples for initialization, chat interactions, streaming chat, audio generation, document analysis, and vision processing.
 - Provided error handling strategies and advanced streaming customization examples.
 ## 2025-02-25 - 0.4.2 - fix(core)
 Fix OpenAI chat streaming and PDF document processing logic.
 - Updated OpenAI chat streaming to handle new async iterable format.
 - Improved PDF document processing by filtering out empty image buffers.
 - Removed unsupported temperature options from OpenAI requests.
 ## 2025-02-25 - 0.4.1 - fix(provider)
 Fix provider modules for consistency
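The 0.5.7 entry above describes converting a web `ReadableStream` into a Node.js stream with `Readable.fromWeb()`. The following is a minimal sketch of that conversion, not the provider's actual code. The helper names `toNodeStream`, `collect`, and `fakeAudioBody` are hypothetical, and the fake body only stands in for whatever the SDK returns. `Readable.fromWeb()` is available from Node.js 17 onward.

```typescript
import { Readable } from 'node:stream';
import { ReadableStream } from 'node:stream/web';

// Hypothetical helper: wrap a web ReadableStream in a Node.js Readable.
function toNodeStream(webStream: ReadableStream<Uint8Array>): Readable {
  return Readable.fromWeb(webStream);
}

// Hypothetical helper: drain a Node.js Readable into a single Buffer.
async function collect(stream: Readable): Promise<Buffer> {
  const chunks: Uint8Array[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk as Uint8Array);
  }
  return Buffer.concat(chunks);
}

// Stand-in for an SDK response body: a web stream that emits two chunks.
function fakeAudioBody(): ReadableStream<Uint8Array> {
  return new ReadableStream<Uint8Array>({
    start(controller) {
      controller.enqueue(new Uint8Array([1, 2, 3]));
      controller.enqueue(new Uint8Array([4, 5]));
      controller.close();
    },
  });
}
```

Once wrapped, the result can be consumed with `for await` or `.pipe()`, which is how the audio test in this change reads the stream.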
--- a/npmextra.json
+++ b/npmextra.json
@@ -5,20 +5,33 @@
      "githost": "code.foss.global",
      "gitscope": "push.rocks",
      "gitrepo": "smartai",
-      "description": "A TypeScript library for integrating and interacting with multiple AI models, offering capabilities for chat and potentially audio responses.",
+      "description": "SmartAi is a versatile TypeScript library designed to facilitate integration and interaction with various AI models, offering functionalities for chat, audio generation, document processing, and vision tasks.",
      "npmPackagename": "@push.rocks/smartai",
      "license": "MIT",
      "projectDomain": "push.rocks",
      "keywords": [
        "AI integration",
        "chatbot",
        "TypeScript",
        "chatbot",
        "OpenAI",
        "Anthropic",
-        "multi-model support",
+        "multi-model",
-        "audio responses",
+        "audio generation",
        "text-to-speech",
-        "streaming chat"
+        "document processing",
        "vision processing",
        "streaming chat",
        "API",
        "multiple providers",
        "AI models",
        "synchronous chat",
        "asynchronous chat",
        "real-time interaction",
        "content analysis",
        "image description",
        "document classification",
        "AI toolkit",
        "provider switching"
      ]
    }
  },
--- a/package.json
+++ b/package.json
@@ -1,37 +1,37 @@
 {
  "name": "@push.rocks/smartai",
-  "version": "0.4.1",
+  "version": "0.5.8",
  "private": false,
-  "description": "A TypeScript library for integrating and interacting with multiple AI models, offering capabilities for chat and potentially audio responses.",
+  "description": "SmartAi is a versatile TypeScript library designed to facilitate integration and interaction with various AI models, offering functionalities for chat, audio generation, document processing, and vision tasks.",
  "main": "dist_ts/index.js",
  "typings": "dist_ts/index.d.ts",
  "type": "module",
  "author": "Task Venture Capital GmbH",
  "license": "MIT",
  "scripts": {
-    "test": "(tstest test/ --web)",
+    "test": "(tstest test/ --web --verbose)",
    "build": "(tsbuild --web --allowimplicitany)",
    "buildDocs": "(tsdoc)"
  },
  "devDependencies": {
-    "@git.zone/tsbuild": "^2.2.1",
+    "@git.zone/tsbuild": "^2.6.4",
-    "@git.zone/tsbundle": "^2.2.5",
+    "@git.zone/tsbundle": "^2.5.1",
    "@git.zone/tsrun": "^1.3.3",
-    "@git.zone/tstest": "^1.0.96",
+    "@git.zone/tstest": "^2.3.2",
    "@push.rocks/qenv": "^6.1.0",
-    "@push.rocks/tapbundle": "^5.5.6",
+    "@push.rocks/tapbundle": "^6.0.3",
-    "@types/node": "^22.13.5"
+    "@types/node": "^22.15.17"
  },
  "dependencies": {
-    "@anthropic-ai/sdk": "^0.37.0",
+    "@anthropic-ai/sdk": "^0.57.0",
    "@push.rocks/smartarray": "^1.1.0",
-    "@push.rocks/smartfile": "^11.2.0",
+    "@push.rocks/smartfile": "^11.2.5",
-    "@push.rocks/smartpath": "^5.0.18",
+    "@push.rocks/smartpath": "^6.0.0",
-    "@push.rocks/smartpdf": "^3.1.8",
+    "@push.rocks/smartpdf": "^3.3.0",
    "@push.rocks/smartpromise": "^4.2.3",
-    "@push.rocks/smartrequest": "^2.0.23",
+    "@push.rocks/smartrequest": "^4.2.1",
    "@push.rocks/webstream": "^1.0.10",
-    "openai": "^4.85.4"
+    "openai": "^5.11.0"
  },
  "repository": {
    "type": "git",
@@ -58,13 +58,33 @@
  ],
  "keywords": [
    "AI integration",
    "chatbot",
    "TypeScript",
    "chatbot",
    "OpenAI",
    "Anthropic",
-    "multi-model support",
+    "multi-model",
-    "audio responses",
+    "audio generation",
    "text-to-speech",
-    "streaming chat"
+    "document processing",
-  ]
+    "vision processing",
    "streaming chat",
    "API",
    "multiple providers",
    "AI models",
    "synchronous chat",
    "asynchronous chat",
    "real-time interaction",
    "content analysis",
    "image description",
    "document classification",
    "AI toolkit",
    "provider switching"
  ],
  "pnpm": {
    "onlyBuiltDependencies": [
      "esbuild",
      "puppeteer"
    ]
  },
  "packageManager": "pnpm@10.7.0+sha512.6b865ad4b62a1d9842b61d674a393903b871d9244954f652b8842c2b553c72176b278f64c463e52d40fff8aba385c235c8c9ecf5cc7de4fd78b8bb6d49633ab6"
 }
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
--- a/readme.md
+++ b/readme.md
@@ -1,329 +1,475 @@
 # @push.rocks/smartai
 **One API to rule them all** 🚀
-[![npm version](https://badge.fury.io/js/%40push.rocks%2Fsmartai.svg)](https://www.npmjs.com/package/@push.rocks/smartai)
+[![npm version](https://img.shields.io/npm/v/@push.rocks/smartai.svg)](https://www.npmjs.com/package/@push.rocks/smartai)
 [![TypeScript](https://img.shields.io/badge/TypeScript-5.x-blue.svg)](https://www.typescriptlang.org/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-SmartAi is a comprehensive TypeScript library that provides a standardized interface for integrating and interacting with multiple AI models. It supports a range of operations from synchronous and streaming chat to audio generation, document processing, and vision tasks.
+SmartAI unifies the world's leading AI providers - OpenAI, Anthropic, Perplexity, Ollama, Groq, XAI, and Exo - under a single, elegant TypeScript interface. Build AI applications at lightning speed without vendor lock-in. 
-## Table of Contents
+## 🎯 Why SmartAI?
- [Features](#features)
+- **🔌 Universal Interface**: Write once, run with any AI provider. Switch between GPT-4, Claude, Llama, or Grok with a single line change.
- [Installation](#installation)
+- **🛡️ Type-Safe**: Full TypeScript support with comprehensive type definitions for all operations
- [Supported AI Providers](#supported-ai-providers)
+- **🌊 Streaming First**: Built for real-time applications with native streaming support
- [Quick Start](#quick-start)
+- **🎨 Multi-Modal**: Seamlessly work with text, images, audio, and documents
- [Usage Examples](#usage-examples)
+- **🏠 Local & Cloud**: Support for both cloud providers and local models via Ollama
-  - [Chat Interactions](#chat-interactions)
+- **⚡ Zero Lock-In**: Your code remains portable across all AI providers
  - [Streaming Chat](#streaming-chat)
  - [Audio Generation](#audio-generation)
  - [Document Processing](#document-processing)
  - [Vision Processing](#vision-processing)
 - [Error Handling](#error-handling)
 - [Development](#development)
  - [Running Tests](#running-tests)
  - [Building the Project](#building-the-project)
 - [Contributing](#contributing)
 - [License](#license)
 - [Legal Information](#legal-information)
-## Features
+## 🚀 Quick Start
 - **Unified API:** Seamlessly integrate multiple AI providers with a consistent interface.
 - **Chat & Streaming:** Support for both synchronous and real-time streaming chat interactions.
 - **Audio & Vision:** Generate audio responses and perform detailed image analysis.
 - **Document Processing:** Analyze PDFs and other documents using vision models.
 - **Extensible:** Easily extend the library to support additional AI providers.
 ## Installation
 To install SmartAi, run the following command:
 ```bash
 npm install @push.rocks/smartai
 ```
 This will add the package to your project’s dependencies.
 ## Supported AI Providers
 SmartAi supports multiple AI providers. Configure each provider with its corresponding token or settings:
 ### OpenAI
 - **Models:** GPT-4, GPT-3.5-turbo, GPT-4-vision-preview
 - **Features:** Chat, Streaming, Audio Generation, Vision, Document Processing
 - **Configuration Example:**
  ```typescript
  openaiToken: 'your-openai-token'
  ```
 ### X.AI
 - **Models:** Grok-2-latest
 - **Features:** Chat, Streaming, Document Processing
 - **Configuration Example:**
  ```typescript
  xaiToken: 'your-xai-token'
  ```
 ### Anthropic
 - **Models:** Claude-3-opus-20240229
 - **Features:** Chat, Streaming, Vision, Document Processing
 - **Configuration Example:**
  ```typescript
  anthropicToken: 'your-anthropic-token'
  ```
 ### Perplexity
 - **Models:** Mixtral-8x7b-instruct
 - **Features:** Chat, Streaming
 - **Configuration Example:**
  ```typescript
  perplexityToken: 'your-perplexity-token'
  ```
 ### Groq
 - **Models:** Llama-3.3-70b-versatile
 - **Features:** Chat, Streaming
 - **Configuration Example:**
  ```typescript
  groqToken: 'your-groq-token'
  ```
 ### Ollama
 - **Models:** Configurable (default: llama2; use llava for vision/document tasks)
 - **Features:** Chat, Streaming, Vision, Document Processing
 - **Configuration Example:**
  ```typescript
  ollama: {
    baseUrl: 'http://localhost:11434', // Optional
    model: 'llama2',                  // Optional
    visionModel: 'llava'               // Optional for vision and document tasks
  }
  ```
 ### Exo
 - **Models:** Configurable (supports LLaMA, Mistral, LlaVA, Qwen, and Deepseek)
 - **Features:** Chat, Streaming
 - **Configuration Example:**
  ```typescript
  exo: {
    baseUrl: 'http://localhost:8080/v1', // Optional
    apiKey: 'your-api-key'               // Optional for local deployments
  }
  ```
 ## Quick Start
 Initialize SmartAi with the provider configurations you plan to use:
 ```typescript
 import { SmartAi } from '@push.rocks/smartai';
-const smartAi = new SmartAi({
+// Initialize with your favorite providers
-  openaiToken: 'your-openai-token',
+const ai = new SmartAi({
-  xaiToken: 'your-xai-token',
+  openaiToken: 'sk-...',
-  anthropicToken: 'your-anthropic-token',
+  anthropicToken: 'sk-ant-...'
  perplexityToken: 'your-perplexity-token',
  groqToken: 'your-groq-token',
  ollama: {
    baseUrl: 'http://localhost:11434',
    model: 'llama2'
  },
  exo: {
    baseUrl: 'http://localhost:8080/v1',
    apiKey: 'your-api-key'
  }
 });
-await smartAi.start();
+await ai.start();
 ```
-## Usage Examples
+// Same API, multiple providers
-
+const response = await ai.openaiProvider.chat({
 ### Chat Interactions
 **Synchronous Chat:**
 ```typescript
 const response = await smartAi.openaiProvider.chat({
  systemMessage: 'You are a helpful assistant.',
-  userMessage: 'What is the capital of France?',
+  userMessage: 'Explain quantum computing in simple terms',
-  messageHistory: [] // Include previous conversation messages if applicable
+  messageHistory: []
 });
 console.log(response.message);
 ```
-### Streaming Chat
+## 📊 Provider Capabilities Matrix
-**Real-Time Streaming:**
+Choose the right provider for your use case:
 | Provider | Chat | Streaming | TTS | Vision | Documents | Highlights |
 |----------|:----:|:---------:|:---:|:------:|:---------:|------------|
 | **OpenAI** | ✅ | ✅ | ✅ | ✅ | ✅ | • GPT-4, DALL-E 3<br>• Industry standard<br>• Most features |
 | **Anthropic** | ✅ | ✅ | ❌ | ✅ | ✅ | • Claude 3 Opus<br>• Superior reasoning<br>• 200k context |
 | **Ollama** | ✅ | ✅ | ❌ | ✅ | ✅ | • 100% local<br>• Privacy-first<br>• No API costs |
 | **XAI** | ✅ | ✅ | ❌ | ❌ | ✅ | • Grok models<br>• Real-time data<br>• Uncensored |
 | **Perplexity** | ✅ | ✅ | ❌ | ❌ | ❌ | • Web-aware<br>• Research-focused<br>• Citations |
 | **Groq** | ✅ | ✅ | ❌ | ❌ | ❌ | • 10x faster<br>• LPU inference<br>• Low latency |
 | **Exo** | ✅ | ✅ | ❌ | ❌ | ❌ | • Distributed<br>• P2P compute<br>• Decentralized |
 ## 🎮 Core Features
 ### 💬 Universal Chat Interface
 Works identically across all providers:
 ```typescript
-const textEncoder = new TextEncoder();
+// Use GPT-4 for complex reasoning
-const textDecoder = new TextDecoder();
+const gptResponse = await ai.openaiProvider.chat({
  systemMessage: 'You are an expert physicist.',
  userMessage: 'Explain the implications of quantum entanglement',
  messageHistory: []
 });
-// Create a transform stream for sending and receiving data
+// Use Claude for safety-critical applications
-const { writable, readable } = new TransformStream();
+const claudeResponse = await ai.anthropicProvider.chat({
-const writer = writable.getWriter();
+  systemMessage: 'You are a medical advisor.',
  userMessage: 'Review this patient data for concerns',
  messageHistory: []
 });
-const message = {
+// Use Groq for lightning-fast responses
-  role: 'user',
+const groqResponse = await ai.groqProvider.chat({
-  content: 'Tell me a story about a brave knight'
+  systemMessage: 'You are a code reviewer.',
-};
+  userMessage: 'Quick! Find the bug in this code: ...',
  messageHistory: []
 });
 ```
-writer.write(textEncoder.encode(JSON.stringify(message) + '\n'));
+### 🌊 Real-Time Streaming
-// Start streaming the response
+Build responsive chat interfaces with token-by-token streaming:
-const stream = await smartAi.openaiProvider.chatStream(readable);
+
 ```typescript
 // Create a chat stream
 const stream = await ai.openaiProvider.chatStream(inputStream);
 const reader = stream.getReader();
 // Display responses as they arrive
 while (true) {
  const { done, value } = await reader.read();
  if (done) break;
-  console.log('AI:', value);
+  
  // Update UI in real-time
  process.stdout.write(value);
 }
 ```
-### Audio Generation
+### 🎙️ Text-to-Speech
-Generate audio (supported by providers like OpenAI):
+Generate natural voices with OpenAI:
 ```typescript
-const audioStream = await smartAi.openaiProvider.audio({
+const audioStream = await ai.openaiProvider.audio({
-  message: 'Hello, this is a test of text-to-speech'
+  message: 'Welcome to the future of AI development!'
 });
-// Process the audio stream, for example, play it or save to a file.
+// Stream directly to speakers
 audioStream.pipe(speakerOutput);
 // Or save to file
 audioStream.pipe(fs.createWriteStream('welcome.mp3'));
 ```
-### Document Processing
+### 👁️ Vision Analysis
-Analyze and extract key information from documents:
+Understand images with multiple providers:
 ```typescript
-// Example using OpenAI
+const image = fs.readFileSync('product-photo.jpg');
-const documentResult = await smartAi.openaiProvider.document({
+
-  systemMessage: 'Classify the document type',
+// OpenAI: General purpose vision
-  userMessage: 'What type of document is this?',
+const gptVision = await ai.openaiProvider.vision({
  image,
  prompt: 'Describe this product and suggest marketing angles'
 });
 // Anthropic: Detailed analysis
 const claudeVision = await ai.anthropicProvider.vision({
  image,
  prompt: 'Identify any safety concerns or defects'
 });
 // Ollama: Private, local analysis
 const ollamaVision = await ai.ollamaProvider.vision({
  image,
  prompt: 'Extract all text and categorize the content'
 });
 ```
 ### 📄 Document Intelligence
 Extract insights from PDFs with AI:
 ```typescript
 const contract = fs.readFileSync('contract.pdf');
 const invoice = fs.readFileSync('invoice.pdf');
 // Analyze documents
 const analysis = await ai.openaiProvider.document({
  systemMessage: 'You are a legal expert.',
  userMessage: 'Compare these documents and highlight key differences',
  messageHistory: [],
-  pdfDocuments: [pdfBuffer] // Uint8Array containing the PDF content
+  pdfDocuments: [contract, invoice]
 });
 ```
-Other providers (e.g., Ollama and Anthropic) follow a similar pattern:
+// Multi-document analysis
-
+const taxDocs = [form1099, w2, receipts];
-```typescript
+const taxAnalysis = await ai.anthropicProvider.document({
-// Using Ollama for document processing
+  systemMessage: 'You are a tax advisor.',
-const ollamaResult = await smartAi.ollamaProvider.document({
+  userMessage: 'Prepare a tax summary from these documents',
  systemMessage: 'You are a document analysis assistant',
  userMessage: 'Extract key information from this document',
  messageHistory: [],
-  pdfDocuments: [pdfBuffer]
+  pdfDocuments: taxDocs
 });
 ```
 ### 🔄 Persistent Conversations
 Maintain context across interactions:
 ```typescript
-// Using Anthropic for document processing
+// Create a coding assistant conversation
-const anthropicResult = await smartAi.anthropicProvider.document({
+const assistant = ai.createConversation('openai');
-  systemMessage: 'Analyze the document',
+await assistant.setSystemMessage('You are an expert TypeScript developer.');
-  userMessage: 'Please extract the main points',
+
-  messageHistory: [],
+// First question
-  pdfDocuments: [pdfBuffer]
+const inputWriter = assistant.getInputStreamWriter();
-});
+await inputWriter.write('How do I implement a singleton pattern?');
 // Continue the conversation
 await inputWriter.write('Now show me how to make it thread-safe');
 // The assistant remembers the entire context
 ```
-### Vision Processing
+## 🚀 Real-World Examples
-Analyze images with vision capabilities:
+### Build a Customer Support Bot
 ```typescript
-// Using OpenAI GPT-4 Vision
+const supportBot = new SmartAi({
-const imageDescription = await smartAi.openaiProvider.vision({
+  anthropicToken: process.env.ANTHROPIC_KEY // Claude for empathetic responses
  image: imageBuffer, // Uint8Array containing image data
  prompt: 'What do you see in this image?'
 });
-// Using Ollama for vision tasks
+async function handleCustomerQuery(query: string, history: ChatMessage[]) {
-const ollamaImageAnalysis = await smartAi.ollamaProvider.vision({
+  try {
-  image: imageBuffer,
+    const response = await supportBot.anthropicProvider.chat({
-  prompt: 'Analyze this image in detail'
+      systemMessage: `You are a helpful customer support agent. 
-});
+                      Be empathetic, professional, and solution-oriented.`,
-
+      userMessage: query,
-// Using Anthropic for vision analysis
+      messageHistory: history
-const anthropicImageAnalysis = await smartAi.anthropicProvider.vision({
+    });
-  image: imageBuffer,
+    
-  prompt: 'Describe the contents of this image'
+    return response.message;
-});
+  } catch (error) {
    // Fallback to another provider if needed
    return await supportBot.openaiProvider.chat({...});
  }
 }
 ```
-## Error Handling
+### Create a Code Review Assistant
 Always wrap API calls in try-catch blocks to manage errors effectively:
 ```typescript
-try {
+const codeReviewer = new SmartAi({
-  const response = await smartAi.openaiProvider.chat({
+  groqToken: process.env.GROQ_KEY // Groq for speed
-    systemMessage: 'You are a helpful assistant.',
+});
-    userMessage: 'Hello!',
+
 async function reviewCode(code: string, language: string) {
  const startTime = Date.now();
  const review = await codeReviewer.groqProvider.chat({
    systemMessage: `You are a ${language} expert. Review code for:
                    - Security vulnerabilities
                    - Performance issues  
                    - Best practices
                    - Potential bugs`,
    userMessage: `Review this code:\n\n${code}`,
    messageHistory: []
  });
-  console.log(response.message);
+  
-} catch (error: any) {
+  console.log(`Review completed in ${Date.now() - startTime}ms`);
-  console.error('AI provider error:', error.message);
+  return review.message;
 }
 ```
-## Development
+### Build a Research Assistant
-### Running Tests
+```typescript
 const researcher = new SmartAi({
  perplexityToken: process.env.PERPLEXITY_KEY
 });
-To run the test suite, use the following command:
+async function research(topic: string) {
-
+  // Perplexity excels at web-aware research
-```bash
+  const findings = await researcher.perplexityProvider.chat({
-npm run test
+    systemMessage: 'You are a research assistant. Provide factual, cited information.',
    userMessage: `Research the latest developments in ${topic}`,
    messageHistory: []
  });
  return findings.message;
 }
 ```
-Ensure your environment is configured with the appropriate tokens and settings for the providers you are testing.
+### Local AI for Sensitive Data
-### Building the Project
+```typescript
 const localAI = new SmartAi({
  ollama: {
    baseUrl: 'http://localhost:11434',
    model: 'llama2',
    visionModel: 'llava'
  }
 });
-Compile the TypeScript code and build the package using:
+// Process sensitive documents without leaving your infrastructure
-
+async function analyzeSensitiveDoc(pdfBuffer: Buffer) {
-```bash
+  const analysis = await localAI.ollamaProvider.document({
-npm run build
+    systemMessage: 'Extract and summarize key information.',
    userMessage: 'Analyze this confidential document',
    messageHistory: [],
    pdfDocuments: [pdfBuffer]
  });
  // Data never leaves your servers
  return analysis.message;
 }
 ```
-This command prepares the library for distribution.
+## ⚡ Performance Tips
-## Contributing
+### 1. Provider Selection Strategy
-Contributions are welcome! Please follow these steps:
+```typescript
 class SmartAIRouter {
  constructor(private ai: SmartAi) {}
  async query(message: string, requirements: {
    speed?: boolean;
    accuracy?: boolean;
    cost?: boolean;
    privacy?: boolean;
  }) {
    if (requirements.privacy) {
      return this.ai.ollamaProvider.chat({...}); // Local only
    }
    if (requirements.speed) {
      return this.ai.groqProvider.chat({...}); // 10x faster
    }
    if (requirements.accuracy) {
      return this.ai.anthropicProvider.chat({...}); // Best reasoning
    }
    // Default fallback
    return this.ai.openaiProvider.chat({...});
  }
 }
 ```
-1. Fork the repository.
+### 2. Streaming for Large Responses
-2. Create a feature branch:  
+
-   ```bash
+```typescript
-   git checkout -b feature/my-feature
+// Don't wait for the entire response
-   ```
+async function streamResponse(userQuery: string) {
-3. Commit your changes with clear messages:  
+  const stream = await ai.openaiProvider.chatStream(createInputStream(userQuery));
-   ```bash
+  
-   git commit -m 'Add new feature'
+  // Process tokens as they arrive
-   ```
+  for await (const chunk of stream) {
-4. Push your branch to your fork:  
+    updateUI(chunk); // Immediate feedback
-   ```bash
+    await processChunk(chunk); // Runs sequentially: the next chunk waits for this call
-   git push origin feature/my-feature
+  }
-   ```
+}
-5. Open a Pull Request with a detailed description of your changes.
+```
 ### 3. Parallel Multi-Provider Queries
 ```typescript
 // Get the best answer from multiple AIs
 async function consensusQuery(question: string) {
  const providers = [
    ai.openaiProvider.chat({...}),
    ai.anthropicProvider.chat({...}),
    ai.perplexityProvider.chat({...})
  ];
  const responses = await Promise.all(providers);
  return synthesizeResponses(responses);
 }
 ```
 ## 🛠️ Advanced Features
 ### Custom Streaming Transformations
 ```typescript
 // Add real-time translation
 const translationStream = new TransformStream({
  async transform(chunk, controller) {
    const translated = await translateChunk(chunk);
    controller.enqueue(translated);
  }
 });
 const responseStream = await ai.openaiProvider.chatStream(input);
 const translatedStream = responseStream.pipeThrough(translationStream);
 ```
 ### Error Handling & Fallbacks
 ```typescript
 class ResilientAI {
  private providers = ['openai', 'anthropic', 'groq'];
  async query(opts: ChatOptions): Promise<ChatResponse> {
    for (const provider of this.providers) {
      try {
        return await this.ai[`${provider}Provider`].chat(opts);
      } catch (error) {
        console.warn(`${provider} failed, trying next...`);
        continue;
      }
    }
    throw new Error('All providers failed');
  }
 }
 ```
 ### Token Counting & Cost Management
 ```typescript
 // Track usage across providers
 class UsageTracker {
  async trackedChat(provider: string, options: ChatOptions) {
    const start = Date.now();
    const response = await ai[`${provider}Provider`].chat(options);
    const usage = {
      provider,
      duration: Date.now() - start,
      inputTokens: estimateTokens(options),
      outputTokens: estimateTokens(response.message)
    };
    await this.logUsage(usage);
    return response;
  }
 }
 ```
 ## 📦 Installation & Setup
 ### Prerequisites
 - Node.js 16+ 
 - TypeScript 4.5+
 - API keys for your chosen providers
 ### Environment Setup
 ```bash
 # Install
 npm install @push.rocks/smartai
 # Set up environment variables
 export OPENAI_API_KEY=sk-...
 export ANTHROPIC_API_KEY=sk-ant-...
 export PERPLEXITY_API_KEY=pplx-...
 # ... etc
 ```
 ### TypeScript Configuration
 ```json
 {
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "lib": ["ES2022"],
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true
  }
 }
 ```
 ## 🎯 Choosing the Right Provider
 | Use Case | Recommended Provider | Why |
 |----------|---------------------|-----|
 | **General Purpose** | OpenAI | Most features, stable, well-documented |
 | **Complex Reasoning** | Anthropic | Superior logical thinking, safer outputs |
 | **Research & Facts** | Perplexity | Web-aware, provides citations |
 | **Speed Critical** | Groq | 10x faster inference, sub-second responses |
 | **Privacy Critical** | Ollama | 100% local, no data leaves your servers |
 | **Real-time Data** | XAI | Access to current information |
 | **Cost Sensitive** | Ollama/Exo | Free (local) or distributed compute |
 ## 🤝 Contributing
 SmartAI is open source and welcomes contributions! Visit our [repository](https://code.foss.global/push.rocks/smartai) to:
 - Report issues
 - Submit pull requests
 - Request features
 - Join discussions
 ## 📈 Roadmap
 - [ ] Streaming function calls
 - [ ] Image generation support
 - [ ] Voice input processing
 - [ ] Fine-tuning integration
 - [ ] Embedding support
 - [ ] Agent framework
 - [ ] More providers (Cohere, AI21, etc.)
 ## License and Legal Information
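Several README snippets in this diff (the `ResilientAI` class, the support-bot fallback) use `{...}` placeholders and an undeclared `this.ai`, so they do not compile as written. Below is a self-contained sketch of the same fallback idea. The `ChatMessage`, `ChatOptions`, `ChatResponse`, and `ChatProvider` shapes are assumptions inferred from the fields used in the README examples, and `chatWithFallback` plus the two stub providers are hypothetical names introduced only for illustration.

```typescript
// Shapes assumed for this sketch; the library's real interfaces may differ.
interface ChatMessage {
  role: string;
  content: string;
}
interface ChatOptions {
  systemMessage: string;
  userMessage: string;
  messageHistory: ChatMessage[];
}
interface ChatResponse {
  message: string;
}
interface ChatProvider {
  chat(options: ChatOptions): Promise<ChatResponse>;
}

// Try each provider in order and return the first successful response.
// If every provider fails, throw one error that lists each failure.
async function chatWithFallback(
  providers: Array<[name: string, provider: ChatProvider]>,
  options: ChatOptions,
): Promise<ChatResponse> {
  const failures: string[] = [];
  for (const [name, provider] of providers) {
    try {
      return await provider.chat(options);
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(`${name}: ${reason}`);
    }
  }
  throw new Error(`All providers failed (${failures.join('; ')})`);
}

// Stub providers for demonstration only.
const failing: ChatProvider = {
  chat: async () => {
    throw new Error('rate limited');
  },
};
const working: ChatProvider = {
  chat: async (options) => ({ message: `echo: ${options.userMessage}` }),
};
```

With a real `SmartAi` instance, the provider list could plausibly be built from `ai.openaiProvider`, `ai.anthropicProvider`, and so on, provided those objects expose a compatible `chat()` method. Passing the providers in explicitly avoids indexing the instance by a computed string key.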
--- a/test/test.ts
+++ b/test/test.ts
@@ -1,4 +1,4 @@
-import { expect, expectAsync, tap } from '@push.rocks/tapbundle';
+import { expect, tap } from '@push.rocks/tapbundle';
 import * as qenv from '@push.rocks/qenv';
 import * as smartrequest from '@push.rocks/smartrequest';
 import * as smartfile from '@push.rocks/smartfile';
@@ -21,8 +21,7 @@ tap.test('should create chat response with openai', async () => {
  const response = await testSmartai.openaiProvider.chat({
    systemMessage: 'Hello',
    userMessage: userMessage,
-    messageHistory: [
-    ],
+    messageHistory: [],
  });
  console.log(`userMessage: ${userMessage}`);
  console.log(response.message);
@@ -30,12 +29,14 @@ tap.test('should create chat response with openai', async () => {
 tap.test('should document a pdf', async () => {
  const pdfUrl = 'https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf';
-  const pdfResponse = await smartrequest.getBinary(pdfUrl);
+  const pdfResponse = await smartrequest.SmartRequest.create()
    .url(pdfUrl)
    .get();
  const result = await testSmartai.openaiProvider.document({
    systemMessage: 'Classify the document. Only the following answers are allowed: "invoice", "bank account statement", "contract", "other". The answer should only contain the keyword for machine use.',
    userMessage: "Classify the document.",
    messageHistory: [],
-    pdfDocuments: [pdfResponse.body],
+    pdfDocuments: [Buffer.from(await pdfResponse.arrayBuffer())],
  });
  console.log(result);
 });
@@ -55,7 +56,7 @@ tap.test('should recognize companies in a pdf', async () => {
            address: string;
            city: string;
            country: string;
-            EU: boolean; // wether the entity is within EU
+            EU: boolean; // whether the entity is within EU
          };
          entityReceiver: {
            type: 'official state entity' | 'company' | 'person';
@@ -63,7 +64,7 @@ tap.test('should recognize companies in a pdf', async () => {
            address: string;
            city: string;
            country: string;
-            EU: boolean; // wether the entity is within EU
+            EU: boolean; // whether the entity is within EU
          };
          date: string; // the date of the document as YYYY-MM-DD
          title: string; // a short title, suitable for a filename
@@ -75,10 +76,27 @@ tap.test('should recognize companies in a pdf', async () => {
    pdfDocuments: [pdfBuffer],
  });
  console.log(result);
-})
+});
+tap.test('should create audio response with openai', async () => {
+  // Call the audio method with a sample message.
+  const audioStream = await testSmartai.openaiProvider.audio({
+    message: 'This is a test of audio generation.',
+  });
+  // Read all chunks from the stream.
+  const chunks: Uint8Array[] = [];
+  for await (const chunk of audioStream) {
+    chunks.push(chunk as Uint8Array);
+  }
+  const audioBuffer = Buffer.concat(chunks);
+  await smartfile.fs.toFs(audioBuffer, './.nogit/testoutput.mp3');
+  console.log(`Audio Buffer length: ${audioBuffer.length}`);
+  // Assert that the resulting buffer is not empty.
+  expect(audioBuffer.length).toBeGreaterThan(0);
+});
 tap.test('should stop the smartai instance', async () => {
  await testSmartai.stop();
 });
-export default tap.start();
+export default tap.start();
--- a/ts/00_commitinfo_data.ts
+++ b/ts/00_commitinfo_data.ts
@@ -3,6 +3,6 @@
 */
 export const commitinfo = {
  name: '@push.rocks/smartai',
-  version: '0.4.1',
+  version: '0.5.4',
-  description: 'A TypeScript library for integrating and interacting with multiple AI models, offering capabilities for chat and potentially audio responses.'
+  description: 'SmartAi is a versatile TypeScript library designed to facilitate integration and interaction with various AI models, offering functionalities for chat, audio generation, document processing, and vision tasks.'
 }
--- a/ts/abstract.classes.multimodal.ts
+++ b/ts/abstract.classes.multimodal.ts
@@ -1,3 +1,5 @@
 import * as plugins from './plugins.js';
 /**
 * Message format for chat interactions
 */
@@ -28,17 +30,30 @@ export interface ChatResponse {
 * Provides a common interface for different AI providers (OpenAI, Anthropic, Perplexity, Ollama)
 */
 export abstract class MultiModalModel {
+  /**
+   * SmartPdf instance for document processing
+   * Shared across all methods that need PDF functionality
+   */
+  protected smartpdfInstance: plugins.smartpdf.SmartPdf;
  /**
   * Initializes the model and any necessary resources
   * Should be called before using any other methods
   */
-  abstract start(): Promise<void>;
+  public async start(): Promise<void> {
+    this.smartpdfInstance = new plugins.smartpdf.SmartPdf();
+    await this.smartpdfInstance.start();
+  }
  /**
   * Cleans up any resources used by the model
   * Should be called when the model is no longer needed
   */
-  abstract stop(): Promise<void>;
+  public async stop(): Promise<void> {
+    if (this.smartpdfInstance) {
+      await this.smartpdfInstance.stop();
+    }
+  }
  /**
   * Synchronous chat interaction with the model
--- a/ts/classes.smartai.ts
+++ b/ts/classes.smartai.ts
@@ -91,7 +91,29 @@ export class SmartAi {
    }
  }
-  public async stop() {}
+  public async stop() {
+    if (this.openaiProvider) {
+      await this.openaiProvider.stop();
+    }
+    if (this.anthropicProvider) {
+      await this.anthropicProvider.stop();
+    }
+    if (this.perplexityProvider) {
+      await this.perplexityProvider.stop();
+    }
+    if (this.groqProvider) {
+      await this.groqProvider.stop();
+    }
+    if (this.xaiProvider) {
+      await this.xaiProvider.stop();
+    }
+    if (this.ollamaProvider) {
+      await this.ollamaProvider.stop();
+    }
+    if (this.exoProvider) {
+      await this.exoProvider.stop();
+    }
+  }
  /**
   * create a new conversation
--- a/ts/provider.anthropic.ts
+++ b/ts/provider.anthropic.ts
@@ -20,12 +20,15 @@ export class AnthropicProvider extends MultiModalModel {
  }
  async start() {
+    await super.start();
    this.anthropicApiClient = new plugins.anthropic.default({
      apiKey: this.options.anthropicToken,
    });
  }
-  async stop() {}
+  async stop() {
+    await super.stop();
+  }
  public async chatStream(input: ReadableStream<Uint8Array>): Promise<ReadableStream<string>> {
    // Create a TextDecoder to handle incoming chunks
@@ -178,11 +181,10 @@ export class AnthropicProvider extends MultiModalModel {
    messageHistory: ChatMessage[];
  }): Promise<{ message: any }> {
    // Convert PDF documents to images using SmartPDF
-    const smartpdfInstance = new plugins.smartpdf.SmartPdf();
    let documentImageBytesArray: Uint8Array[] = [];
    for (const pdfDocument of optionsArg.pdfDocuments) {
-      const documentImageArray = await smartpdfInstance.convertPDFToPngBytes(pdfDocument);
+      const documentImageArray = await this.smartpdfInstance.convertPDFToPngBytes(pdfDocument);
      documentImageBytesArray = documentImageBytesArray.concat(documentImageArray);
    }
--- a/ts/provider.ollama.ts
+++ b/ts/provider.ollama.ts
@@ -24,6 +24,7 @@ export class OllamaProvider extends MultiModalModel {
  }
  async start() {
+    await super.start();
    // Verify Ollama is running
    try {
      const response = await fetch(`${this.baseUrl}/api/tags`);
@@ -35,7 +36,9 @@ export class OllamaProvider extends MultiModalModel {
    }
  }
-  async stop() {}
+  async stop() {
+    await super.stop();
+  }
  public async chatStream(input: ReadableStream<Uint8Array>): Promise<ReadableStream<string>> {
    // Create a TextDecoder to handle incoming chunks
@@ -205,11 +208,10 @@ export class OllamaProvider extends MultiModalModel {
    messageHistory: ChatMessage[];
  }): Promise<{ message: any }> {
    // Convert PDF documents to images using SmartPDF
-    const smartpdfInstance = new plugins.smartpdf.SmartPdf();
    let documentImageBytesArray: Uint8Array[] = [];
    for (const pdfDocument of optionsArg.pdfDocuments) {
-      const documentImageArray = await smartpdfInstance.convertPDFToPngBytes(pdfDocument);
+      const documentImageArray = await this.smartpdfInstance.convertPDFToPngBytes(pdfDocument);
      documentImageBytesArray = documentImageBytesArray.concat(documentImageArray);
    }
--- a/ts/provider.openai.ts
+++ b/ts/provider.openai.ts
@@ -1,5 +1,6 @@
 import * as plugins from './plugins.js';
 import * as paths from './paths.js';
+import { Readable } from 'stream';
 // Custom type definition for chat completion messages
 export type TChatCompletionRequestMessage = { 
@@ -20,7 +21,6 @@ export interface IOpenaiProviderOptions {
 export class OpenAiProvider extends MultiModalModel {
  private options: IOpenaiProviderOptions;
  public openAiApiClient: plugins.openai.default;
-  public smartpdfInstance: plugins.smartpdf.SmartPdf;
  constructor(optionsArg: IOpenaiProviderOptions) {
    super();
@@ -28,14 +28,16 @@ export class OpenAiProvider extends MultiModalModel {
  }
  public async start() {
+    await super.start();
    this.openAiApiClient = new plugins.openai.default({
      apiKey: this.options.openaiToken,
      dangerouslyAllowBrowser: true,
    });
-    this.smartpdfInstance = new plugins.smartpdf.SmartPdf();
  }
-  public async stop() {}
+  public async stop() {
+    await super.stop();
+  }
  public async chatStream(input: ReadableStream<Uint8Array>): Promise<ReadableStream<string>> {
    // Create a TextDecoder to handle incoming chunks
@@ -75,21 +77,23 @@ export class OpenAiProvider extends MultiModalModel {
        // If we have a complete message, send it to OpenAI
        if (currentMessage) {
          const messageToSend = { role: "user" as const, content: currentMessage.content };
-          const stream = await this.openAiApiClient.chat.completions.create({
+          const chatModel = this.options.chatModel ?? 'o3-mini';
-            model: this.options.chatModel ?? 'o3-mini',
+          const requestParams: any = {
-            temperature: 0,
+            model: chatModel,
            messages: [messageToSend],
            stream: true,
-          });
+          };
-
+          // Temperature is omitted since the model does not support it.
+          const stream = await this.openAiApiClient.chat.completions.create(requestParams);
+          // Explicitly cast the stream as an async iterable to satisfy TypeScript.
+          const streamAsyncIterable = stream as unknown as AsyncIterableIterator<any>;
          // Process each chunk from OpenAI
-          for await (const chunk of stream) {
+          for await (const chunk of streamAsyncIterable) {
            const content = chunk.choices[0]?.delta?.content;
            if (content) {
              controller.enqueue(content);
            }
          }
          currentMessage = null;
        }
      },
@@ -119,15 +123,17 @@ export class OpenAiProvider extends MultiModalModel {
      content: string;
    }[];
  }) {
-    const result = await this.openAiApiClient.chat.completions.create({
+    const chatModel = this.options.chatModel ?? 'o3-mini';
-      model: this.options.chatModel ?? 'o3-mini',
+    const requestParams: any = {
-      temperature: 0,
+      model: chatModel,
      messages: [
        { role: 'system', content: optionsArg.systemMessage },
        ...optionsArg.messageHistory,
        { role: 'user', content: optionsArg.userMessage },
      ],
-    });
+    };
+    // Temperature parameter removed to avoid unsupported error.
+    const result = await this.openAiApiClient.chat.completions.create(requestParams);
    return {
      role: result.choices[0].message.role as 'assistant',
      message: result.choices[0].message.content,
@@ -137,14 +143,15 @@ export class OpenAiProvider extends MultiModalModel {
  public async audio(optionsArg: { message: string }): Promise<NodeJS.ReadableStream> {
    const done = plugins.smartpromise.defer<NodeJS.ReadableStream>();
    const result = await this.openAiApiClient.audio.speech.create({
-      model: this.options.audioModel ?? 'o3-mini',
+      model: this.options.audioModel ?? 'tts-1-hd',
      input: optionsArg.message,
      voice: 'nova',
      response_format: 'mp3',
      speed: 1,
    });
    const stream = result.body;
-    done.resolve(stream);
+    const nodeStream = Readable.fromWeb(stream as any);
+    done.resolve(nodeStream);
    return done.promise;
  }
@@ -159,6 +166,7 @@ export class OpenAiProvider extends MultiModalModel {
  }) {
    let pdfDocumentImageBytesArray: Uint8Array[] = [];
    // Convert each PDF into one or more image byte arrays.
    for (const pdfDocument of optionsArg.pdfDocuments) {
      const documentImageArray = await this.smartpdfInstance.convertPDFToPngBytes(pdfDocument);
      pdfDocumentImageBytesArray = pdfDocumentImageBytesArray.concat(documentImageArray);
@@ -167,19 +175,18 @@ export class OpenAiProvider extends MultiModalModel {
    console.log(`image smartfile array`);
    console.log(pdfDocumentImageBytesArray.map((smartfile) => smartfile.length));
-    const smartfileArray = await plugins.smartarray.map(
+    // Filter out any empty buffers to avoid sending invalid image URLs.
-      pdfDocumentImageBytesArray,
+    const validImageBytesArray = pdfDocumentImageBytesArray.filter(imageBytes => imageBytes && imageBytes.length > 0);
-      async (pdfDocumentImageBytes) => {
+    const imageAttachments = validImageBytesArray.map(imageBytes => ({
-        return plugins.smartfile.SmartFile.fromBuffer(
+      type: 'image_url',
-          'pdfDocumentImage.jpg',
+      image_url: {
-          Buffer.from(pdfDocumentImageBytes)
+        url: 'data:image/png;base64,' + Buffer.from(imageBytes).toString('base64'),
-        );
+      },
-      }
+    }));
-    );
-    const result = await this.openAiApiClient.chat.completions.create({
+    const chatModel = this.options.chatModel ?? 'o4-mini';
-      model: this.options.chatModel ?? 'o3-mini',
+    const requestParams: any = {
-      temperature: 0,
+      model: chatModel,
      messages: [
        { role: 'system', content: optionsArg.systemMessage },
        ...optionsArg.messageHistory,
@@ -187,31 +194,22 @@ export class OpenAiProvider extends MultiModalModel {
          role: 'user',
          content: [
            { type: 'text', text: optionsArg.userMessage },
-            ...(() => {
+            ...imageAttachments,
-              const returnArray = [];
-              for (const imageBytes of pdfDocumentImageBytesArray) {
-                returnArray.push({
-                  type: 'image_url',
-                  image_url: {
-                    url: 'data:image/png;base64,' + Buffer.from(imageBytes).toString('base64'),
-                  },
-                });
-              }
-              return returnArray;
-            })(),
          ],
        },
      ],
-    });
+    };
+    // Temperature parameter removed.
+    const result = await this.openAiApiClient.chat.completions.create(requestParams);
    return {
      message: result.choices[0].message,
    };
  }
  public async vision(optionsArg: { image: Buffer; prompt: string }): Promise<string> {
-    const result = await this.openAiApiClient.chat.completions.create({
+    const visionModel = this.options.visionModel ?? '04-mini';
-      model: this.options.visionModel ?? 'o3-mini',
+    const requestParams: any = {
-      temperature: 0,
+      model: visionModel,
      messages: [
        {
          role: 'user',
@@ -227,8 +225,8 @@ export class OpenAiProvider extends MultiModalModel {
        }
      ],
      max_tokens: 300
-    });
+    };
-
+    const result = await this.openAiApiClient.chat.completions.create(requestParams);
    return result.choices[0].message.content || '';
  }
 }
--- a/ts/provider.xai.ts
+++ b/ts/provider.xai.ts
@@ -11,7 +11,6 @@ export interface IXAIProviderOptions {
 export class XAIProvider extends MultiModalModel {
  private options: IXAIProviderOptions;
  public openAiApiClient: plugins.openai.default;
-  public smartpdfInstance: plugins.smartpdf.SmartPdf;
  constructor(optionsArg: IXAIProviderOptions) {
    super();
@@ -19,14 +18,16 @@ export class XAIProvider extends MultiModalModel {
  }
  public async start() {
+    await super.start();
    this.openAiApiClient = new plugins.openai.default({
      apiKey: this.options.xaiToken,
      baseURL: 'https://api.x.ai/v1',
    });
-    this.smartpdfInstance = new plugins.smartpdf.SmartPdf();
  }
-  public async stop() {}
+  public async stop() {
+    await super.stop();
+  }
  public async chatStream(input: ReadableStream<Uint8Array>): Promise<ReadableStream<string>> {
    // Create a TextDecoder to handle incoming chunks
Author	SHA1	Message	Date
Juergen Kunz	0b2a058550	fix(core): improve SmartPdf lifecycle management and update dependencies	2025-08-01 18:25:46 +00:00
Juergen Kunz	88d15c89e5	0.5.6	2025-07-26 16:17:11 +00:00
Juergen Kunz	4bf7113334	feat(documentation): comprehensive documentation enhancement and test improvements	2025-07-25 18:00:23 +00:00
Philipp Kunz	6bdbeae144	0.5.4	2025-05-13 18:39:58 +00:00
Philipp Kunz	09c27379cb	fix(provider.openai): Update dependency versions, clean test imports, and adjust default OpenAI model configurations	2025-05-13 18:39:57 +00:00
Philipp Kunz	2bc6f7ee5e	0.5.3	2025-04-03 21:46:40 +00:00
Philipp Kunz	0ac50d647d	fix(package.json): Add explicit packageManager field to package.json	2025-04-03 21:46:40 +00:00
Philipp Kunz	5f9ffc7356	0.5.2	2025-04-03 21:46:15 +00:00
Philipp Kunz	502b665224	fix(readme): Remove redundant conclusion section from README to streamline documentation.	2025-04-03 21:46:14 +00:00
Philipp Kunz	bda0d7ed7e	0.5.1	2025-02-25 19:15:32 +00:00
Philipp Kunz	de2a60d12f	fix(OpenAiProvider): Corrected audio model ID in OpenAiProvider	2025-02-25 19:15:32 +00:00
Philipp Kunz	5b3a93a43a	0.5.0	2025-02-25 19:04:40 +00:00
Philipp Kunz	6b241f8889	feat(documentation and configuration): Enhanced package and README documentation	2025-02-25 19:04:40 +00:00
Philipp Kunz	0a80ac0a8a	0.4.2	2025-02-25 18:23:28 +00:00
Philipp Kunz	6ce442354e	fix(core): Fix OpenAI chat streaming and PDF document processing logic.	2025-02-25 18:23:28 +00:00