20 Commits

Author SHA1 Message Date
Juergen Kunz
574f7a594c fix(documentation): remove contribution section from readme
Some checks failed
Default (tags) / security (push) Failing after 23s
Default (tags) / test (push) Failing after 12s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-08-01 18:37:26 +00:00
Juergen Kunz
0b2a058550 fix(core): improve SmartPdf lifecycle management and update dependencies
Some checks failed
Default (tags) / security (push) Failing after 19s
Default (tags) / test (push) Failing after 16s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-08-01 18:25:46 +00:00
Juergen Kunz
88d15c89e5 0.5.6
Some checks failed
Default (tags) / security (push) Failing after 24s
Default (tags) / test (push) Failing after 13s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-07-26 16:17:11 +00:00
Juergen Kunz
4bf7113334 feat(documentation): comprehensive documentation enhancement and test improvements
Some checks failed
Default (tags) / security (push) Failing after 25s
Default (tags) / test (push) Failing after 12s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-07-25 18:00:23 +00:00
6bdbeae144 0.5.4 2025-05-13 18:39:58 +00:00
09c27379cb fix(provider.openai): Update dependency versions, clean test imports, and adjust default OpenAI model configurations 2025-05-13 18:39:57 +00:00
2bc6f7ee5e 0.5.3 2025-04-03 21:46:40 +00:00
0ac50d647d fix(package.json): Add explicit packageManager field to package.json 2025-04-03 21:46:40 +00:00
5f9ffc7356 0.5.2 2025-04-03 21:46:15 +00:00
502b665224 fix(readme): Remove redundant conclusion section from README to streamline documentation. 2025-04-03 21:46:14 +00:00
bda0d7ed7e 0.5.1 2025-02-25 19:15:32 +00:00
de2a60d12f fix(OpenAiProvider): Corrected audio model ID in OpenAiProvider 2025-02-25 19:15:32 +00:00
5b3a93a43a 0.5.0 2025-02-25 19:04:40 +00:00
6b241f8889 feat(documentation and configuration): Enhanced package and README documentation 2025-02-25 19:04:40 +00:00
0a80ac0a8a 0.4.2 2025-02-25 18:23:28 +00:00
6ce442354e fix(core): Fix OpenAI chat streaming and PDF document processing logic. 2025-02-25 18:23:28 +00:00
9b38a3c06e 0.4.1 2025-02-25 13:01:23 +00:00
5dead05324 fix(provider): Fix provider modules for consistency 2025-02-25 13:01:23 +00:00
6916dd9e2a 0.4.0 2025-02-08 12:08:14 +01:00
f89888a542 feat(core): Added support for Exo AI provider 2025-02-08 12:08:14 +01:00
16 changed files with 7443 additions and 2009 deletions


@@ -1,5 +1,87 @@
# Changelog
## 2025-08-01 - 0.5.9 - fix(documentation)
Remove contribution section from readme
- Removed the contribution section from readme.md as requested
- Kept the roadmap section for future development plans
## 2025-08-01 - 0.5.8 - fix(core)
Fix SmartPdf lifecycle management and update dependencies
- Moved SmartPdf instance management to the MultiModalModel base class for better resource sharing
- Fixed memory leaks by properly implementing cleanup in the base class stop() method
- Updated SmartAi class to properly stop all providers on shutdown
- Updated @push.rocks/smartrequest from v2.1.0 to v4.2.1 with migration to new API
- Enhanced readme with professional documentation and feature matrix
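As an illustration of the smartrequest migration mentioned above, a condensed sketch based on the updated test file further down in this diff (the v2.x line is the old call being replaced):

```typescript
import * as smartrequest from '@push.rocks/smartrequest';

// Before (smartrequest v2.x): const pdfResponse = await smartrequest.getBinary(pdfUrl);
// After (smartrequest v4.x): fluent builder API, binary body read via arrayBuffer()
const pdfResponse = await smartrequest.SmartRequest.create()
  .url('https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf')
  .get();
const pdfBuffer = Buffer.from(await pdfResponse.arrayBuffer());
```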
## 2025-07-26 - 0.5.7 - fix(provider.openai)
Fix stream type mismatch in audio method
- Fixed type error where OpenAI SDK returns a web ReadableStream but the audio method needs to return a Node.js ReadableStream
- Added conversion using Node.js's built-in Readable.fromWeb() method
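A minimal sketch of that conversion, mirroring the change to the audio method in `provider.openai.ts` shown later in this diff (the `as any` cast is taken from that change):

```typescript
import { Readable } from 'stream';

// Wrap a web ReadableStream (as returned by the OpenAI SDK) into a Node.js stream.
function toNodeStream(webStream: ReadableStream<Uint8Array>): NodeJS.ReadableStream {
  return Readable.fromWeb(webStream as any);
}
```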
## 2025-07-25 - 0.5.5 - feat(documentation)
Comprehensive documentation enhancement and test improvements
- Completely rewrote readme.md with detailed provider comparisons, advanced usage examples, and performance tips
- Added comprehensive examples for all supported providers (OpenAI, Anthropic, Perplexity, Groq, XAI, Ollama, Exo)
- Included detailed sections on chat interactions, streaming, TTS, vision processing, and document analysis
- Added verbose flag to test script for better debugging
## 2025-05-13 - 0.5.4 - fix(provider.openai)
Update dependency versions, clean test imports, and adjust default OpenAI model configurations
- Bump dependency versions in package.json (@git.zone/tsbuild, @push.rocks/tapbundle, openai, etc.)
- Change default chatModel from 'gpt-4o' to 'o4-mini' and visionModel from 'gpt-4o' to '04-mini' in provider.openai.ts
- Remove unused 'expectAsync' import from test file
## 2025-04-03 - 0.5.3 - fix(package.json)
Add explicit packageManager field to package.json
- Include the packageManager property to specify the pnpm version and checksum.
- Align package metadata with current standards.
## 2025-04-03 - 0.5.2 - fix(readme)
Remove redundant conclusion section from README to streamline documentation.
- Eliminated the conclusion block describing SmartAi's capabilities and documentation pointers.
## 2025-02-25 - 0.5.1 - fix(OpenAiProvider)
Corrected audio model ID in OpenAiProvider
- Fixed audio model identifier from 'o3-mini' to 'tts-1-hd' in the OpenAiProvider's audio method.
- Addressed minor code formatting issues in test suite for better readability.
- Corrected spelling errors in test documentation and comments.
## 2025-02-25 - 0.5.0 - feat(documentation and configuration)
Enhanced package and README documentation
- Expanded the package description to better reflect the library's capabilities.
- Improved README with detailed usage examples for initialization, chat interactions, streaming chat, audio generation, document analysis, and vision processing.
- Provided error handling strategies and advanced streaming customization examples.
## 2025-02-25 - 0.4.2 - fix(core)
Fix OpenAI chat streaming and PDF document processing logic.
- Updated OpenAI chat streaming to handle new async iterable format.
- Improved PDF document processing by filtering out empty image buffers.
- Removed unsupported temperature options from OpenAI requests.
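The empty-buffer filtering referred to here matches the logic that appears in the `provider.openai.ts` diff further down; a condensed excerpt:

```typescript
// Drop zero-length page renders so no invalid data URLs are sent to the API.
const validImageBytesArray = pdfDocumentImageBytesArray.filter(
  (imageBytes) => imageBytes && imageBytes.length > 0,
);
const imageAttachments = validImageBytesArray.map((imageBytes) => ({
  type: 'image_url',
  image_url: {
    url: 'data:image/png;base64,' + Buffer.from(imageBytes).toString('base64'),
  },
}));
```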
## 2025-02-25 - 0.4.1 - fix(provider)
Fix provider modules for consistency
- Updated TypeScript interfaces and options in provider modules for better type safety.
- Modified transform stream handlers in Exo, Groq, and Ollama providers for consistency.
- Added optional model options to OpenAI provider for custom model usage.
## 2025-02-08 - 0.4.0 - feat(core)
Added support for Exo AI provider
- Introduced ExoProvider with chat functionalities.
- Updated SmartAi class to initialize ExoProvider.
- Extended Conversation class to support ExoProvider.
## 2025-02-05 - 0.3.3 - fix(documentation)
Update readme with detailed license and legal information.


@@ -5,20 +5,33 @@
"githost": "code.foss.global", "githost": "code.foss.global",
"gitscope": "push.rocks", "gitscope": "push.rocks",
"gitrepo": "smartai", "gitrepo": "smartai",
"description": "A TypeScript library for integrating and interacting with multiple AI models, offering capabilities for chat and potentially audio responses.", "description": "SmartAi is a versatile TypeScript library designed to facilitate integration and interaction with various AI models, offering functionalities for chat, audio generation, document processing, and vision tasks.",
"npmPackagename": "@push.rocks/smartai", "npmPackagename": "@push.rocks/smartai",
"license": "MIT", "license": "MIT",
"projectDomain": "push.rocks", "projectDomain": "push.rocks",
"keywords": [ "keywords": [
"AI integration", "AI integration",
"chatbot",
"TypeScript", "TypeScript",
"chatbot",
"OpenAI", "OpenAI",
"Anthropic", "Anthropic",
"multi-model support", "multi-model",
"audio responses", "audio generation",
"text-to-speech", "text-to-speech",
"streaming chat" "document processing",
"vision processing",
"streaming chat",
"API",
"multiple providers",
"AI models",
"synchronous chat",
"asynchronous chat",
"real-time interaction",
"content analysis",
"image description",
"document classification",
"AI toolkit",
"provider switching"
] ]
} }
}, },


@@ -1,37 +1,37 @@
{
  "name": "@push.rocks/smartai",
- "version": "0.3.3",
  "version": "0.5.9",
  "private": false,
- "description": "A TypeScript library for integrating and interacting with multiple AI models, offering capabilities for chat and potentially audio responses.",
  "description": "SmartAi is a versatile TypeScript library designed to facilitate integration and interaction with various AI models, offering functionalities for chat, audio generation, document processing, and vision tasks.",
  "main": "dist_ts/index.js",
  "typings": "dist_ts/index.d.ts",
  "type": "module",
  "author": "Task Venture Capital GmbH",
  "license": "MIT",
  "scripts": {
-   "test": "(tstest test/ --web)",
    "test": "(tstest test/ --web --verbose)",
    "build": "(tsbuild --web --allowimplicitany)",
    "buildDocs": "(tsdoc)"
  },
  "devDependencies": {
-   "@git.zone/tsbuild": "^2.1.84",
    "@git.zone/tsbuild": "^2.6.4",
-   "@git.zone/tsbundle": "^2.0.5",
    "@git.zone/tsbundle": "^2.5.1",
-   "@git.zone/tsrun": "^1.2.49",
    "@git.zone/tsrun": "^1.3.3",
-   "@git.zone/tstest": "^1.0.90",
    "@git.zone/tstest": "^2.3.2",
-   "@push.rocks/qenv": "^6.0.5",
    "@push.rocks/qenv": "^6.1.0",
-   "@push.rocks/tapbundle": "^5.3.0",
    "@push.rocks/tapbundle": "^6.0.3",
-   "@types/node": "^22.5.5"
    "@types/node": "^22.15.17"
  },
  "dependencies": {
-   "@anthropic-ai/sdk": "^0.27.3",
    "@anthropic-ai/sdk": "^0.57.0",
-   "@push.rocks/smartarray": "^1.0.8",
    "@push.rocks/smartarray": "^1.1.0",
-   "@push.rocks/smartfile": "^11.0.21",
    "@push.rocks/smartfile": "^11.2.5",
-   "@push.rocks/smartpath": "^5.0.18",
    "@push.rocks/smartpath": "^6.0.0",
-   "@push.rocks/smartpdf": "^3.1.6",
    "@push.rocks/smartpdf": "^3.3.0",
-   "@push.rocks/smartpromise": "^4.0.4",
    "@push.rocks/smartpromise": "^4.2.3",
-   "@push.rocks/smartrequest": "^2.0.22",
    "@push.rocks/smartrequest": "^4.2.1",
    "@push.rocks/webstream": "^1.0.10",
-   "openai": "^4.62.1"
    "openai": "^5.11.0"
  },
  "repository": {
    "type": "git",
@@ -58,13 +58,33 @@
  ],
  "keywords": [
    "AI integration",
-   "chatbot",
    "TypeScript",
    "chatbot",
    "OpenAI",
    "Anthropic",
-   "multi-model support",
    "multi-model",
-   "audio responses",
    "audio generation",
    "text-to-speech",
-   "streaming chat"
-  ]
    "document processing",
    "vision processing",
    "streaming chat",
    "API",
    "multiple providers",
    "AI models",
    "synchronous chat",
    "asynchronous chat",
    "real-time interaction",
    "content analysis",
    "image description",
    "document classification",
    "AI toolkit",
    "provider switching"
  ],
  "pnpm": {
    "onlyBuiltDependencies": [
      "esbuild",
      "puppeteer"
    ]
  },
  "packageManager": "pnpm@10.7.0+sha512.6b865ad4b62a1d9842b61d674a393903b871d9244954f652b8842c2b553c72176b278f64c463e52d40fff8aba385c235c8c9ecf5cc7de4fd78b8bb6d49633ab6"
}

pnpm-lock.yaml (generated, 8222 lines changed)

File diff suppressed because it is too large.

readme.md (604 lines changed)

@@ -1,312 +1,466 @@
# @push.rocks/smartai
-[![npm version](https://badge.fury.io/js/%40push.rocks%2Fsmartai.svg)](https://www.npmjs.com/package/@push.rocks/smartai)
-SmartAi is a comprehensive TypeScript library that provides a standardized interface for integrating and interacting with multiple AI models. It supports a range of operations from synchronous and streaming chat to audio generation, document processing, and vision tasks.
-## Table of Contents
-- [Features](#features)
-- [Installation](#installation)
-- [Supported AI Providers](#supported-ai-providers)
-- [Quick Start](#quick-start)
-- [Usage Examples](#usage-examples)
-- [Chat Interactions](#chat-interactions)
-- [Streaming Chat](#streaming-chat)
-- [Audio Generation](#audio-generation)
-- [Document Processing](#document-processing)
-- [Vision Processing](#vision-processing)
-- [Error Handling](#error-handling)
-- [Development](#development)
-- [Running Tests](#running-tests)
-- [Building the Project](#building-the-project)
-- [Contributing](#contributing)
-- [License](#license)
-- [Legal Information](#legal-information)
-## Features
-- **Unified API:** Seamlessly integrate multiple AI providers with a consistent interface.
-- **Chat & Streaming:** Support for both synchronous and real-time streaming chat interactions.
-- **Audio & Vision:** Generate audio responses and perform detailed image analysis.
-- **Document Processing:** Analyze PDFs and other documents using vision models.
-- **Extensible:** Easily extend the library to support additional AI providers.
-## Installation
-To install SmartAi, run the following command:
-```bash
-npm install @push.rocks/smartai
-```
-This will add the package to your projects dependencies.
-## Supported AI Providers
-SmartAi supports multiple AI providers. Configure each provider with its corresponding token or settings:
-### OpenAI
-- **Models:** GPT-4, GPT-3.5-turbo, GPT-4-vision-preview
-- **Features:** Chat, Streaming, Audio Generation, Vision, Document Processing
-- **Configuration Example:**
-```typescript
-openaiToken: 'your-openai-token'
-```
-### X.AI
-- **Models:** Grok-2-latest
-- **Features:** Chat, Streaming, Document Processing
-- **Configuration Example:**
-```typescript
-xaiToken: 'your-xai-token'
-```
-### Anthropic
-- **Models:** Claude-3-opus-20240229
-- **Features:** Chat, Streaming, Vision, Document Processing
-- **Configuration Example:**
-```typescript
-anthropicToken: 'your-anthropic-token'
-```
-### Perplexity
-- **Models:** Mixtral-8x7b-instruct
-- **Features:** Chat, Streaming
-- **Configuration Example:**
-```typescript
-perplexityToken: 'your-perplexity-token'
-```
-### Groq
-- **Models:** Llama-3.3-70b-versatile
-- **Features:** Chat, Streaming
-- **Configuration Example:**
-```typescript
-groqToken: 'your-groq-token'
-```
-### Ollama
-- **Models:** Configurable (default: llama2; use llava for vision/document tasks)
-- **Features:** Chat, Streaming, Vision, Document Processing
-- **Configuration Example:**
-```typescript
-ollama: {
-  baseUrl: 'http://localhost:11434', // Optional
-  model: 'llama2', // Optional
-  visionModel: 'llava' // Optional for vision and document tasks
-}
-```
-## Quick Start
-Initialize SmartAi with the provider configurations you plan to use:
-```typescript
-import { SmartAi } from '@push.rocks/smartai';
-const smartAi = new SmartAi({
-  openaiToken: 'your-openai-token',
-  xaiToken: 'your-xai-token',
-  anthropicToken: 'your-anthropic-token',
-  perplexityToken: 'your-perplexity-token',
-  groqToken: 'your-groq-token',
-  ollama: {
-    baseUrl: 'http://localhost:11434',
-    model: 'llama2'
-  }
-});
-await smartAi.start();
-```
-## Usage Examples
-### Chat Interactions
-**Synchronous Chat:**
-```typescript
-const response = await smartAi.openaiProvider.chat({
-  systemMessage: 'You are a helpful assistant.',
-  userMessage: 'What is the capital of France?',
-  messageHistory: [] // Include previous conversation messages if applicable
-});
-console.log(response.message);
-```
-### Streaming Chat
-**Real-Time Streaming:**
-```typescript
-const textEncoder = new TextEncoder();
-const textDecoder = new TextDecoder();
-// Create a transform stream for sending and receiving data
-const { writable, readable } = new TransformStream();
-const writer = writable.getWriter();
-const message = {
-  role: 'user',
-  content: 'Tell me a story about a brave knight'
-};
-writer.write(textEncoder.encode(JSON.stringify(message) + '\n'));
-// Start streaming the response
-const stream = await smartAi.openaiProvider.chatStream(readable);
-const reader = stream.getReader();
-while (true) {
-  const { done, value } = await reader.read();
-  if (done) break;
-  console.log('AI:', value);
-}
-```
-### Audio Generation
-Generate audio (supported by providers like OpenAI):
-```typescript
-const audioStream = await smartAi.openaiProvider.audio({
-  message: 'Hello, this is a test of text-to-speech'
-});
-// Process the audio stream, for example, play it or save to a file.
-```
-### Document Processing
-Analyze and extract key information from documents:
-```typescript
-// Example using OpenAI
-const documentResult = await smartAi.openaiProvider.document({
-  systemMessage: 'Classify the document type',
-  userMessage: 'What type of document is this?',
-  messageHistory: [],
-  pdfDocuments: [pdfBuffer] // Uint8Array containing the PDF content
-});
-```
-Other providers (e.g., Ollama and Anthropic) follow a similar pattern:
-```typescript
-// Using Ollama for document processing
-const ollamaResult = await smartAi.ollamaProvider.document({
-  systemMessage: 'You are a document analysis assistant',
-  userMessage: 'Extract key information from this document',
-  messageHistory: [],
-  pdfDocuments: [pdfBuffer]
-});
-```
-```typescript
-// Using Anthropic for document processing
-const anthropicResult = await smartAi.anthropicProvider.document({
-  systemMessage: 'Analyze the document',
-  userMessage: 'Please extract the main points',
-  messageHistory: [],
-  pdfDocuments: [pdfBuffer]
-});
-```
-### Vision Processing
-Analyze images with vision capabilities:
-```typescript
-// Using OpenAI GPT-4 Vision
-const imageDescription = await smartAi.openaiProvider.vision({
-  image: imageBuffer, // Uint8Array containing image data
-  prompt: 'What do you see in this image?'
-});
-// Using Ollama for vision tasks
-const ollamaImageAnalysis = await smartAi.ollamaProvider.vision({
-  image: imageBuffer,
-  prompt: 'Analyze this image in detail'
-});
-// Using Anthropic for vision analysis
-const anthropicImageAnalysis = await smartAi.anthropicProvider.vision({
-  image: imageBuffer,
-  prompt: 'Describe the contents of this image'
-});
-```
-## Error Handling
-Always wrap API calls in try-catch blocks to manage errors effectively:
-```typescript
-try {
-  const response = await smartAi.openaiProvider.chat({
-    systemMessage: 'You are a helpful assistant.',
-    userMessage: 'Hello!',
-    messageHistory: []
-  });
-  console.log(response.message);
-} catch (error: any) {
-  console.error('AI provider error:', error.message);
-}
-```
-## Development
-### Running Tests
-To run the test suite, use the following command:
-```bash
-npm run test
-```
-Ensure your environment is configured with the appropriate tokens and settings for the providers you are testing.
-### Building the Project
-Compile the TypeScript code and build the package using:
-```bash
-npm run build
-```
-This command prepares the library for distribution.
-## Contributing
-Contributions are welcome! Please follow these steps:
-1. Fork the repository.
-2. Create a feature branch:
-```bash
-git checkout -b feature/my-feature
-```
-3. Commit your changes with clear messages:
-```bash
-git commit -m 'Add new feature'
-```
-4. Push your branch to your fork:
-```bash
-git push origin feature/my-feature
-```
-5. Open a Pull Request with a detailed description of your changes.

**One API to rule them all** 🚀
[![npm version](https://img.shields.io/npm/v/@push.rocks/smartai.svg)](https://www.npmjs.com/package/@push.rocks/smartai)
[![TypeScript](https://img.shields.io/badge/TypeScript-5.x-blue.svg)](https://www.typescriptlang.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
SmartAI unifies the world's leading AI providers - OpenAI, Anthropic, Perplexity, Ollama, Groq, XAI, and Exo - under a single, elegant TypeScript interface. Build AI applications at lightning speed without vendor lock-in.
## 🎯 Why SmartAI?
- **🔌 Universal Interface**: Write once, run with any AI provider. Switch between GPT-4, Claude, Llama, or Grok with a single line change.
- **🛡️ Type-Safe**: Full TypeScript support with comprehensive type definitions for all operations
- **🌊 Streaming First**: Built for real-time applications with native streaming support
- **🎨 Multi-Modal**: Seamlessly work with text, images, audio, and documents
- **🏠 Local & Cloud**: Support for both cloud providers and local models via Ollama
- **⚡ Zero Lock-In**: Your code remains portable across all AI providers
## 🚀 Quick Start
```bash
npm install @push.rocks/smartai
```
```typescript
import { SmartAi } from '@push.rocks/smartai';

// Initialize with your favorite providers
const ai = new SmartAi({
  openaiToken: 'sk-...',
  anthropicToken: 'sk-ant-...'
});
await ai.start();

// Same API, multiple providers
const response = await ai.openaiProvider.chat({
  systemMessage: 'You are a helpful assistant.',
  userMessage: 'Explain quantum computing in simple terms',
  messageHistory: []
});
```
## 📊 Provider Capabilities Matrix
Choose the right provider for your use case:
| Provider | Chat | Streaming | TTS | Vision | Documents | Highlights |
|----------|:----:|:---------:|:---:|:------:|:---------:|------------|
| **OpenAI** | ✅ | ✅ | ✅ | ✅ | ✅ | • GPT-4, DALL-E 3<br>• Industry standard<br>• Most features |
| **Anthropic** | ✅ | ✅ | ❌ | ✅ | ✅ | • Claude 3 Opus<br>• Superior reasoning<br>• 200k context |
| **Ollama** | ✅ | ✅ | ❌ | ✅ | ✅ | • 100% local<br>• Privacy-first<br>• No API costs |
| **XAI** | ✅ | ✅ | ❌ | ❌ | ✅ | • Grok models<br>• Real-time data<br>• Uncensored |
| **Perplexity** | ✅ | ✅ | ❌ | ❌ | ❌ | • Web-aware<br>• Research-focused<br>• Citations |
| **Groq** | ✅ | ✅ | ❌ | ❌ | ❌ | • 10x faster<br>• LPU inference<br>• Low latency |
| **Exo** | ✅ | ✅ | ❌ | ❌ | ❌ | • Distributed<br>• P2P compute<br>• Decentralized |
## 🎮 Core Features
### 💬 Universal Chat Interface
Works identically across all providers:
```typescript
// Use GPT-4 for complex reasoning
const gptResponse = await ai.openaiProvider.chat({
  systemMessage: 'You are a expert physicist.',
  userMessage: 'Explain the implications of quantum entanglement',
  messageHistory: []
});

// Use Claude for safety-critical applications
const claudeResponse = await ai.anthropicProvider.chat({
  systemMessage: 'You are a medical advisor.',
  userMessage: 'Review this patient data for concerns',
  messageHistory: []
});

// Use Groq for lightning-fast responses
const groqResponse = await ai.groqProvider.chat({
  systemMessage: 'You are a code reviewer.',
  userMessage: 'Quick! Find the bug in this code: ...',
  messageHistory: []
});
```
### 🌊 Real-Time Streaming
Build responsive chat interfaces with token-by-token streaming:
```typescript
// Create a chat stream
const stream = await ai.openaiProvider.chatStream(inputStream);
const reader = stream.getReader();

// Display responses as they arrive
while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // Update UI in real-time
  process.stdout.write(value);
}
```
### 🎙️ Text-to-Speech
Generate natural voices with OpenAI:
```typescript
const audioStream = await ai.openaiProvider.audio({
  message: 'Welcome to the future of AI development!'
});

// Stream directly to speakers
audioStream.pipe(speakerOutput);

// Or save to file
audioStream.pipe(fs.createWriteStream('welcome.mp3'));
```
### 👁️ Vision Analysis
Understand images with multiple providers:
```typescript
const image = fs.readFileSync('product-photo.jpg');

// OpenAI: General purpose vision
const gptVision = await ai.openaiProvider.vision({
  image,
  prompt: 'Describe this product and suggest marketing angles'
});

// Anthropic: Detailed analysis
const claudeVision = await ai.anthropicProvider.vision({
  image,
  prompt: 'Identify any safety concerns or defects'
});

// Ollama: Private, local analysis
const ollamaVision = await ai.ollamaProvider.vision({
  image,
  prompt: 'Extract all text and categorize the content'
});
```
### 📄 Document Intelligence
Extract insights from PDFs with AI:
```typescript
const contract = fs.readFileSync('contract.pdf');
const invoice = fs.readFileSync('invoice.pdf');

// Analyze documents
const analysis = await ai.openaiProvider.document({
  systemMessage: 'You are a legal expert.',
  userMessage: 'Compare these documents and highlight key differences',
  messageHistory: [],
  pdfDocuments: [contract, invoice]
});

// Multi-document analysis
const taxDocs = [form1099, w2, receipts];
const taxAnalysis = await ai.anthropicProvider.document({
  systemMessage: 'You are a tax advisor.',
  userMessage: 'Prepare a tax summary from these documents',
  messageHistory: [],
  pdfDocuments: taxDocs
});
```
### 🔄 Persistent Conversations
Maintain context across interactions:
```typescript
// Create a coding assistant conversation
const assistant = ai.createConversation('openai');
await assistant.setSystemMessage('You are an expert TypeScript developer.');

// First question
const inputWriter = assistant.getInputStreamWriter();
await inputWriter.write('How do I implement a singleton pattern?');

// Continue the conversation
await inputWriter.write('Now show me how to make it thread-safe');

// The assistant remembers the entire context
```
## 🚀 Real-World Examples
### Build a Customer Support Bot
```typescript
const supportBot = new SmartAi({
  anthropicToken: process.env.ANTHROPIC_KEY // Claude for empathetic responses
});

async function handleCustomerQuery(query: string, history: ChatMessage[]) {
  try {
    const response = await supportBot.anthropicProvider.chat({
      systemMessage: `You are a helpful customer support agent.
                      Be empathetic, professional, and solution-oriented.`,
      userMessage: query,
      messageHistory: history
    });
    return response.message;
  } catch (error) {
    // Fallback to another provider if needed
    return await supportBot.openaiProvider.chat({...});
  }
}
```
### Create a Code Review Assistant
```typescript
const codeReviewer = new SmartAi({
  groqToken: process.env.GROQ_KEY // Groq for speed
});

async function reviewCode(code: string, language: string) {
  const startTime = Date.now();
  const review = await codeReviewer.groqProvider.chat({
    systemMessage: `You are a ${language} expert. Review code for:
      - Security vulnerabilities
      - Performance issues
      - Best practices
      - Potential bugs`,
    userMessage: `Review this code:\n\n${code}`,
    messageHistory: []
  });

  console.log(`Review completed in ${Date.now() - startTime}ms`);
  return review.message;
}
```
### Build a Research Assistant
```typescript
const researcher = new SmartAi({
  perplexityToken: process.env.PERPLEXITY_KEY
});

async function research(topic: string) {
  // Perplexity excels at web-aware research
  const findings = await researcher.perplexityProvider.chat({
    systemMessage: 'You are a research assistant. Provide factual, cited information.',
    userMessage: `Research the latest developments in ${topic}`,
    messageHistory: []
  });

  return findings.message;
}
```
### Local AI for Sensitive Data
```typescript
const localAI = new SmartAi({
  ollama: {
    baseUrl: 'http://localhost:11434',
    model: 'llama2',
    visionModel: 'llava'
  }
});

// Process sensitive documents without leaving your infrastructure
async function analyzeSensitiveDoc(pdfBuffer: Buffer) {
  const analysis = await localAI.ollamaProvider.document({
    systemMessage: 'Extract and summarize key information.',
    userMessage: 'Analyze this confidential document',
    messageHistory: [],
    pdfDocuments: [pdfBuffer]
  });

  // Data never leaves your servers
  return analysis.message;
}
```
## ⚡ Performance Tips
### 1. Provider Selection Strategy
```typescript
class SmartAIRouter {
  constructor(private ai: SmartAi) {}

  async query(message: string, requirements: {
    speed?: boolean;
    accuracy?: boolean;
    cost?: boolean;
    privacy?: boolean;
  }) {
    if (requirements.privacy) {
      return this.ai.ollamaProvider.chat({...}); // Local only
    }
    if (requirements.speed) {
      return this.ai.groqProvider.chat({...}); // 10x faster
    }
    if (requirements.accuracy) {
      return this.ai.anthropicProvider.chat({...}); // Best reasoning
    }
    // Default fallback
    return this.ai.openaiProvider.chat({...});
  }
}
```
### 2. Streaming for Large Responses
```typescript
// Don't wait for the entire response
async function streamResponse(userQuery: string) {
const stream = await ai.openaiProvider.chatStream(createInputStream(userQuery));
// Process tokens as they arrive
for await (const chunk of stream) {
updateUI(chunk); // Immediate feedback
await processChunk(chunk); // Parallel processing
}
}
```
### 3. Parallel Multi-Provider Queries
```typescript
// Get the best answer from multiple AIs
async function consensusQuery(question: string) {
const providers = [
ai.openaiProvider.chat({...}),
ai.anthropicProvider.chat({...}),
ai.perplexityProvider.chat({...})
];
const responses = await Promise.all(providers);
return synthesizeResponses(responses);
}
```
## 🛠️ Advanced Features
### Custom Streaming Transformations
```typescript
// Add real-time translation
const translationStream = new TransformStream({
async transform(chunk, controller) {
const translated = await translateChunk(chunk);
controller.enqueue(translated);
}
});
const responseStream = await ai.openaiProvider.chatStream(input);
const translatedStream = responseStream.pipeThrough(translationStream);
```
### Error Handling & Fallbacks
```typescript
class ResilientAI {
private providers = ['openai', 'anthropic', 'groq'];
async query(opts: ChatOptions): Promise<ChatResponse> {
for (const provider of this.providers) {
try {
return await this.ai[`${provider}Provider`].chat(opts);
} catch (error) {
console.warn(`${provider} failed, trying next...`);
continue;
}
}
throw new Error('All providers failed');
}
}
```
### Token Counting & Cost Management
```typescript
// Track usage across providers
class UsageTracker {
async trackedChat(provider: string, options: ChatOptions) {
const start = Date.now();
const response = await ai[`${provider}Provider`].chat(options);
const usage = {
provider,
duration: Date.now() - start,
inputTokens: estimateTokens(options),
outputTokens: estimateTokens(response.message)
};
await this.logUsage(usage);
return response;
}
}
```
## 📦 Installation & Setup
### Prerequisites
- Node.js 16+
- TypeScript 4.5+
- API keys for your chosen providers
### Environment Setup
```bash
# Install
npm install @push.rocks/smartai
# Set up environment variables
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export PERPLEXITY_API_KEY=pplx-...
# ... etc
```
### TypeScript Configuration
```json
{
"compilerOptions": {
"target": "ES2022",
"module": "NodeNext",
"lib": ["ES2022"],
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true
}
}
```
## 🎯 Choosing the Right Provider
| Use Case | Recommended Provider | Why |
|----------|---------------------|-----|
| **General Purpose** | OpenAI | Most features, stable, well-documented |
| **Complex Reasoning** | Anthropic | Superior logical thinking, safer outputs |
| **Research & Facts** | Perplexity | Web-aware, provides citations |
| **Speed Critical** | Groq | 10x faster inference, sub-second responses |
| **Privacy Critical** | Ollama | 100% local, no data leaves your servers |
| **Real-time Data** | XAI | Access to current information |
| **Cost Sensitive** | Ollama/Exo | Free (local) or distributed compute |
## 📈 Roadmap
- [ ] Streaming function calls
- [ ] Image generation support
- [ ] Voice input processing
- [ ] Fine-tuning integration
- [ ] Embedding support
- [ ] Agent framework
- [ ] More providers (Cohere, AI21, etc.)
## License and Legal Information


@@ -1,4 +1,4 @@
-import { expect, expectAsync, tap } from '@push.rocks/tapbundle';
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
import * as smartrequest from '@push.rocks/smartrequest';
import * as smartfile from '@push.rocks/smartfile';
@@ -21,8 +21,7 @@ tap.test('should create chat response with openai', async () => {
  const response = await testSmartai.openaiProvider.chat({
    systemMessage: 'Hello',
    userMessage: userMessage,
-   messageHistory: [
-   ],
    messageHistory: [],
  });
  console.log(`userMessage: ${userMessage}`);
  console.log(response.message);
@@ -30,12 +29,14 @@ tap.test('should create chat response with openai', async () => {
tap.test('should document a pdf', async () => {
  const pdfUrl = 'https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf';
- const pdfResponse = await smartrequest.getBinary(pdfUrl);
  const pdfResponse = await smartrequest.SmartRequest.create()
    .url(pdfUrl)
    .get();
  const result = await testSmartai.openaiProvider.document({
    systemMessage: 'Classify the document. Only the following answers are allowed: "invoice", "bank account statement", "contract", "other". The answer should only contain the keyword for machine use.',
    userMessage: "Classify the document.",
    messageHistory: [],
-   pdfDocuments: [pdfResponse.body],
    pdfDocuments: [Buffer.from(await pdfResponse.arrayBuffer())],
  });
  console.log(result);
});
@@ -55,7 +56,7 @@ tap.test('should recognize companies in a pdf', async () => {
      address: string;
      city: string;
      country: string;
-     EU: boolean; // wether the entity is within EU
      EU: boolean; // whether the entity is within EU
    };
    entityReceiver: {
      type: 'official state entity' | 'company' | 'person';
@@ -63,7 +64,7 @@ tap.test('should recognize companies in a pdf', async () => {
      address: string;
      city: string;
      country: string;
-     EU: boolean; // wether the entity is within EU
      EU: boolean; // whether the entity is within EU
    };
    date: string; // the date of the document as YYYY-MM-DD
    title: string; // a short title, suitable for a filename
@@ -75,7 +76,24 @@ tap.test('should recognize companies in a pdf', async () => {
    pdfDocuments: [pdfBuffer],
  });
  console.log(result);
-})
});
tap.test('should create audio response with openai', async () => {
// Call the audio method with a sample message.
const audioStream = await testSmartai.openaiProvider.audio({
message: 'This is a test of audio generation.',
});
// Read all chunks from the stream.
const chunks: Uint8Array[] = [];
for await (const chunk of audioStream) {
chunks.push(chunk as Uint8Array);
}
const audioBuffer = Buffer.concat(chunks);
await smartfile.fs.toFs(audioBuffer, './.nogit/testoutput.mp3');
console.log(`Audio Buffer length: ${audioBuffer.length}`);
// Assert that the resulting buffer is not empty.
expect(audioBuffer.length).toBeGreaterThan(0);
});
tap.test('should stop the smartai instance', async () => {
  await testSmartai.stop();


@@ -3,6 +3,6 @@
 */
export const commitinfo = {
  name: '@push.rocks/smartai',
- version: '0.3.3',
  version: '0.5.4',
- description: 'A TypeScript library for integrating and interacting with multiple AI models, offering capabilities for chat and potentially audio responses.'
  description: 'SmartAi is a versatile TypeScript library designed to facilitate integration and interaction with various AI models, offering functionalities for chat, audio generation, document processing, and vision tasks.'
}


@@ -1,3 +1,5 @@
import * as plugins from './plugins.js';
/**
 * Message format for chat interactions
 */
@@ -28,17 +30,30 @@ export interface ChatResponse {
 * Provides a common interface for different AI providers (OpenAI, Anthropic, Perplexity, Ollama)
 */
export abstract class MultiModalModel {
/**
* SmartPdf instance for document processing
* Shared across all methods that need PDF functionality
*/
protected smartpdfInstance: plugins.smartpdf.SmartPdf;
  /**
   * Initializes the model and any necessary resources
   * Should be called before using any other methods
   */
- abstract start(): Promise<void>;
  public async start(): Promise<void> {
this.smartpdfInstance = new plugins.smartpdf.SmartPdf();
await this.smartpdfInstance.start();
}
  /**
   * Cleans up any resources used by the model
   * Should be called when the model is no longer needed
   */
- abstract stop(): Promise<void>;
  public async stop(): Promise<void> {
if (this.smartpdfInstance) {
await this.smartpdfInstance.stop();
}
}
  /**
   * Synchronous chat interaction with the model


@@ -48,6 +48,18 @@ export class Conversation {
    return conversation;
  }
public static async createWithExo(smartaiRefArg: SmartAi) {
if (!smartaiRefArg.exoProvider) {
throw new Error('Exo provider not available');
}
const conversation = new Conversation(smartaiRefArg, {
processFunction: async (input) => {
return '' // TODO implement proper streaming
}
});
return conversation;
}
  public static async createWithOllama(smartaiRefArg: SmartAi) {
    if (!smartaiRefArg.ollamaProvider) {
      throw new Error('Ollama provider not available');
@@ -60,6 +72,30 @@ export class Conversation {
    return conversation;
  }
public static async createWithGroq(smartaiRefArg: SmartAi) {
if (!smartaiRefArg.groqProvider) {
throw new Error('Groq provider not available');
}
const conversation = new Conversation(smartaiRefArg, {
processFunction: async (input) => {
return '' // TODO implement proper streaming
}
});
return conversation;
}
public static async createWithXai(smartaiRefArg: SmartAi) {
if (!smartaiRefArg.xaiProvider) {
throw new Error('XAI provider not available');
}
const conversation = new Conversation(smartaiRefArg, {
processFunction: async (input) => {
return '' // TODO implement proper streaming
}
});
return conversation;
}
  // INSTANCE
  smartaiRef: SmartAi
  private systemMessage: string;


@@ -1,18 +1,32 @@
import { Conversation } from './classes.conversation.js';
import * as plugins from './plugins.js';
import { AnthropicProvider } from './provider.anthropic.js';
-import type { OllamaProvider } from './provider.ollama.js';
import { OllamaProvider } from './provider.ollama.js';
import { OpenAiProvider } from './provider.openai.js';
-import type { PerplexityProvider } from './provider.perplexity.js';
import { PerplexityProvider } from './provider.perplexity.js';
import { ExoProvider } from './provider.exo.js';
import { GroqProvider } from './provider.groq.js';
import { XAIProvider } from './provider.xai.js';
export interface ISmartAiOptions {
  openaiToken?: string;
  anthropicToken?: string;
  perplexityToken?: string;
groqToken?: string;
xaiToken?: string;
exo?: {
baseUrl?: string;
apiKey?: string;
};
ollama?: {
baseUrl?: string;
model?: string;
visionModel?: string;
};
}

-export type TProvider = 'openai' | 'anthropic' | 'perplexity' | 'ollama';
export type TProvider = 'openai' | 'anthropic' | 'perplexity' | 'ollama' | 'exo' | 'groq' | 'xai';

export class SmartAi {
  public options: ISmartAiOptions;
@@ -21,6 +35,9 @@ export class SmartAi {
  public anthropicProvider: AnthropicProvider;
  public perplexityProvider: PerplexityProvider;
  public ollamaProvider: OllamaProvider;
public exoProvider: ExoProvider;
public groqProvider: GroqProvider;
public xaiProvider: XAIProvider;
  constructor(optionsArg: ISmartAiOptions) {
    this.options = optionsArg;
@@ -37,16 +54,74 @@ export class SmartAi {
      this.anthropicProvider = new AnthropicProvider({
        anthropicToken: this.options.anthropicToken,
      });
await this.anthropicProvider.start();
}
if (this.options.perplexityToken) {
this.perplexityProvider = new PerplexityProvider({
perplexityToken: this.options.perplexityToken,
});
await this.perplexityProvider.start();
}
if (this.options.groqToken) {
this.groqProvider = new GroqProvider({
groqToken: this.options.groqToken,
});
await this.groqProvider.start();
}
if (this.options.xaiToken) {
this.xaiProvider = new XAIProvider({
xaiToken: this.options.xaiToken,
});
await this.xaiProvider.start();
}
if (this.options.ollama) {
this.ollamaProvider = new OllamaProvider({
baseUrl: this.options.ollama.baseUrl,
model: this.options.ollama.model,
visionModel: this.options.ollama.visionModel,
});
await this.ollamaProvider.start();
}
if (this.options.exo) {
this.exoProvider = new ExoProvider({
exoBaseUrl: this.options.exo.baseUrl,
apiKey: this.options.exo.apiKey,
});
await this.exoProvider.start();
} }
    }
  }

- public async stop() {}
  public async stop() {
if (this.openaiProvider) {
await this.openaiProvider.stop();
}
if (this.anthropicProvider) {
await this.anthropicProvider.stop();
}
if (this.perplexityProvider) {
await this.perplexityProvider.stop();
}
if (this.groqProvider) {
await this.groqProvider.stop();
}
if (this.xaiProvider) {
await this.xaiProvider.stop();
}
if (this.ollamaProvider) {
await this.ollamaProvider.stop();
}
if (this.exoProvider) {
await this.exoProvider.stop();
}
}
  /**
   * create a new conversation
   */
  createConversation(provider: TProvider) {
    switch (provider) {
case 'exo':
return Conversation.createWithExo(this);
      case 'openai':
        return Conversation.createWithOpenAi(this);
      case 'anthropic':
@@ -55,6 +130,10 @@ export class SmartAi {
        return Conversation.createWithPerplexity(this);
      case 'ollama':
        return Conversation.createWithOllama(this);
case 'groq':
return Conversation.createWithGroq(this);
case 'xai':
return Conversation.createWithXai(this);
      default:
        throw new Error('Provider not available');
    }


@@ -20,12 +20,15 @@ export class AnthropicProvider extends MultiModalModel {
  }

  async start() {
await super.start();
    this.anthropicApiClient = new plugins.anthropic.default({
      apiKey: this.options.anthropicToken,
    });
  }

- async stop() {}
  async stop() {
await super.stop();
}
  public async chatStream(input: ReadableStream<Uint8Array>): Promise<ReadableStream<string>> {
    // Create a TextDecoder to handle incoming chunks
@@ -178,11 +181,10 @@ export class AnthropicProvider extends MultiModalModel {
    messageHistory: ChatMessage[];
  }): Promise<{ message: any }> {
    // Convert PDF documents to images using SmartPDF
-   const smartpdfInstance = new plugins.smartpdf.SmartPdf();
    let documentImageBytesArray: Uint8Array[] = [];
    for (const pdfDocument of optionsArg.pdfDocuments) {
-     const documentImageArray = await smartpdfInstance.convertPDFToPngBytes(pdfDocument);
      const documentImageArray = await this.smartpdfInstance.convertPDFToPngBytes(pdfDocument);
      documentImageBytesArray = documentImageBytesArray.concat(documentImageArray);
    }

ts/provider.exo.ts (new file, 128 lines)

@@ -0,0 +1,128 @@
import * as plugins from './plugins.js';
import * as paths from './paths.js';
import { MultiModalModel } from './abstract.classes.multimodal.js';
import type { ChatOptions, ChatResponse, ChatMessage } from './abstract.classes.multimodal.js';
import type { ChatCompletionMessageParam } from 'openai/resources/chat/completions';
export interface IExoProviderOptions {
exoBaseUrl?: string;
apiKey?: string;
}
export class ExoProvider extends MultiModalModel {
private options: IExoProviderOptions;
public openAiApiClient: plugins.openai.default;
constructor(optionsArg: IExoProviderOptions = {}) {
super();
this.options = {
exoBaseUrl: 'http://localhost:8080/v1', // Default Exo API endpoint
...optionsArg
};
}
public async start() {
this.openAiApiClient = new plugins.openai.default({
apiKey: this.options.apiKey || 'not-needed', // Exo might not require an API key for local deployment
baseURL: this.options.exoBaseUrl,
});
}
public async stop() {}
public async chatStream(input: ReadableStream<Uint8Array>): Promise<ReadableStream<string>> {
// Create a TextDecoder to handle incoming chunks
const decoder = new TextDecoder();
let buffer = '';
let currentMessage: { role: string; content: string; } | null = null;
// Create a TransformStream to process the input
const transform = new TransformStream<Uint8Array, string>({
transform: async (chunk, controller) => {
buffer += decoder.decode(chunk, { stream: true });
// Try to parse complete JSON messages from the buffer
while (true) {
const newlineIndex = buffer.indexOf('\n');
if (newlineIndex === -1) break;
const line = buffer.slice(0, newlineIndex);
buffer = buffer.slice(newlineIndex + 1);
if (line.trim()) {
try {
const message = JSON.parse(line);
currentMessage = message;
// Process the message based on its type
if (message.type === 'message') {
const response = await this.chat({
systemMessage: '',
userMessage: message.content,
messageHistory: [{ role: message.role as 'user' | 'assistant' | 'system', content: message.content }]
});
controller.enqueue(JSON.stringify(response) + '\n');
}
} catch (error) {
console.error('Error processing message:', error);
}
}
}
},
flush(controller) {
if (buffer) {
try {
const message = JSON.parse(buffer);
currentMessage = message;
} catch (error) {
console.error('Error processing remaining buffer:', error);
}
}
}
});
return input.pipeThrough(transform);
}
public async chat(options: ChatOptions): Promise<ChatResponse> {
const messages: ChatCompletionMessageParam[] = [
{ role: 'system', content: options.systemMessage },
...options.messageHistory,
{ role: 'user', content: options.userMessage }
];
try {
const response = await this.openAiApiClient.chat.completions.create({
model: 'local-model', // Exo uses local models
messages: messages,
stream: false
});
return {
role: 'assistant',
message: response.choices[0]?.message?.content || ''
};
} catch (error) {
console.error('Error in chat completion:', error);
throw error;
}
}
public async audio(optionsArg: { message: string }): Promise<NodeJS.ReadableStream> {
throw new Error('Audio generation is not supported by Exo provider');
}
public async vision(optionsArg: { image: Buffer; prompt: string }): Promise<string> {
throw new Error('Vision processing is not supported by Exo provider');
}
public async document(optionsArg: {
systemMessage: string;
userMessage: string;
pdfDocuments: Uint8Array[];
messageHistory: ChatMessage[];
}): Promise<{ message: any }> {
throw new Error('Document processing is not supported by Exo provider');
}
}


@@ -32,7 +32,7 @@ export class GroqProvider extends MultiModalModel {
    // Create a TransformStream to process the input
    const transform = new TransformStream<Uint8Array, string>({
-     async transform(chunk, controller) {
      transform: async (chunk, controller) => {
        buffer += decoder.decode(chunk, { stream: true });
        // Try to parse complete JSON messages from the buffer


@@ -24,6 +24,7 @@ export class OllamaProvider extends MultiModalModel {
  }

  async start() {
await super.start();
    // Verify Ollama is running
    try {
      const response = await fetch(`${this.baseUrl}/api/tags`);
@@ -35,7 +36,9 @@ export class OllamaProvider extends MultiModalModel {
    }
  }

- async stop() {}
  async stop() {
await super.stop();
}
  public async chatStream(input: ReadableStream<Uint8Array>): Promise<ReadableStream<string>> {
    // Create a TextDecoder to handle incoming chunks
@@ -45,7 +48,7 @@ export class OllamaProvider extends MultiModalModel {
    // Create a TransformStream to process the input
    const transform = new TransformStream<Uint8Array, string>({
-     async transform(chunk, controller) {
      transform: async (chunk, controller) => {
        buffer += decoder.decode(chunk, { stream: true });
        // Try to parse complete JSON messages from the buffer
@@ -205,11 +208,10 @@ export class OllamaProvider extends MultiModalModel {
    messageHistory: ChatMessage[];
  }): Promise<{ message: any }> {
    // Convert PDF documents to images using SmartPDF
-   const smartpdfInstance = new plugins.smartpdf.SmartPdf();
    let documentImageBytesArray: Uint8Array[] = [];
    for (const pdfDocument of optionsArg.pdfDocuments) {
-     const documentImageArray = await smartpdfInstance.convertPDFToPngBytes(pdfDocument);
      const documentImageArray = await this.smartpdfInstance.convertPDFToPngBytes(pdfDocument);
      documentImageBytesArray = documentImageBytesArray.concat(documentImageArray);
    }


@@ -1,16 +1,26 @@
import * as plugins from './plugins.js';
import * as paths from './paths.js';
import { Readable } from 'stream';

// Custom type definition for chat completion messages
export type TChatCompletionRequestMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

import { MultiModalModel } from './abstract.classes.multimodal.js';

export interface IOpenaiProviderOptions {
  openaiToken: string;
  chatModel?: string;
  audioModel?: string;
  visionModel?: string;
  // Optionally add more model options (e.g., documentModel) if needed.
}

export class OpenAiProvider extends MultiModalModel {
  private options: IOpenaiProviderOptions;
  public openAiApiClient: plugins.openai.default;
- public smartpdfInstance: plugins.smartpdf.SmartPdf;

  constructor(optionsArg: IOpenaiProviderOptions) {
    super();
@@ -18,24 +28,29 @@ export class OpenAiProvider extends MultiModalModel {
  }

  public async start() {
    await super.start();
    this.openAiApiClient = new plugins.openai.default({
      apiKey: this.options.openaiToken,
      dangerouslyAllowBrowser: true,
    });
-   this.smartpdfInstance = new plugins.smartpdf.SmartPdf();
  }

- public async stop() {}
  public async stop() {
    await super.stop();
  }

  public async chatStream(input: ReadableStream<Uint8Array>): Promise<ReadableStream<string>> {
    // Create a TextDecoder to handle incoming chunks
    const decoder = new TextDecoder();
    let buffer = '';
-   let currentMessage: { role: string; content: string; } | null = null;
    let currentMessage: {
      role: "function" | "user" | "system" | "assistant" | "tool" | "developer";
      content: string;
    } | null = null;

    // Create a TransformStream to process the input
    const transform = new TransformStream<Uint8Array, string>({
-     async transform(chunk, controller) {
      transform: async (chunk, controller) => {
        buffer += decoder.decode(chunk, { stream: true });

        // Try to parse complete JSON messages from the buffer
@@ -50,7 +65,7 @@ export class OpenAiProvider extends MultiModalModel {
          try {
            const message = JSON.parse(line);
            currentMessage = {
-             role: message.role || 'user',
              role: (message.role || 'user') as "function" | "user" | "system" | "assistant" | "tool" | "developer",
              content: message.content || '',
            };
          } catch (e) {
@@ -61,20 +76,24 @@ export class OpenAiProvider extends MultiModalModel {
        // If we have a complete message, send it to OpenAI
        if (currentMessage) {
-         const stream = await this.openAiApiClient.chat.completions.create({
-           model: 'gpt-4',
-           messages: [{ role: currentMessage.role, content: currentMessage.content }],
-           stream: true,
-         });
          const messageToSend = { role: "user" as const, content: currentMessage.content };
          const chatModel = this.options.chatModel ?? 'o3-mini';
          const requestParams: any = {
            model: chatModel,
            messages: [messageToSend],
            stream: true,
          };
          // Temperature is omitted since the model does not support it.
          const stream = await this.openAiApiClient.chat.completions.create(requestParams);
          // Explicitly cast the stream as an async iterable to satisfy TypeScript.
          const streamAsyncIterable = stream as unknown as AsyncIterableIterator<any>;

          // Process each chunk from OpenAI
-         for await (const chunk of stream) {
          for await (const chunk of streamAsyncIterable) {
            const content = chunk.choices[0]?.delta?.content;
            if (content) {
              controller.enqueue(content);
            }
          }
          currentMessage = null;
        }
      },
@@ -104,15 +123,17 @@ export class OpenAiProvider extends MultiModalModel {
      content: string;
    }[];
  }) {
-   const result = await this.openAiApiClient.chat.completions.create({
-     model: 'gpt-4o',
    const chatModel = this.options.chatModel ?? 'o3-mini';
    const requestParams: any = {
      model: chatModel,
      messages: [
        { role: 'system', content: optionsArg.systemMessage },
        ...optionsArg.messageHistory,
        { role: 'user', content: optionsArg.userMessage },
      ],
-   });
    };
    // Temperature parameter removed to avoid unsupported error.
    const result = await this.openAiApiClient.chat.completions.create(requestParams);
    return {
      role: result.choices[0].message.role as 'assistant',
      message: result.choices[0].message.content,
@@ -122,14 +143,15 @@ export class OpenAiProvider extends MultiModalModel {
  public async audio(optionsArg: { message: string }): Promise<NodeJS.ReadableStream> {
    const done = plugins.smartpromise.defer<NodeJS.ReadableStream>();
    const result = await this.openAiApiClient.audio.speech.create({
-     model: 'tts-1-hd',
      model: this.options.audioModel ?? 'tts-1-hd',
      input: optionsArg.message,
      voice: 'nova',
      response_format: 'mp3',
      speed: 1,
    });
    const stream = result.body;
-   done.resolve(stream);
    const nodeStream = Readable.fromWeb(stream as any);
    done.resolve(nodeStream);
    return done.promise;
  }
@@ -144,6 +166,7 @@ export class OpenAiProvider extends MultiModalModel {
  }) {
    let pdfDocumentImageBytesArray: Uint8Array[] = [];

    // Convert each PDF into one or more image byte arrays.
    for (const pdfDocument of optionsArg.pdfDocuments) {
      const documentImageArray = await this.smartpdfInstance.convertPDFToPngBytes(pdfDocument);
      pdfDocumentImageBytesArray = pdfDocumentImageBytesArray.concat(documentImageArray);
@@ -152,19 +175,18 @@ export class OpenAiProvider extends MultiModalModel {
    console.log(`image smartfile array`);
    console.log(pdfDocumentImageBytesArray.map((smartfile) => smartfile.length));

-   const smartfileArray = await plugins.smartarray.map(
-     pdfDocumentImageBytesArray,
-     async (pdfDocumentImageBytes) => {
-       return plugins.smartfile.SmartFile.fromBuffer(
-         'pdfDocumentImage.jpg',
-         Buffer.from(pdfDocumentImageBytes)
-       );
-     }
-   );
    // Filter out any empty buffers to avoid sending invalid image URLs.
    const validImageBytesArray = pdfDocumentImageBytesArray.filter(imageBytes => imageBytes && imageBytes.length > 0);
    const imageAttachments = validImageBytesArray.map(imageBytes => ({
      type: 'image_url',
      image_url: {
        url: 'data:image/png;base64,' + Buffer.from(imageBytes).toString('base64'),
      },
    }));

-   const result = await this.openAiApiClient.chat.completions.create({
-     model: 'gpt-4o',
-     // response_format: { type: "json_object" }, // not supported for now
    const chatModel = this.options.chatModel ?? 'o4-mini';
    const requestParams: any = {
      model: chatModel,
      messages: [
        { role: 'system', content: optionsArg.systemMessage },
        ...optionsArg.messageHistory,
@@ -172,30 +194,22 @@ export class OpenAiProvider extends MultiModalModel {
          role: 'user',
          content: [
            { type: 'text', text: optionsArg.userMessage },
-           ...(() => {
-             const returnArray = [];
-             for (const imageBytes of pdfDocumentImageBytesArray) {
-               returnArray.push({
-                 type: 'image_url',
-                 image_url: {
-                   url: 'data:image/png;base64,' + Buffer.from(imageBytes).toString('base64'),
-                 },
-               });
-             }
-             return returnArray;
-           })(),
            ...imageAttachments,
          ],
        },
      ],
-   });
    };
    // Temperature parameter removed.
    const result = await this.openAiApiClient.chat.completions.create(requestParams);
    return {
      message: result.choices[0].message,
    };
  }

  public async vision(optionsArg: { image: Buffer; prompt: string }): Promise<string> {
-   const result = await this.openAiApiClient.chat.completions.create({
-     model: 'gpt-4-vision-preview',
    const visionModel = this.options.visionModel ?? '04-mini';
    const requestParams: any = {
      model: visionModel,
      messages: [
        {
          role: 'user',
@@ -211,8 +225,8 @@ export class OpenAiProvider extends MultiModalModel {
        }
      ],
      max_tokens: 300
-   });
    };
    const result = await this.openAiApiClient.chat.completions.create(requestParams);
    return result.choices[0].message.content || '';
  }
}


@@ -11,7 +11,6 @@ export interface IXAIProviderOptions {
export class XAIProvider extends MultiModalModel {
  private options: IXAIProviderOptions;
  public openAiApiClient: plugins.openai.default;
- public smartpdfInstance: plugins.smartpdf.SmartPdf;

  constructor(optionsArg: IXAIProviderOptions) {
    super();
@@ -19,14 +18,16 @@ export class XAIProvider extends MultiModalModel {
  }

  public async start() {
await super.start();
    this.openAiApiClient = new plugins.openai.default({
      apiKey: this.options.xaiToken,
      baseURL: 'https://api.x.ai/v1',
    });
-   this.smartpdfInstance = new plugins.smartpdf.SmartPdf();
  }

- public async stop() {}
  public async stop() {
await super.stop();
}
  public async chatStream(input: ReadableStream<Uint8Array>): Promise<ReadableStream<string>> {
    // Create a TextDecoder to handle incoming chunks