Compare commits

..

14 Commits

| SHA1 | Message | Checks (Default, tags) | Date |
|------|---------|------------------------|------|
| b78168307b | 0.7.5 | security and test failed; release and metadata skipped | 2025-10-08 22:56:53 +00:00 |
| bbd8770205 | fix(provider.elevenlabs): Update ElevenLabs default TTS model to eleven_v3 and add local Claude permissions file | | 2025-10-08 22:56:53 +00:00 |
| 28bb13dc0c | update | | 2025-10-08 22:49:08 +00:00 |
| 3a24c2c4bd | 0.7.4 | security and test failed; release and metadata skipped | 2025-10-03 15:47:15 +00:00 |
| 8244ac6eb0 | fix(provider.anthropic): Use image/png for embedded PDF images in Anthropic provider and add local Claude settings for development permissions | | 2025-10-03 15:47:15 +00:00 |
| 2791d738d6 | 0.7.3 | security and test failed; release and metadata skipped | 2025-10-03 14:21:25 +00:00 |
| 3fbd054985 | fix(tests): Add extensive provider/feature tests and local Claude CI permissions | | 2025-10-03 14:21:25 +00:00 |
| 8e8830ef92 | 0.7.2 | security and test failed; release and metadata skipped | 2025-10-03 13:51:49 +00:00 |
| 34931875ad | fix(anthropic): Update Anthropic provider branding to Claude Sonnet 4.5 and add local Claude permissions | | 2025-10-03 13:51:49 +00:00 |
| 2672509d3f | 0.7.1 | security and test failed; release and metadata skipped | 2025-10-03 13:49:46 +00:00 |
| ee3a635852 | fix(docs): Add README image generation docs and .claude local settings | | 2025-10-03 13:49:46 +00:00 |
| a222b1c2fa | 0.7.0 | security and test failed; release and metadata skipped | 2025-10-03 13:43:29 +00:00 |
| f0556e89f3 | feat(providers): Add research API and image generation/editing support; extend providers and tests | | 2025-10-03 13:43:29 +00:00 |
| fe8540c8ba | feat(research): Implement research APIs. | | 2025-10-03 12:50:42 +00:00 |
36 changed files with 3550 additions and 1192 deletions


@@ -1,5 +1,52 @@
# Changelog
## 2025-10-08 - 0.7.5 - fix(provider.elevenlabs)
Update ElevenLabs default TTS model to eleven_v3 and add local Claude permissions file
- Changed default ElevenLabs modelId from 'eleven_multilingual_v2' to 'eleven_v3' in ts/provider.elevenlabs.ts to use the newer/default TTS model.
- Added .claude/settings.local.json with a permissions allow-list for local Claude tooling and CI tasks.
## 2025-10-03 - 0.7.4 - fix(provider.anthropic)
Use image/png for embedded PDF images in Anthropic provider and add local Claude settings for development permissions
- AnthropicProvider: change media_type from 'image/jpeg' to 'image/png' when embedding images extracted from PDFs to ensure correct format in Anthropic requests.
- Add .claude/settings.local.json with development/testing permissions for local Claude usage (shell commands, webfetch, websearch, test/run tasks).
## 2025-10-03 - 0.7.3 - fix(tests)
Add extensive provider/feature tests and local Claude CI permissions
- Add many focused test files covering providers and features: OpenAI, Anthropic, Perplexity, Groq, Ollama, Exo, XAI (chat, audio, vision, document, research, image generation, stubs, interfaces, basic)
- Introduce .claude/settings.local.json to declare allowed permissions for local Claude/CI actions
- Replace older aggregated test files with modular per-feature tests (removed legacy combined tests and split into smaller suites)
- No changes to library runtime code — this change adds tests and CI/local agent configuration only
## 2025-10-03 - 0.7.2 - fix(anthropic)
Update Anthropic provider branding to Claude Sonnet 4.5 and add local Claude permissions
- Docs: Replace 'Claude 3 Opus' with 'Claude Sonnet 4.5' in README provider capabilities matrix.
- Config: Add .claude/settings.local.json to define local Claude permissions for tests and development commands.
## 2025-10-03 - 0.7.1 - fix(docs)
Add README image generation docs and .claude local settings
- Add .claude/settings.local.json with permission allow-list for local assistant tooling and web search
- Update README provider capabilities table to include an Images column and reference gpt-image-1
- Add Image Generation & Editing section with examples, options, and gpt-image-1 advantages
- Mark image generation support as implemented in the roadmap and remove duplicate entry
## 2025-10-03 - 0.7.0 - feat(providers)
Add research API and image generation/editing support; extend providers and tests
- Introduce ResearchOptions and ResearchResponse to the MultiModalModel interface and implement research() where supported
- OpenAiProvider: implement research(), add imageGenerate() and imageEdit() methods (gpt-image-1 / DALL·E support), and expose imageModel option
- AnthropicProvider: implement research() and vision handling; explicitly throw for unsupported image generation/editing
- PerplexityProvider: implement research() (sonar / sonar-pro support) and expose citation parsing
- Add image/document-related interfaces (ImageGenerateOptions, ImageEditOptions, ImageResponse) to abstract API
- Add image generation/editing/no-op stubs for other providers (Exo, Groq, Ollama, XAI) that throw informative errors to preserve API compatibility
- Add comprehensive OpenAI image generation tests and helper to save test outputs (test/test.image.openai.ts)
- Update README with Research & Web Search documentation, capability matrix, and roadmap entry for Research & Web Search API
- Add local Claude agent permissions file (.claude/settings.local.json) and various provider type/import updates
## 2025-09-28 - 0.6.1 - fix(provider.anthropic)
Fix Anthropic research tool identifier and add tests + local Claude permissions


@@ -1,6 +1,6 @@
 {
   "name": "@push.rocks/smartai",
-  "version": "0.6.1",
+  "version": "0.7.5",
   "private": false,
   "description": "SmartAi is a versatile TypeScript library designed to facilitate integration and interaction with various AI models, offering functionalities for chat, audio generation, document processing, and vision tasks.",
   "main": "dist_ts/index.js",
@@ -15,22 +15,23 @@
"buildDocs": "(tsdoc)" "buildDocs": "(tsdoc)"
}, },
"devDependencies": { "devDependencies": {
"@git.zone/tsbuild": "^2.6.4", "@git.zone/tsbuild": "^2.6.8",
"@git.zone/tsbundle": "^2.5.1", "@git.zone/tsbundle": "^2.5.1",
"@git.zone/tsrun": "^1.3.3", "@git.zone/tsrun": "^1.3.3",
"@git.zone/tstest": "^2.3.2", "@git.zone/tstest": "^2.3.8",
"@push.rocks/qenv": "^6.1.0", "@push.rocks/qenv": "^6.1.3",
"@push.rocks/tapbundle": "^6.0.3", "@push.rocks/tapbundle": "^6.0.3",
"@types/node": "^22.15.17" "@types/node": "^22.15.17",
"typescript": "^5.9.3"
}, },
"dependencies": { "dependencies": {
"@anthropic-ai/sdk": "^0.59.0", "@anthropic-ai/sdk": "^0.65.0",
"@push.rocks/smartarray": "^1.1.0", "@push.rocks/smartarray": "^1.1.0",
"@push.rocks/smartfile": "^11.2.5", "@push.rocks/smartfile": "^11.2.7",
"@push.rocks/smartpath": "^6.0.0", "@push.rocks/smartpath": "^6.0.0",
"@push.rocks/smartpdf": "^4.1.1", "@push.rocks/smartpdf": "^4.1.1",
"@push.rocks/smartpromise": "^4.2.3", "@push.rocks/smartpromise": "^4.2.3",
"@push.rocks/smartrequest": "^4.2.1", "@push.rocks/smartrequest": "^4.3.1",
"@push.rocks/webstream": "^1.0.10", "@push.rocks/webstream": "^1.0.10",
"openai": "^5.12.2" "openai": "^5.12.2"
}, },

pnpm-lock.yaml (generated, 2602 lines changed)

File diff suppressed because it is too large.

readme.md (175 lines changed)

@@ -5,7 +5,7 @@
 [![TypeScript](https://img.shields.io/badge/TypeScript-5.x-blue.svg)](https://www.typescriptlang.org/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-SmartAI unifies the world's leading AI providers - OpenAI, Anthropic, Perplexity, Ollama, Groq, XAI, and Exo - under a single, elegant TypeScript interface. Build AI applications at lightning speed without vendor lock-in.
+SmartAI unifies the world's leading AI providers - OpenAI, Anthropic, Perplexity, Ollama, Groq, XAI, Exo, and ElevenLabs - under a single, elegant TypeScript interface. Build AI applications at lightning speed without vendor lock-in.
 ## 🎯 Why SmartAI?
@@ -28,7 +28,11 @@ import { SmartAi } from '@push.rocks/smartai';
 // Initialize with your favorite providers
 const ai = new SmartAi({
   openaiToken: 'sk-...',
-  anthropicToken: 'sk-ant-...'
+  anthropicToken: 'sk-ant-...',
+  elevenlabsToken: 'sk-...',
+  elevenlabs: {
+    defaultVoiceId: '19STyYD15bswVz51nqLf' // Optional: Samara voice
+  }
 });
 await ai.start();
@@ -45,15 +49,16 @@ const response = await ai.openaiProvider.chat({
 Choose the right provider for your use case:
-| Provider | Chat | Streaming | TTS | Vision | Documents | Highlights |
-|----------|:----:|:---------:|:---:|:------:|:---------:|------------|
-| **OpenAI** | ✅ | ✅ | ✅ | ✅ | ✅ | • GPT-4, DALL-E 3<br>• Industry standard<br>• Most features |
-| **Anthropic** | ✅ | ✅ | ❌ | ✅ | ✅ | • Claude 3 Opus<br>• Superior reasoning<br>• 200k context |
-| **Ollama** | ✅ | ✅ | ❌ | ✅ | ✅ | • 100% local<br>• Privacy-first<br>• No API costs |
-| **XAI** | ✅ | ✅ | ❌ | ❌ | ✅ | • Grok models<br>• Real-time data<br>• Uncensored |
-| **Perplexity** | ✅ | ✅ | ❌ | ❌ | ❌ | • Web-aware<br>• Research-focused<br>• Citations |
-| **Groq** | ✅ | ✅ | ❌ | ❌ | ❌ | • 10x faster<br>• LPU inference<br>• Low latency |
-| **Exo** | ✅ | ✅ | ❌ | ❌ | ❌ | • Distributed<br>• P2P compute<br>• Decentralized |
+| Provider | Chat | Streaming | TTS | Vision | Documents | Research | Images | Highlights |
+|----------|:----:|:---------:|:---:|:------:|:---------:|:--------:|:------:|------------|
+| **OpenAI** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | • gpt-image-1<br>• DALL-E 3<br>• Deep research API |
+| **Anthropic** | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | • Claude Sonnet 4.5<br>• Superior reasoning<br>• Web search API |
+| **ElevenLabs** | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | • Premium TTS<br>• 70+ languages<br>• Natural voices |
+| **Ollama** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | • 100% local<br>• Privacy-first<br>• No API costs |
+| **XAI** | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | • Grok models<br>• Real-time data<br>• Uncensored |
+| **Perplexity** | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | • Web-aware<br>• Research-focused<br>• Sonar Pro models |
+| **Groq** | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | • 10x faster<br>• LPU inference<br>• Low latency |
+| **Exo** | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | • Distributed<br>• P2P compute<br>• Decentralized |
 ## 🎮 Core Features
@@ -105,13 +110,27 @@ while (true) {
 ### 🎙️ Text-to-Speech
-Generate natural voices with OpenAI:
+Generate natural voices with OpenAI or ElevenLabs:
 ```typescript
+// OpenAI TTS
 const audioStream = await ai.openaiProvider.audio({
   message: 'Welcome to the future of AI development!'
 });
+// ElevenLabs TTS - Premium quality, natural voices (uses v3 by default)
+const elevenLabsAudio = await ai.elevenlabsProvider.audio({
+  message: 'Experience the most lifelike text to speech technology.',
+  voiceId: '19STyYD15bswVz51nqLf', // Optional: Samara voice
+  modelId: 'eleven_v3', // Optional: defaults to eleven_v3 (70+ languages, most expressive)
+  voiceSettings: { // Optional: fine-tune voice characteristics
+    stability: 0.5, // 0-1: Speech consistency
+    similarity_boost: 0.8, // 0-1: Voice similarity to original
+    style: 0.0, // 0-1: Expressiveness (higher = more expressive)
+    use_speaker_boost: true // Enhanced clarity
+  }
+});
 // Stream directly to speakers
 audioStream.pipe(speakerOutput);
@@ -171,6 +190,132 @@ const taxAnalysis = await ai.anthropicProvider.document({
});
```
### 🔬 Research & Web Search
Perform deep research with web search capabilities across multiple providers:
```typescript
// OpenAI Deep Research - Comprehensive analysis
const deepResearch = await ai.openaiProvider.research({
query: 'What are the latest developments in quantum computing?',
searchDepth: 'deep',
includeWebSearch: true
});
console.log(deepResearch.answer);
console.log('Sources:', deepResearch.sources);
// Anthropic Web Search - Domain-filtered research
const anthropic = new AnthropicProvider({
anthropicToken: 'sk-ant-...',
enableWebSearch: true,
searchDomainAllowList: ['nature.com', 'science.org']
});
const scientificResearch = await anthropic.research({
query: 'Latest breakthroughs in CRISPR gene editing',
searchDepth: 'advanced'
});
// Perplexity - Research-focused with citations
const perplexityResearch = await ai.perplexityProvider.research({
query: 'Current state of autonomous vehicle technology',
searchDepth: 'deep' // Uses Sonar Pro model
});
```
**Research Options:**
- `searchDepth`: 'basic' | 'advanced' | 'deep'
- `maxSources`: Number of sources to include
- `includeWebSearch`: Enable web search (OpenAI)
- `background`: Run as background task (OpenAI)
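For illustration, here is a sketch combining these options in a single call (the query text and option values are placeholders, not taken from the diff above):
```typescript
const briefing = await ai.openaiProvider.research({
  query: 'Summarize recent developments in solid-state batteries', // placeholder query
  searchDepth: 'advanced',
  maxSources: 5,          // cap the number of cited sources
  includeWebSearch: true, // OpenAI: allow web search for standard models
  background: false       // OpenAI: set true to run as an async background task
});

// ResearchResponse exposes the answer plus the cited sources
for (const source of briefing.sources) {
  console.log(`${source.title} - ${source.url}`);
}
```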
**Supported Providers:**
- **OpenAI**: Deep Research API with specialized models (`o3-deep-research-2025-06-26`, `o4-mini-deep-research-2025-06-26`)
- **Anthropic**: Web Search API with domain filtering
- **Perplexity**: Sonar and Sonar Pro models with built-in citations
### 🎨 Image Generation & Editing
Generate and edit images with OpenAI's cutting-edge models:
```typescript
// Basic image generation with gpt-image-1
const image = await ai.openaiProvider.imageGenerate({
prompt: 'A futuristic robot assistant in a modern office, digital art',
model: 'gpt-image-1',
quality: 'high',
size: '1024x1024'
});
// Save the generated image
const imageBuffer = Buffer.from(image.images[0].b64_json!, 'base64');
fs.writeFileSync('robot.png', imageBuffer);
// Advanced: Transparent background with custom format
const logo = await ai.openaiProvider.imageGenerate({
prompt: 'Minimalist mountain peak logo, geometric design',
model: 'gpt-image-1',
quality: 'high',
size: '1024x1024',
background: 'transparent',
outputFormat: 'png'
});
// WebP with compression for web use
const webImage = await ai.openaiProvider.imageGenerate({
prompt: 'Product showcase: sleek smartphone on marble surface',
model: 'gpt-image-1',
quality: 'high',
size: '1536x1024',
outputFormat: 'webp',
outputCompression: 85
});
// Superior text rendering (gpt-image-1's strength)
const signage = await ai.openaiProvider.imageGenerate({
prompt: 'Vintage cafe sign saying "COFFEE & CODE" in hand-lettered typography',
model: 'gpt-image-1',
quality: 'high',
size: '1024x1024'
});
// Generate multiple variations at once
const variations = await ai.openaiProvider.imageGenerate({
prompt: 'Abstract geometric pattern, colorful minimalist art',
model: 'gpt-image-1',
n: 3,
quality: 'medium',
size: '1024x1024'
});
// Edit an existing image
const editedImage = await ai.openaiProvider.imageEdit({
image: originalImageBuffer,
prompt: 'Add sunglasses and change the background to a beach sunset',
model: 'gpt-image-1',
quality: 'high'
});
```
**Image Generation Options:**
- `model`: 'gpt-image-1' | 'dall-e-3' | 'dall-e-2'
- `quality`: 'low' | 'medium' | 'high' | 'auto'
- `size`: Multiple aspect ratios up to 4096×4096
- `background`: 'transparent' | 'opaque' | 'auto'
- `outputFormat`: 'png' | 'jpeg' | 'webp'
- `outputCompression`: 0-100 for webp/jpeg
- `moderation`: 'low' | 'auto'
- `n`: Number of images (1-10)
**gpt-image-1 Advantages:**
- Superior text rendering in images
- Up to 4096×4096 resolution
- Transparent background support
- Advanced output formats (WebP with compression)
- Better prompt understanding
- Streaming support for progressive rendering
### 🔄 Persistent Conversations
Maintain context across interactions:
@@ -422,6 +567,7 @@ npm install @push.rocks/smartai
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export PERPLEXITY_API_KEY=pplx-...
export ELEVENLABS_API_KEY=sk-...
# ... etc
```
@@ -447,6 +593,8 @@ export PERPLEXITY_API_KEY=pplx-...
| **General Purpose** | OpenAI | Most features, stable, well-documented |
| **Complex Reasoning** | Anthropic | Superior logical thinking, safer outputs |
| **Research & Facts** | Perplexity | Web-aware, provides citations |
| **Deep Research** | OpenAI | Deep Research API with comprehensive analysis |
| **Premium TTS** | ElevenLabs | Most natural voices, 70+ languages, superior quality (v3) |
| **Speed Critical** | Groq | 10x faster inference, sub-second responses |
| **Privacy Critical** | Ollama | 100% local, no data leaves your servers |
| **Real-time Data** | XAI | Access to current information |
@@ -454,8 +602,9 @@ export PERPLEXITY_API_KEY=pplx-...
 ## 📈 Roadmap
+- [x] Research & Web Search API
+- [x] Image generation support (gpt-image-1, DALL-E 3, DALL-E 2)
 - [ ] Streaming function calls
-- [ ] Image generation support
 - [ ] Voice input processing
 - [ ] Fine-tuning integration
 - [ ] Embedding support


@@ -1,177 +0,0 @@
# SmartAI Research API Implementation
This document describes the new research capabilities added to the SmartAI library, enabling web search and deep research features for OpenAI and Anthropic providers.
## Features Added
### 1. Research Method Interface
Added a new `research()` method to the `MultiModalModel` abstract class with the following interfaces:
```typescript
interface ResearchOptions {
query: string;
searchDepth?: 'basic' | 'advanced' | 'deep';
maxSources?: number;
includeWebSearch?: boolean;
background?: boolean;
}
interface ResearchResponse {
answer: string;
sources: Array<{
url: string;
title: string;
snippet: string;
}>;
searchQueries?: string[];
metadata?: any;
}
```
### 2. OpenAI Provider Research Implementation
The OpenAI provider now supports:
- **Deep Research API** with models:
- `o3-deep-research-2025-06-26` (comprehensive analysis)
- `o4-mini-deep-research-2025-06-26` (lightweight, faster)
- **Web Search** for standard models (gpt-5, o3, o3-pro, o4-mini)
- **Background processing** for async deep research tasks
### 3. Anthropic Provider Research Implementation
The Anthropic provider now supports:
- **Web Search API** with Claude models
- **Domain filtering** (allow/block lists)
- **Progressive searches** for comprehensive research
- **Citation extraction** from responses
### 4. Perplexity Provider Research Implementation
The Perplexity provider implements research using:
- **Sonar models** for standard searches
- **Sonar Pro** for deep research
- Built-in citation support
### 5. Other Providers
Added research method stubs to:
- Groq Provider
- Ollama Provider
- xAI Provider
- Exo Provider
These providers throw a "not yet supported" error when research is called, maintaining interface compatibility.
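As a rough sketch, such a stub can be as small as the following (illustrative only; the class name is invented, and the method signature is assumed from the `ResearchOptions`/`ResearchResponse` interfaces above):
```typescript
import type { ResearchOptions, ResearchResponse } from '@push.rocks/smartai'; // export path assumed

// Hypothetical provider without research support
class ExampleStubProvider {
  public async research(optionsArg: ResearchOptions): Promise<ResearchResponse> {
    // Keep the shared interface intact while signalling the missing capability
    throw new Error('Research capabilities are not yet supported for this provider.');
  }
}
```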
## Usage Examples
### Basic Research with OpenAI
```typescript
import { OpenAiProvider } from '@push.rocks/smartai';
const openai = new OpenAiProvider({
openaiToken: 'your-api-key',
researchModel: 'o4-mini-deep-research-2025-06-26'
});
await openai.start();
const result = await openai.research({
query: 'What are the latest developments in quantum computing?',
searchDepth: 'basic',
includeWebSearch: true
});
console.log(result.answer);
console.log('Sources:', result.sources);
```
### Deep Research with OpenAI
```typescript
const deepResult = await openai.research({
query: 'Comprehensive analysis of climate change mitigation strategies',
searchDepth: 'deep',
background: true
});
```
### Research with Anthropic
```typescript
import { AnthropicProvider } from '@push.rocks/smartai';
const anthropic = new AnthropicProvider({
anthropicToken: 'your-api-key',
enableWebSearch: true,
searchDomainAllowList: ['nature.com', 'science.org']
});
await anthropic.start();
const result = await anthropic.research({
query: 'Latest breakthroughs in CRISPR gene editing',
searchDepth: 'advanced'
});
```
### Research with Perplexity
```typescript
import { PerplexityProvider } from '@push.rocks/smartai';
const perplexity = new PerplexityProvider({
perplexityToken: 'your-api-key'
});
const result = await perplexity.research({
query: 'Current state of autonomous vehicle technology',
searchDepth: 'deep' // Uses Sonar Pro model
});
```
## Configuration Options
### OpenAI Provider
- `researchModel`: Specify deep research model (default: `o4-mini-deep-research-2025-06-26`)
- `enableWebSearch`: Enable web search for standard models
### Anthropic Provider
- `enableWebSearch`: Enable web search capabilities
- `searchDomainAllowList`: Array of allowed domains
- `searchDomainBlockList`: Array of blocked domains
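For illustration, these options might be wired up as follows (tokens are placeholders):
```typescript
import { OpenAiProvider, AnthropicProvider } from '@push.rocks/smartai';

// OpenAI: pick a deep research model and allow web search for standard models
const openai = new OpenAiProvider({
  openaiToken: 'your-api-key',
  researchModel: 'o3-deep-research-2025-06-26',
  enableWebSearch: true
});

// Anthropic: enable web search and constrain which domains may be cited
const anthropic = new AnthropicProvider({
  anthropicToken: 'your-api-key',
  enableWebSearch: true,
  searchDomainAllowList: ['nature.com', 'science.org']
  // searchDomainBlockList: ['example-spam-site.com'] // alternatively, block specific domains (placeholder)
});
```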
## API Pricing
- **OpenAI Deep Research**: $10 per 1,000 calls
- **Anthropic Web Search**: $10 per 1,000 searches + standard token costs
- **Perplexity Sonar**: $5 per 1,000 searches (Sonar Pro)
## Testing
Run the test suite:
```bash
pnpm test test/test.research.ts
```
All providers have been tested to ensure:
- Research methods are properly exposed
- Interfaces are correctly typed
- Unsupported providers throw appropriate errors
## Next Steps
Future enhancements could include:
1. Implementing Google Gemini Grounding API support
2. Adding Brave Search API integration
3. Implementing retry logic for rate limits
4. Adding caching for repeated queries
5. Supporting batch research operations
## Notes
- The implementation maintains backward compatibility
- All existing methods continue to work unchanged
- Research capabilities are optional and don't affect existing functionality


@@ -1,160 +0,0 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
import * as smartrequest from '@push.rocks/smartrequest';
import * as smartfile from '@push.rocks/smartfile';
const testQenv = new qenv.Qenv('./', './.nogit/');
import * as smartai from '../ts/index.js';
let anthropicProvider: smartai.AnthropicProvider;
tap.test('Anthropic: should create and start Anthropic provider', async () => {
anthropicProvider = new smartai.AnthropicProvider({
anthropicToken: await testQenv.getEnvVarOnDemand('ANTHROPIC_TOKEN'),
});
await anthropicProvider.start();
expect(anthropicProvider).toBeInstanceOf(smartai.AnthropicProvider);
});
tap.test('Anthropic: should create chat response', async () => {
const userMessage = 'What is the capital of France? Answer in one word.';
const response = await anthropicProvider.chat({
systemMessage: 'You are a helpful assistant. Be concise.',
userMessage: userMessage,
messageHistory: [],
});
console.log(`Anthropic Chat - User: ${userMessage}`);
console.log(`Anthropic Chat - Response: ${response.message}`);
expect(response.role).toEqual('assistant');
expect(response.message).toBeTruthy();
expect(response.message.toLowerCase()).toInclude('paris');
});
tap.test('Anthropic: should handle message history', async () => {
const messageHistory: smartai.ChatMessage[] = [
{ role: 'user', content: 'My name is Claude Test' },
{ role: 'assistant', content: 'Nice to meet you, Claude Test!' }
];
const response = await anthropicProvider.chat({
systemMessage: 'You are a helpful assistant with good memory.',
userMessage: 'What is my name?',
messageHistory: messageHistory,
});
console.log(`Anthropic Memory Test - Response: ${response.message}`);
expect(response.message.toLowerCase()).toInclude('claude test');
});
tap.test('Anthropic: should process vision tasks', async () => {
// Create a simple test image (1x1 red pixel JPEG)
// This is a valid 1x1 JPEG image
const redPixelBase64 = '/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCAABAAEDASIAAhEBAxEB/8QAFQABAQAAAAAAAAAAAAAAAAAAAAv/xAAUEAEAAAAAAAAAAAAAAAAAAAAA/8QAFQEBAQAAAAAAAAAAAAAAAAAAAAX/xAAUEQEAAAAAAAAAAAAAAAAAAAAA/9oADAMBAAIRAxEAPwCwAA8A/9k=';
const imageBuffer = Buffer.from(redPixelBase64, 'base64');
const result = await anthropicProvider.vision({
image: imageBuffer,
prompt: 'What color is this image? Answer with just the color name.'
});
console.log(`Anthropic Vision - Result: ${result}`);
expect(result).toBeTruthy();
expect(typeof result).toEqual('string');
});
tap.test('Anthropic: should document a PDF', async () => {
const pdfUrl = 'https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf';
const pdfResponse = await smartrequest.SmartRequest.create()
.url(pdfUrl)
.get();
const result = await anthropicProvider.document({
systemMessage: 'Classify the document. Only the following answers are allowed: "invoice", "bank account statement", "contract", "test document", "other". The answer should only contain the keyword for machine use.',
userMessage: 'Classify this document.',
messageHistory: [],
pdfDocuments: [Buffer.from(await pdfResponse.arrayBuffer())],
});
console.log(`Anthropic Document - Result:`, result);
expect(result).toBeTruthy();
expect(result.message).toBeTruthy();
});
tap.test('Anthropic: should handle complex document analysis', async () => {
// Test with the demo PDF if it exists
const pdfPath = './.nogit/demo_without_textlayer.pdf';
let pdfBuffer: Uint8Array;
try {
pdfBuffer = await smartfile.fs.toBuffer(pdfPath);
} catch (error) {
// If the file doesn't exist, use the dummy PDF
console.log('Demo PDF not found, using dummy PDF instead');
const pdfUrl = 'https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf';
const pdfResponse = await smartrequest.SmartRequest.create()
.url(pdfUrl)
.get();
pdfBuffer = Buffer.from(await pdfResponse.arrayBuffer());
}
const result = await anthropicProvider.document({
systemMessage: `
Analyze this document and provide a JSON response with the following structure:
{
"documentType": "string",
"hasText": boolean,
"summary": "string"
}
`,
userMessage: 'Analyze this document.',
messageHistory: [],
pdfDocuments: [pdfBuffer],
});
console.log(`Anthropic Complex Document Analysis:`, result);
expect(result).toBeTruthy();
expect(result.message).toBeTruthy();
});
tap.test('Anthropic: should handle errors gracefully', async () => {
// Test with invalid message (empty)
let errorCaught = false;
try {
await anthropicProvider.chat({
systemMessage: '',
userMessage: '',
messageHistory: [],
});
} catch (error) {
errorCaught = true;
console.log('Expected error caught:', error.message);
}
// Anthropic might handle empty messages, so we don't assert error
console.log(`Error handling test - Error caught: ${errorCaught}`);
});
tap.test('Anthropic: audio should throw not supported error', async () => {
let errorCaught = false;
try {
await anthropicProvider.audio({
message: 'This should fail'
});
} catch (error) {
errorCaught = true;
expect(error.message).toInclude('not yet supported');
}
expect(errorCaught).toBeTrue();
});
tap.test('Anthropic: should stop the provider', async () => {
await anthropicProvider.stop();
console.log('Anthropic provider stopped successfully');
});
export default tap.start();


@@ -0,0 +1,54 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
import * as smartfile from '@push.rocks/smartfile';
const testQenv = new qenv.Qenv('./', './.nogit/');
import * as smartai from '../ts/index.js';
let testSmartai: smartai.SmartAi;
tap.test('ElevenLabs Audio: should create a smartai instance with ElevenLabs provider', async () => {
testSmartai = new smartai.SmartAi({
elevenlabsToken: await testQenv.getEnvVarOnDemand('ELEVENLABS_TOKEN'),
elevenlabs: {
defaultVoiceId: '19STyYD15bswVz51nqLf',
},
});
await testSmartai.start();
});
tap.test('ElevenLabs Audio: should create audio response', async () => {
const audioStream = await testSmartai.elevenlabsProvider.audio({
message: 'Welcome to SmartAI, the unified interface for the world\'s leading artificial intelligence providers. SmartAI brings together OpenAI, Anthropic, Perplexity, and ElevenLabs under a single elegant TypeScript API. Whether you need text generation, vision analysis, document processing, or premium text-to-speech capabilities, SmartAI provides a consistent and powerful interface for all your AI needs. Build intelligent applications at lightning speed without vendor lock-in.',
});
const chunks: Uint8Array[] = [];
for await (const chunk of audioStream) {
chunks.push(chunk as Uint8Array);
}
const audioBuffer = Buffer.concat(chunks);
await smartfile.fs.toFs(audioBuffer, './.nogit/testoutput_elevenlabs.mp3');
console.log(`Audio Buffer length: ${audioBuffer.length}`);
expect(audioBuffer.length).toBeGreaterThan(0);
});
tap.test('ElevenLabs Audio: should create audio with custom voice', async () => {
const audioStream = await testSmartai.elevenlabsProvider.audio({
message: 'Testing with a different voice.',
voiceId: 'JBFqnCBsd6RMkjVDRZzb',
});
const chunks: Uint8Array[] = [];
for await (const chunk of audioStream) {
chunks.push(chunk as Uint8Array);
}
const audioBuffer = Buffer.concat(chunks);
await smartfile.fs.toFs(audioBuffer, './.nogit/testoutput_elevenlabs_custom.mp3');
console.log(`Audio Buffer length (custom voice): ${audioBuffer.length}`);
expect(audioBuffer.length).toBeGreaterThan(0);
});
tap.test('ElevenLabs Audio: should stop the smartai instance', async () => {
await testSmartai.stop();
});
export default tap.start();

test/test.audio.openai.ts (new file, 39 lines)

@@ -0,0 +1,39 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
import * as smartfile from '@push.rocks/smartfile';
const testQenv = new qenv.Qenv('./', './.nogit/');
import * as smartai from '../ts/index.js';
let testSmartai: smartai.SmartAi;
tap.test('OpenAI Audio: should create a smartai instance with OpenAI provider', async () => {
testSmartai = new smartai.SmartAi({
openaiToken: await testQenv.getEnvVarOnDemand('OPENAI_TOKEN'),
});
await testSmartai.start();
});
tap.test('OpenAI Audio: should create audio response', async () => {
// Call the audio method with a sample message.
const audioStream = await testSmartai.openaiProvider.audio({
message: 'This is a test of audio generation.',
});
// Read all chunks from the stream.
const chunks: Uint8Array[] = [];
for await (const chunk of audioStream) {
chunks.push(chunk as Uint8Array);
}
const audioBuffer = Buffer.concat(chunks);
await smartfile.fs.toFs(audioBuffer, './.nogit/testoutput.mp3');
console.log(`Audio Buffer length: ${audioBuffer.length}`);
// Assert that the resulting buffer is not empty.
expect(audioBuffer.length).toBeGreaterThan(0);
});
tap.test('OpenAI Audio: should stop the smartai instance', async () => {
await testSmartai.stop();
});
export default tap.start();

test/test.audio.stubs.ts (new file, 36 lines)

@@ -0,0 +1,36 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
const testQenv = new qenv.Qenv('./', './.nogit/');
import * as smartai from '../ts/index.js';
let anthropicProvider: smartai.AnthropicProvider;
tap.test('Audio Stubs: should create Anthropic provider', async () => {
anthropicProvider = new smartai.AnthropicProvider({
anthropicToken: await testQenv.getEnvVarOnDemand('ANTHROPIC_TOKEN'),
});
await anthropicProvider.start();
});
tap.test('Audio Stubs: Anthropic audio should throw not supported error', async () => {
let errorCaught = false;
try {
await anthropicProvider.audio({
message: 'This should fail'
});
} catch (error) {
errorCaught = true;
expect(error.message).toInclude('not yet supported');
}
expect(errorCaught).toBeTrue();
});
tap.test('Audio Stubs: should stop Anthropic provider', async () => {
await anthropicProvider.stop();
});
export default tap.start();


@@ -0,0 +1,72 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
const testQenv = new qenv.Qenv('./', './.nogit/');
import * as smartai from '../ts/index.js';
let anthropicProvider: smartai.AnthropicProvider;
tap.test('Anthropic Chat: should create and start Anthropic provider', async () => {
anthropicProvider = new smartai.AnthropicProvider({
anthropicToken: await testQenv.getEnvVarOnDemand('ANTHROPIC_TOKEN'),
});
await anthropicProvider.start();
expect(anthropicProvider).toBeInstanceOf(smartai.AnthropicProvider);
});
tap.test('Anthropic Chat: should create chat response', async () => {
const userMessage = 'What is the capital of France? Answer in one word.';
const response = await anthropicProvider.chat({
systemMessage: 'You are a helpful assistant. Be concise.',
userMessage: userMessage,
messageHistory: [],
});
console.log(`Anthropic Chat - User: ${userMessage}`);
console.log(`Anthropic Chat - Response: ${response.message}`);
expect(response.role).toEqual('assistant');
expect(response.message).toBeTruthy();
expect(response.message.toLowerCase()).toInclude('paris');
});
tap.test('Anthropic Chat: should handle message history', async () => {
const messageHistory: smartai.ChatMessage[] = [
{ role: 'user', content: 'My name is Claude Test' },
{ role: 'assistant', content: 'Nice to meet you, Claude Test!' }
];
const response = await anthropicProvider.chat({
systemMessage: 'You are a helpful assistant with good memory.',
userMessage: 'What is my name?',
messageHistory: messageHistory,
});
console.log(`Anthropic Memory Test - Response: ${response.message}`);
expect(response.message.toLowerCase()).toInclude('claude test');
});
tap.test('Anthropic Chat: should handle errors gracefully', async () => {
// Test with invalid message (empty)
let errorCaught = false;
try {
await anthropicProvider.chat({
systemMessage: '',
userMessage: '',
messageHistory: [],
});
} catch (error) {
errorCaught = true;
console.log('Expected error caught:', error.message);
}
// Anthropic might handle empty messages, so we don't assert error
console.log(`Error handling test - Error caught: ${errorCaught}`);
});
tap.test('Anthropic Chat: should stop the provider', async () => {
await anthropicProvider.stop();
});
export default tap.start();

test/test.chat.openai.ts (new file, 34 lines)

@@ -0,0 +1,34 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
const testQenv = new qenv.Qenv('./', './.nogit/');
import * as smartai from '../ts/index.js';
let testSmartai: smartai.SmartAi;
tap.test('OpenAI Chat: should create a smartai instance with OpenAI provider', async () => {
testSmartai = new smartai.SmartAi({
openaiToken: await testQenv.getEnvVarOnDemand('OPENAI_TOKEN'),
});
await testSmartai.start();
});
tap.test('OpenAI Chat: should create chat response', async () => {
const userMessage = 'How are you?';
const response = await testSmartai.openaiProvider.chat({
systemMessage: 'Hello',
userMessage: userMessage,
messageHistory: [],
});
console.log(`userMessage: ${userMessage}`);
console.log(response.message);
expect(response.role).toEqual('assistant');
expect(response.message).toBeTruthy();
});
tap.test('OpenAI Chat: should stop the smartai instance', async () => {
await testSmartai.stop();
});
export default tap.start();


@@ -0,0 +1,78 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
import * as smartrequest from '@push.rocks/smartrequest';
import * as smartfile from '@push.rocks/smartfile';
const testQenv = new qenv.Qenv('./', './.nogit/');
import * as smartai from '../ts/index.js';
let anthropicProvider: smartai.AnthropicProvider;
tap.test('Anthropic Document: should create and start Anthropic provider', async () => {
anthropicProvider = new smartai.AnthropicProvider({
anthropicToken: await testQenv.getEnvVarOnDemand('ANTHROPIC_TOKEN'),
});
await anthropicProvider.start();
expect(anthropicProvider).toBeInstanceOf(smartai.AnthropicProvider);
});
tap.test('Anthropic Document: should document a PDF', async () => {
const pdfUrl = 'https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf';
const pdfResponse = await smartrequest.SmartRequest.create()
.url(pdfUrl)
.get();
const result = await anthropicProvider.document({
systemMessage: 'Classify the document. Only the following answers are allowed: "invoice", "bank account statement", "contract", "test document", "other". The answer should only contain the keyword for machine use.',
userMessage: 'Classify this document.',
messageHistory: [],
pdfDocuments: [Buffer.from(await pdfResponse.arrayBuffer())],
});
console.log(`Anthropic Document - Result:`, result);
expect(result).toBeTruthy();
expect(result.message).toBeTruthy();
});
tap.test('Anthropic Document: should handle complex document analysis', async () => {
// Test with the demo PDF if it exists
const pdfPath = './.nogit/demo_without_textlayer.pdf';
let pdfBuffer: Uint8Array;
try {
pdfBuffer = await smartfile.fs.toBuffer(pdfPath);
} catch (error) {
// If the file doesn't exist, use the dummy PDF
console.log('Demo PDF not found, using dummy PDF instead');
const pdfUrl = 'https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf';
const pdfResponse = await smartrequest.SmartRequest.create()
.url(pdfUrl)
.get();
pdfBuffer = Buffer.from(await pdfResponse.arrayBuffer());
}
const result = await anthropicProvider.document({
systemMessage: `
Analyze this document and provide a JSON response with the following structure:
{
"documentType": "string",
"hasText": boolean,
"summary": "string"
}
`,
userMessage: 'Analyze this document.',
messageHistory: [],
pdfDocuments: [pdfBuffer],
});
console.log(`Anthropic Complex Document Analysis:`, result);
expect(result).toBeTruthy();
expect(result.message).toBeTruthy();
});
tap.test('Anthropic Document: should stop the provider', async () => {
await anthropicProvider.stop();
});
export default tap.start();


@@ -9,25 +9,14 @@ import * as smartai from '../ts/index.js';
 let testSmartai: smartai.SmartAi;
-tap.test('OpenAI: should create a smartai instance with OpenAI provider', async () => {
+tap.test('OpenAI Document: should create a smartai instance with OpenAI provider', async () => {
   testSmartai = new smartai.SmartAi({
     openaiToken: await testQenv.getEnvVarOnDemand('OPENAI_TOKEN'),
   });
   await testSmartai.start();
 });
-tap.test('OpenAI: should create chat response', async () => {
-  const userMessage = 'How are you?';
-  const response = await testSmartai.openaiProvider.chat({
-    systemMessage: 'Hello',
-    userMessage: userMessage,
-    messageHistory: [],
-  });
-  console.log(`userMessage: ${userMessage}`);
-  console.log(response.message);
-});
-tap.test('OpenAI: should document a pdf', async () => {
+tap.test('OpenAI Document: should document a pdf', async () => {
   const pdfUrl = 'https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf';
   const pdfResponse = await smartrequest.SmartRequest.create()
     .url(pdfUrl)
@@ -39,9 +28,10 @@ tap.test('OpenAI: should document a pdf', async () => {
     pdfDocuments: [Buffer.from(await pdfResponse.arrayBuffer())],
   });
   console.log(result);
+  expect(result.message).toBeTruthy();
 });
-tap.test('OpenAI: should recognize companies in a pdf', async () => {
+tap.test('OpenAI Document: should recognize companies in a pdf', async () => {
   const pdfBuffer = await smartfile.fs.toBuffer('./.nogit/demo_without_textlayer.pdf');
   const result = await testSmartai.openaiProvider.document({
     systemMessage: `
@@ -76,26 +66,10 @@ tap.test('OpenAI: should recognize companies in a pdf', async () => {
     pdfDocuments: [pdfBuffer],
   });
   console.log(result);
+  expect(result.message).toBeTruthy();
 });
-tap.test('OpenAI: should create audio response', async () => {
-  // Call the audio method with a sample message.
-  const audioStream = await testSmartai.openaiProvider.audio({
-    message: 'This is a test of audio generation.',
-  });
-  // Read all chunks from the stream.
-  const chunks: Uint8Array[] = [];
-  for await (const chunk of audioStream) {
-    chunks.push(chunk as Uint8Array);
-  }
-  const audioBuffer = Buffer.concat(chunks);
-  await smartfile.fs.toFs(audioBuffer, './.nogit/testoutput.mp3');
-  console.log(`Audio Buffer length: ${audioBuffer.length}`);
-  // Assert that the resulting buffer is not empty.
-  expect(audioBuffer.length).toBeGreaterThan(0);
-});
-tap.test('OpenAI: should stop the smartai instance', async () => {
+tap.test('OpenAI Document: should stop the smartai instance', async () => {
   await testSmartai.stop();
 });

test/test.image.openai.ts (new file, 203 lines)

@@ -0,0 +1,203 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
import * as smartai from '../ts/index.js';
import * as path from 'path';
import { promises as fs } from 'fs';
const testQenv = new qenv.Qenv('./', './.nogit/');
let openaiProvider: smartai.OpenAiProvider;
// Helper function to save image results
async function saveImageResult(testName: string, result: any) {
const sanitizedName = testName.replace(/[^a-z0-9]/gi, '_').toLowerCase();
const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
const filename = `openai_${sanitizedName}_${timestamp}.json`;
const filepath = path.join('.nogit', 'testresults', 'images', filename);
await fs.mkdir(path.dirname(filepath), { recursive: true });
await fs.writeFile(filepath, JSON.stringify(result, null, 2), 'utf-8');
console.log(` 💾 Saved to: ${filepath}`);
// Also save the actual image if b64_json is present
if (result.images && result.images[0]?.b64_json) {
const imageFilename = `openai_${sanitizedName}_${timestamp}.png`;
const imageFilepath = path.join('.nogit', 'testresults', 'images', imageFilename);
await fs.writeFile(imageFilepath, Buffer.from(result.images[0].b64_json, 'base64'));
console.log(` 🖼️ Image saved to: ${imageFilepath}`);
}
}
tap.test('OpenAI Image Generation: should initialize provider', async () => {
const openaiToken = await testQenv.getEnvVarOnDemand('OPENAI_TOKEN');
expect(openaiToken).toBeTruthy();
openaiProvider = new smartai.OpenAiProvider({
openaiToken,
imageModel: 'gpt-image-1'
});
await openaiProvider.start();
expect(openaiProvider).toBeInstanceOf(smartai.OpenAiProvider);
});
tap.test('OpenAI Image: Basic generation with gpt-image-1', async () => {
const result = await openaiProvider.imageGenerate({
prompt: 'A cute robot reading a book in a cozy library, digital art style',
model: 'gpt-image-1',
quality: 'medium',
size: '1024x1024'
});
console.log('Basic gpt-image-1 Generation:');
console.log('- Images generated:', result.images.length);
console.log('- Model used:', result.metadata?.model);
console.log('- Quality:', result.metadata?.quality);
console.log('- Size:', result.metadata?.size);
console.log('- Tokens used:', result.metadata?.tokensUsed);
await saveImageResult('basic_generation_gptimage1', result);
expect(result.images).toBeTruthy();
expect(result.images.length).toEqual(1);
expect(result.images[0].b64_json).toBeTruthy();
expect(result.metadata?.model).toEqual('gpt-image-1');
});
tap.test('OpenAI Image: High quality with transparent background', async () => {
const result = await openaiProvider.imageGenerate({
prompt: 'A simple geometric logo of a mountain peak, minimal design, clean lines',
model: 'gpt-image-1',
quality: 'high',
size: '1024x1024',
background: 'transparent',
outputFormat: 'png'
});
console.log('High Quality Transparent:');
console.log('- Quality:', result.metadata?.quality);
console.log('- Background: transparent');
console.log('- Format:', result.metadata?.outputFormat);
console.log('- Tokens used:', result.metadata?.tokensUsed);
await saveImageResult('high_quality_transparent', result);
expect(result.images.length).toEqual(1);
expect(result.images[0].b64_json).toBeTruthy();
});
tap.test('OpenAI Image: WebP format with compression', async () => {
const result = await openaiProvider.imageGenerate({
prompt: 'A futuristic cityscape at sunset with flying cars, photorealistic',
model: 'gpt-image-1',
quality: 'high',
size: '1536x1024',
outputFormat: 'webp',
outputCompression: 85
});
console.log('WebP with Compression:');
console.log('- Format:', result.metadata?.outputFormat);
console.log('- Compression: 85%');
console.log('- Size:', result.metadata?.size);
await saveImageResult('webp_compression', result);
expect(result.images.length).toEqual(1);
expect(result.images[0].b64_json).toBeTruthy();
});
tap.test('OpenAI Image: Text rendering with gpt-image-1', async () => {
const result = await openaiProvider.imageGenerate({
prompt: 'A vintage cafe sign that says "COFFEE & CODE" in elegant hand-lettered typography, warm colors',
model: 'gpt-image-1',
quality: 'high',
size: '1024x1024'
});
console.log('Text Rendering:');
console.log('- Prompt includes text: "COFFEE & CODE"');
console.log('- gpt-image-1 has superior text rendering');
console.log('- Tokens used:', result.metadata?.tokensUsed);
await saveImageResult('text_rendering', result);
expect(result.images.length).toEqual(1);
expect(result.images[0].b64_json).toBeTruthy();
});
tap.test('OpenAI Image: Multiple images generation', async () => {
const result = await openaiProvider.imageGenerate({
prompt: 'Abstract colorful geometric patterns, modern minimalist art',
model: 'gpt-image-1',
n: 2,
quality: 'medium',
size: '1024x1024'
});
console.log('Multiple Images:');
console.log('- Images requested: 2');
console.log('- Images generated:', result.images.length);
await saveImageResult('multiple_images', result);
expect(result.images.length).toEqual(2);
expect(result.images[0].b64_json).toBeTruthy();
expect(result.images[1].b64_json).toBeTruthy();
});
tap.test('OpenAI Image: Low moderation setting', async () => {
const result = await openaiProvider.imageGenerate({
prompt: 'A fantasy battle scene with warriors and dragons',
model: 'gpt-image-1',
moderation: 'low',
quality: 'medium'
});
console.log('Low Moderation:');
console.log('- Moderation: low (less restrictive filtering)');
console.log('- Tokens used:', result.metadata?.tokensUsed);
await saveImageResult('low_moderation', result);
expect(result.images.length).toEqual(1);
expect(result.images[0].b64_json).toBeTruthy();
});
tap.test('OpenAI Image Editing: edit with gpt-image-1', async () => {
// First, generate a base image
const baseResult = await openaiProvider.imageGenerate({
prompt: 'A simple white cat sitting on a red cushion',
model: 'gpt-image-1',
quality: 'low',
size: '1024x1024'
});
const baseImageBuffer = Buffer.from(baseResult.images[0].b64_json!, 'base64');
// Now edit it
const editResult = await openaiProvider.imageEdit({
image: baseImageBuffer,
prompt: 'Change the cat to orange and add stylish sunglasses',
model: 'gpt-image-1',
quality: 'medium'
});
console.log('Image Editing:');
console.log('- Base image created');
console.log('- Edit: change color and add sunglasses');
console.log('- Result images:', editResult.images.length);
await saveImageResult('image_edit', editResult);
expect(editResult.images.length).toEqual(1);
expect(editResult.images[0].b64_json).toBeTruthy();
});
tap.test('OpenAI Image: should clean up provider', async () => {
await openaiProvider.stop();
console.log('OpenAI image provider stopped successfully');
});
export default tap.start();


@@ -1,9 +1,24 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
import * as smartai from '../ts/index.js';
import * as path from 'path';
import { promises as fs } from 'fs';
const testQenv = new qenv.Qenv('./', './.nogit/');
// Helper function to save research results
async function saveResearchResult(testName: string, result: any) {
const sanitizedName = testName.replace(/[^a-z0-9]/gi, '_').toLowerCase();
const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
const filename = `${sanitizedName}_${timestamp}.json`;
const filepath = path.join('.nogit', 'testresults', 'research', filename);
await fs.mkdir(path.dirname(filepath), { recursive: true });
await fs.writeFile(filepath, JSON.stringify(result, null, 2), 'utf-8');
console.log(` 💾 Saved to: ${filepath}`);
}
let anthropicProvider: smartai.AnthropicProvider;
tap.test('Anthropic Research: should initialize provider with web search', async () => {
@@ -28,6 +43,8 @@ tap.test('Anthropic Research: should perform basic research query', async () =>
console.log('- Sources found:', result.sources.length);
console.log('- First 200 chars:', result.answer.substring(0, 200));
await saveResearchResult('basic_research_machine_learning', result);
expect(result).toBeTruthy();
expect(result.answer).toBeTruthy();
expect(result.answer.toLowerCase()).toInclude('machine learning');
@@ -50,6 +67,8 @@ tap.test('Anthropic Research: should perform research with web search', async ()
console.log('- Search queries:', result.searchQueries);
}
await saveResearchResult('web_search_renewable_energy', result);
expect(result.answer).toBeTruthy();
expect(result.answer.toLowerCase()).toInclude('renewable');
@@ -70,6 +89,8 @@ tap.test('Anthropic Research: should handle deep research queries', async () =>
console.log('- Answer length:', result.answer.length);
console.log('- Token usage:', result.metadata?.tokensUsed);
await saveResearchResult('deep_research_rest_vs_graphql', result);
expect(result.answer).toBeTruthy();
expect(result.answer.length).toBeGreaterThan(300);
expect(result.answer.toLowerCase()).toInclude('rest');
@@ -87,6 +108,8 @@ tap.test('Anthropic Research: should extract citations from response', async ()
console.log('- Sources found:', result.sources.length);
console.log('- Answer includes Docker:', result.answer.toLowerCase().includes('docker'));
await saveResearchResult('citation_extraction_docker', result);
expect(result.answer).toInclude('Docker');
// Check for URL extraction (both markdown and plain URLs)
@@ -114,6 +137,8 @@ tap.test('Anthropic Research: should use domain filtering when configured', asyn
console.log('- Answer length:', result.answer.length);
console.log('- Applied domain filters (allow: wikipedia, docs.microsoft)');
await saveResearchResult('domain_filtering_javascript', result);
expect(result.answer).toBeTruthy();
expect(result.answer.toLowerCase()).toInclude('javascript');
@@ -156,6 +181,9 @@ tap.test('Anthropic Research: should handle different search depths', async () =
console.log('- Basic tokens:', basicResult.metadata?.tokensUsed);
console.log('- Advanced tokens:', advancedResult.metadata?.tokensUsed);
await saveResearchResult('search_depth_python_basic', basicResult);
await saveResearchResult('search_depth_python_advanced', advancedResult);
expect(basicResult.answer).toBeTruthy();
expect(advancedResult.answer).toBeTruthy();
@@ -165,6 +193,28 @@ tap.test('Anthropic Research: should handle different search depths', async () =
expect(advancedResult.answer.toLowerCase()).toInclude('python');
});
tap.test('Anthropic Research: ARM vs. Qualcomm comparison', async () => {
const result = await anthropicProvider.research({
query: 'Compare ARM and Qualcomm: their technologies, market positions, and recent developments in the mobile and computing sectors',
searchDepth: 'advanced',
includeWebSearch: true,
maxSources: 10
});
console.log('ARM vs. Qualcomm Research:');
console.log('- Answer length:', result.answer.length);
console.log('- Sources found:', result.sources.length);
console.log('- First 300 chars:', result.answer.substring(0, 300));
await saveResearchResult('arm_vs_qualcomm_comparison', result);
expect(result.answer).toBeTruthy();
expect(result.answer.length).toBeGreaterThan(500);
expect(result.answer.toLowerCase()).toInclude('arm');
expect(result.answer.toLowerCase()).toInclude('qualcomm');
expect(result.sources.length).toBeGreaterThan(0);
});
tap.test('Anthropic Research: should clean up provider', async () => {
await anthropicProvider.stop();
console.log('Anthropic research provider stopped successfully');


@@ -1,9 +1,24 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
import * as smartai from '../ts/index.js';
import * as path from 'path';
import { promises as fs } from 'fs';
const testQenv = new qenv.Qenv('./', './.nogit/');
// Helper function to save research results
async function saveResearchResult(testName: string, result: any) {
const sanitizedName = testName.replace(/[^a-z0-9]/gi, '_').toLowerCase();
const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
const filename = `openai_${sanitizedName}_${timestamp}.json`;
const filepath = path.join('.nogit', 'testresults', 'research', filename);
await fs.mkdir(path.dirname(filepath), { recursive: true });
await fs.writeFile(filepath, JSON.stringify(result, null, 2), 'utf-8');
console.log(` 💾 Saved to: ${filepath}`);
}
let openaiProvider: smartai.OpenAiProvider;
tap.test('OpenAI Research: should initialize provider with research capabilities', async () => {
@@ -29,6 +44,8 @@ tap.test('OpenAI Research: should perform basic research query', async () => {
console.log('- Sources found:', result.sources.length);
console.log('- First 200 chars:', result.answer.substring(0, 200));
await saveResearchResult('basic_research_typescript', result);
expect(result).toBeTruthy();
expect(result.answer).toBeTruthy();
expect(result.answer.toLowerCase()).toInclude('typescript');
@@ -52,6 +69,8 @@ tap.test('OpenAI Research: should perform research with web search enabled', asy
console.log('- Search queries used:', result.searchQueries);
}
await saveResearchResult('web_search_ecmascript', result);
expect(result.answer).toBeTruthy();
expect(result.answer.toLowerCase()).toInclude('ecmascript');
@@ -98,6 +117,8 @@ tap.test('OpenAI Research: should extract sources from markdown links', async ()
console.log('OpenAI Source Extraction:');
console.log('- Sources found:', result.sources.length);
await saveResearchResult('source_extraction_nodejs', result);
if (result.sources.length > 0) {
console.log('- Example source:', result.sources[0]);
expect(result.sources[0].url).toBeTruthy();


@@ -0,0 +1,95 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as qenv from '@push.rocks/qenv';
import * as smartfile from '@push.rocks/smartfile';
const testQenv = new qenv.Qenv('./', './.nogit/');
import * as smartai from '../ts/index.js';
let anthropicProvider: smartai.AnthropicProvider;
tap.test('Anthropic Vision: should create and start Anthropic provider', async () => {
anthropicProvider = new smartai.AnthropicProvider({
anthropicToken: await testQenv.getEnvVarOnDemand('ANTHROPIC_TOKEN'),
});
await anthropicProvider.start();
expect(anthropicProvider).toBeInstanceOf(smartai.AnthropicProvider);
});
tap.test('Anthropic Vision: should analyze coffee image with latte art', async () => {
// Test 1: Coffee image from Unsplash by Dani
const imagePath = './test/testimages/coffee-dani/coffee.jpg';
console.log(`Loading coffee image from: ${imagePath}`);
const imageBuffer = await smartfile.fs.toBuffer(imagePath);
console.log(`Image loaded, size: ${imageBuffer.length} bytes`);
const result = await anthropicProvider.vision({
image: imageBuffer,
prompt: 'Describe this coffee image. What do you see in terms of the cup, foam pattern, and overall composition?'
});
console.log(`Anthropic Vision (Coffee) - Result: ${result}`);
expect(result).toBeTruthy();
expect(typeof result).toEqual('string');
expect(result.toLowerCase()).toInclude('coffee');
// The image has a heart pattern in the latte art
const mentionsLatte = result.toLowerCase().includes('heart') ||
result.toLowerCase().includes('latte') ||
result.toLowerCase().includes('foam');
expect(mentionsLatte).toBeTrue();
});
tap.test('Anthropic Vision: should analyze laptop/workspace image', async () => {
// Test 2: Laptop image from Unsplash by Nicolas Bichon
const imagePath = './test/testimages/laptop-nicolas/laptop.jpg';
console.log(`Loading laptop image from: ${imagePath}`);
const imageBuffer = await smartfile.fs.toBuffer(imagePath);
console.log(`Image loaded, size: ${imageBuffer.length} bytes`);
const result = await anthropicProvider.vision({
image: imageBuffer,
prompt: 'Describe the technology and workspace setup in this image. What devices and equipment can you see?'
});
console.log(`Anthropic Vision (Laptop) - Result: ${result}`);
expect(result).toBeTruthy();
expect(typeof result).toEqual('string');
// Should mention laptop, computer, keyboard, or desk
const mentionsTech = result.toLowerCase().includes('laptop') ||
result.toLowerCase().includes('computer') ||
result.toLowerCase().includes('keyboard') ||
result.toLowerCase().includes('desk');
expect(mentionsTech).toBeTrue();
});
tap.test('Anthropic Vision: should analyze receipt/document image', async () => {
// Test 3: Receipt image from Unsplash by Annie Spratt
const imagePath = './test/testimages/receipt-annie/receipt.jpg';
console.log(`Loading receipt image from: ${imagePath}`);
const imageBuffer = await smartfile.fs.toBuffer(imagePath);
console.log(`Image loaded, size: ${imageBuffer.length} bytes`);
const result = await anthropicProvider.vision({
image: imageBuffer,
prompt: 'What type of document is this? Can you identify any text or numbers visible in the image?'
});
console.log(`Anthropic Vision (Receipt) - Result: ${result}`);
expect(result).toBeTruthy();
expect(typeof result).toEqual('string');
// Should mention receipt, document, text, or paper
const mentionsDocument = result.toLowerCase().includes('receipt') ||
result.toLowerCase().includes('document') ||
result.toLowerCase().includes('text') ||
result.toLowerCase().includes('paper');
expect(mentionsDocument).toBeTrue();
});
tap.test('Anthropic Vision: should stop the provider', async () => {
await anthropicProvider.stop();
});
export default tap.start();


@@ -0,0 +1,36 @@
# Coffee Image Attribution
## coffee.jpg
**Photographer:** Dani (@frokz)
**Source URL:** https://unsplash.com/photos/cup-of-coffee-on-saucer-ZLqxSzvVr7I
**Direct Link:** https://images.unsplash.com/photo-1506372023823-741c83b836fe
### Metadata
- **Title:** Cup of coffee on saucer
- **Description:** One of many coffee-moments in my life ;)
- **Date Published:** September 25, 2017
- **Location:** Stockholm, Sweden
- **Tags:** coffee, cafe, heart, coffee cup, cup, barista, latte, mug, saucer, food, sweden, stockholm
### License
**Unsplash License** - Free to use
- ✅ Commercial and non-commercial use
- ✅ No permission needed
- ❌ Cannot be sold without significant modification
- ❌ Cannot be used to replicate Unsplash or similar service
Full license: https://unsplash.com/license
### Usage in This Project
This image is used for testing vision/image processing capabilities in the SmartAI library test suite, specifically for:
- Testing coffee/beverage recognition
- Latte art pattern detection (heart shape)
- Scene/environment analysis
- Multi-element image understanding (cup, saucer, table)
### Download Information
- **Downloaded:** September 28, 2025
- **Original Filename:** dani-ZLqxSzvVr7I-unsplash.jpg
- **Resolution:** High resolution (3.7 MB)
- **Format:** JPEG

Binary image added: coffee.jpg (3.7 MiB), not shown.


@@ -0,0 +1,40 @@
# Laptop Image Attribution
## laptop.jpg
**Photographer:** Nicolas Bichon (@nicol3a)
**Source URL:** https://unsplash.com/photos/a-laptop-computer-sitting-on-top-of-a-wooden-desk-ZhV4iqAXxyA
**Direct Link:** https://images.unsplash.com/photo-1704230972797-e0e3aba0fce7
### Metadata
- **Title:** A laptop computer sitting on top of a wooden desk
- **Description:** Lifestyle photo I took for my indie app Type, a macOS app to take notes without interrupting your flow. https://usetype.app.
- **Date Published:** January 2, 2024
- **Camera:** FUJIFILM, X-T20
- **Tags:** computer, laptop, mac, keyboard, computer keyboard, computer hardware, furniture, table, electronics, screen, monitor, hardware, display, tabletop, lcd screen, digital display
### Statistics
- **Views:** 183,020
- **Downloads:** 757
### License
**Unsplash License** - Free to use
- ✅ Commercial and non-commercial use
- ✅ No permission needed
- ❌ Cannot be sold without significant modification
- ❌ Cannot be used to replicate Unsplash or similar service
Full license: https://unsplash.com/license
### Usage in This Project
This image is used for testing vision/image processing capabilities in the SmartAI library test suite, specifically for:
- Testing technology/computer equipment recognition
- Workspace/office environment analysis
- Object detection (laptop, keyboard, monitor, table)
- Scene understanding and context analysis
### Download Information
- **Downloaded:** September 28, 2025
- **Original Filename:** nicolas-bichon-ZhV4iqAXxyA-unsplash.jpg
- **Resolution:** High resolution (1.8 MB)
- **Format:** JPEG

Binary image added: laptop.jpg (1.8 MiB), not shown.


@@ -0,0 +1,40 @@
# Receipt Image Attribution
## receipt.jpg
**Photographer:** Annie Spratt (@anniespratt)
**Source URL:** https://unsplash.com/photos/a-receipt-sitting-on-top-of-a-wooden-table-recgFWxDO1Y
**Direct Link:** https://images.unsplash.com/photo-1731686602391-7484df33a03c
### Metadata
- **Title:** A receipt sitting on top of a wooden table
- **Description:** Download this free HD photo of text, document, invoice, and receipt by Annie Spratt
- **Date Published:** November 15, 2024
- **Tags:** text, document, invoice, receipt, diaper
### Statistics
- **Views:** 54,593
- **Downloads:** 764
### License
**Unsplash License** - Free to use
- ✅ Commercial and non-commercial use
- ✅ No permission needed
- ❌ Cannot be sold without significant modification
- ❌ Cannot be used to replicate Unsplash or similar service
Full license: https://unsplash.com/license
### Usage in This Project
This image is used for testing vision/image processing capabilities in the SmartAI library test suite, specifically for:
- Testing text extraction and OCR capabilities
- Document recognition and classification
- Receipt/invoice analysis
- Text-heavy image understanding
- Structured data extraction from documents
### Download Information
- **Downloaded:** September 28, 2025
- **Original Filename:** annie-spratt-recgFWxDO1Y-unsplash.jpg
- **Resolution:** High resolution (3.3 MB)
- **Format:** JPEG

Binary image added: receipt.jpg (3.3 MiB), not shown.


@@ -3,6 +3,6 @@
*/
export const commitinfo = {
name: '@push.rocks/smartai',
-version: '0.6.1',
version: '0.7.5',
description: 'SmartAi is a versatile TypeScript library designed to facilitate integration and interaction with various AI models, offering functionalities for chat, audio generation, document processing, and vision tasks.'
}


@@ -50,6 +50,60 @@ export interface ResearchResponse {
metadata?: any;
}
/**
* Options for image generation
*/
export interface ImageGenerateOptions {
prompt: string;
model?: 'gpt-image-1' | 'dall-e-3' | 'dall-e-2';
quality?: 'low' | 'medium' | 'high' | 'standard' | 'hd' | 'auto';
size?: '256x256' | '512x512' | '1024x1024' | '1536x1024' | '1024x1536' | '1792x1024' | '1024x1792' | 'auto';
style?: 'vivid' | 'natural';
background?: 'transparent' | 'opaque' | 'auto';
outputFormat?: 'png' | 'jpeg' | 'webp';
outputCompression?: number; // 0-100 for webp/jpeg
moderation?: 'low' | 'auto';
n?: number; // Number of images to generate
stream?: boolean;
partialImages?: number; // 0-3 for streaming
}
/**
* Options for image editing
*/
export interface ImageEditOptions {
image: Buffer;
prompt: string;
mask?: Buffer;
model?: 'gpt-image-1' | 'dall-e-2';
quality?: 'low' | 'medium' | 'high' | 'standard' | 'auto';
size?: '256x256' | '512x512' | '1024x1024' | '1536x1024' | '1024x1536' | 'auto';
background?: 'transparent' | 'opaque' | 'auto';
outputFormat?: 'png' | 'jpeg' | 'webp';
outputCompression?: number;
n?: number;
stream?: boolean;
partialImages?: number;
}
/**
* Response format for image operations
*/
export interface ImageResponse {
images: Array<{
b64_json?: string;
url?: string;
revisedPrompt?: string;
}>;
metadata?: {
model: string;
quality?: string;
size?: string;
outputFormat?: string;
tokensUsed?: number;
};
}
/**
* Abstract base class for multi-modal AI models.
* Provides a common interface for different AI providers (OpenAI, Anthropic, Perplexity, Ollama)
@@ -131,4 +185,20 @@ export abstract class MultiModalModel {
* @throws Error if the provider doesn't support research capabilities
*/
public abstract research(optionsArg: ResearchOptions): Promise<ResearchResponse>;
/**
* Image generation from text prompts
* @param optionsArg Options containing the prompt and generation parameters
* @returns Promise resolving to the generated image(s)
* @throws Error if the provider doesn't support image generation
*/
public abstract imageGenerate(optionsArg: ImageGenerateOptions): Promise<ImageResponse>;
/**
* Image editing and inpainting
* @param optionsArg Options containing the image, prompt, and editing parameters
* @returns Promise resolving to the edited image(s)
* @throws Error if the provider doesn't support image editing
*/
public abstract imageEdit(optionsArg: ImageEditOptions): Promise<ImageResponse>;
}
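For orientation, a minimal consumer-side sketch of the new image types (not part of the diff; the helper name and output directory are illustrative, and it assumes the provider returns base64 payloads in `b64_json`):

```typescript
import { promises as fs } from 'fs';
import type { ImageResponse } from './abstract.classes.multimodal.js';

// Hypothetical helper: write every base64 image in an ImageResponse to disk.
async function saveImages(response: ImageResponse, outDir: string): Promise<string[]> {
  const written: string[] = [];
  for (const [index, image] of response.images.entries()) {
    if (!image.b64_json) continue; // URL-only results would need a download step instead
    const extension = response.metadata?.outputFormat ?? 'png';
    const filePath = `${outDir}/image-${index}.${extension}`;
    await fs.writeFile(filePath, Buffer.from(image.b64_json, 'base64'));
    written.push(filePath);
  }
  return written;
}
```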


@@ -96,6 +96,18 @@ export class Conversation {
return conversation;
}
public static async createWithElevenlabs(smartaiRefArg: SmartAi) {
if (!smartaiRefArg.elevenlabsProvider) {
throw new Error('ElevenLabs provider not available');
}
const conversation = new Conversation(smartaiRefArg, {
processFunction: async (input) => {
return '' // TODO implement proper streaming
}
});
return conversation;
}
// INSTANCE
smartaiRef: SmartAi
private systemMessage: string;


@@ -1,6 +1,7 @@
import { Conversation } from './classes.conversation.js';
import * as plugins from './plugins.js';
import { AnthropicProvider } from './provider.anthropic.js';
import { ElevenLabsProvider } from './provider.elevenlabs.js';
import { OllamaProvider } from './provider.ollama.js';
import { OpenAiProvider } from './provider.openai.js';
import { PerplexityProvider } from './provider.perplexity.js';
@@ -15,6 +16,7 @@ export interface ISmartAiOptions {
perplexityToken?: string;
groqToken?: string;
xaiToken?: string;
elevenlabsToken?: string;
exo?: {
baseUrl?: string;
apiKey?: string;
@@ -24,9 +26,13 @@ export interface ISmartAiOptions {
model?: string;
visionModel?: string;
};
elevenlabs?: {
defaultVoiceId?: string;
defaultModelId?: string;
};
}
-export type TProvider = 'openai' | 'anthropic' | 'perplexity' | 'ollama' | 'exo' | 'groq' | 'xai';
export type TProvider = 'openai' | 'anthropic' | 'perplexity' | 'ollama' | 'exo' | 'groq' | 'xai' | 'elevenlabs';
export class SmartAi {
public options: ISmartAiOptions;
@@ -38,6 +44,7 @@ export class SmartAi {
public exoProvider: ExoProvider;
public groqProvider: GroqProvider;
public xaiProvider: XAIProvider;
public elevenlabsProvider: ElevenLabsProvider;
constructor(optionsArg: ISmartAiOptions) {
this.options = optionsArg;
@@ -74,6 +81,14 @@ export class SmartAi {
});
await this.xaiProvider.start();
}
if (this.options.elevenlabsToken) {
this.elevenlabsProvider = new ElevenLabsProvider({
elevenlabsToken: this.options.elevenlabsToken,
defaultVoiceId: this.options.elevenlabs?.defaultVoiceId,
defaultModelId: this.options.elevenlabs?.defaultModelId,
});
await this.elevenlabsProvider.start();
}
if (this.options.ollama) {
this.ollamaProvider = new OllamaProvider({
baseUrl: this.options.ollama.baseUrl,
@@ -107,6 +122,9 @@ export class SmartAi {
if (this.xaiProvider) {
await this.xaiProvider.stop();
}
if (this.elevenlabsProvider) {
await this.elevenlabsProvider.stop();
}
if (this.ollamaProvider) {
await this.ollamaProvider.stop();
}
@@ -134,6 +152,8 @@ export class SmartAi {
return Conversation.createWithGroq(this);
case 'xai':
return Conversation.createWithXai(this);
case 'elevenlabs':
return Conversation.createWithElevenlabs(this);
default:
throw new Error('Provider not available');
}


@@ -7,3 +7,4 @@ export * from './provider.groq.js';
export * from './provider.ollama.js';
export * from './provider.xai.js';
export * from './provider.exo.js';
export * from './provider.elevenlabs.js';


@@ -1,7 +1,16 @@
import * as plugins from './plugins.js';
import * as paths from './paths.js';
import { MultiModalModel } from './abstract.classes.multimodal.js';
-import type { ChatOptions, ChatResponse, ChatMessage, ResearchOptions, ResearchResponse } from './abstract.classes.multimodal.js';
import type {
ChatOptions,
ChatResponse,
ChatMessage,
ResearchOptions,
ResearchResponse,
ImageGenerateOptions,
ImageEditOptions,
ImageResponse
} from './abstract.classes.multimodal.js';
import type { ImageBlockParam, TextBlockParam } from '@anthropic-ai/sdk/resources/messages';
type ContentBlock = ImageBlockParam | TextBlockParam;
@@ -68,7 +77,7 @@ export class AnthropicProvider extends MultiModalModel {
// If we have a complete message, send it to Anthropic
if (currentMessage) {
const stream = await this.anthropicApiClient.messages.create({
-model: 'claude-3-opus-20240229',
model: 'claude-sonnet-4-5-20250929',
messages: [{ role: currentMessage.role, content: currentMessage.content }],
system: '',
stream: true,
@@ -112,7 +121,7 @@ export class AnthropicProvider extends MultiModalModel {
}));
const result = await this.anthropicApiClient.messages.create({
-model: 'claude-3-opus-20240229',
model: 'claude-sonnet-4-5-20250929',
system: optionsArg.systemMessage,
messages: [
...messages,
@@ -159,7 +168,7 @@ export class AnthropicProvider extends MultiModalModel {
];
const result = await this.anthropicApiClient.messages.create({
-model: 'claude-3-opus-20240229',
model: 'claude-sonnet-4-5-20250929',
messages: [{
role: 'user',
content
@@ -211,14 +220,14 @@ export class AnthropicProvider extends MultiModalModel {
type: 'image',
source: {
type: 'base64',
-media_type: 'image/jpeg',
media_type: 'image/png',
data: Buffer.from(imageBytes).toString('base64')
}
});
}
const result = await this.anthropicApiClient.messages.create({
-model: 'claude-3-opus-20240229',
model: 'claude-sonnet-4-5-20250929',
system: optionsArg.systemMessage,
messages: [
...messages,
@@ -251,23 +260,27 @@ export class AnthropicProvider extends MultiModalModel {
try {
// Build the tool configuration for web search
-const tools = this.options.enableWebSearch ? [
-{
-type: 'web_search_20250305' as const,
-name: 'web_search',
-description: 'Search the web for current information',
-input_schema: {
-type: 'object' as const,
-properties: {
-query: {
-type: 'string',
-description: 'The search query'
-}
-},
-required: ['query']
-}
-}
-] : [];
const tools: any[] = [];
if (this.options.enableWebSearch) {
const webSearchTool: any = {
type: 'web_search_20250305',
name: 'web_search'
};
// Add optional parameters
if (optionsArg.maxSources) {
webSearchTool.max_uses = optionsArg.maxSources;
}
if (this.options.searchDomainAllowList?.length) {
webSearchTool.allowed_domains = this.options.searchDomainAllowList;
} else if (this.options.searchDomainBlockList?.length) {
webSearchTool.blocked_domains = this.options.searchDomainBlockList;
}
tools.push(webSearchTool);
}
// Configure the request based on search depth
const maxTokens = optionsArg.searchDepth === 'deep' ? 8192 :
@@ -275,7 +288,7 @@ export class AnthropicProvider extends MultiModalModel {
// Create the research request
const requestParams: any = {
-model: 'claude-3-opus-20240229',
model: 'claude-sonnet-4-5-20250929',
system: systemMessage,
messages: [
{
@@ -290,7 +303,6 @@ export class AnthropicProvider extends MultiModalModel {
// Add tools if web search is enabled
if (tools.length > 0) {
requestParams.tools = tools;
-requestParams.tool_choice = { type: 'auto' };
}
// Execute the research request
@@ -304,11 +316,47 @@ export class AnthropicProvider extends MultiModalModel {
// Process content blocks
for (const block of result.content) {
if ('text' in block) {
// Accumulate text content
answer += block.text;
// Extract citations if present
if ('citations' in block && Array.isArray(block.citations)) {
for (const citation of block.citations) {
if (citation.type === 'web_search_result_location') {
sources.push({
title: citation.title || '',
url: citation.url || '',
snippet: citation.cited_text || ''
});
}
}
}
} else if ('type' in block && block.type === 'server_tool_use') {
// Extract search queries from server tool use
if (block.name === 'web_search' && block.input && typeof block.input === 'object' && 'query' in block.input) {
searchQueries.push((block.input as any).query);
}
} else if ('type' in block && block.type === 'web_search_tool_result') {
// Extract sources from web search results
if (Array.isArray(block.content)) {
for (const result of block.content) {
if (result.type === 'web_search_result') {
// Only add if not already in sources (avoid duplicates from citations)
if (!sources.some(s => s.url === result.url)) {
sources.push({
title: result.title || '',
url: result.url || '',
snippet: '' // Search results don't include snippets, only citations do
});
}
}
}
}
}
}
-// Parse sources from the answer (Claude includes citations in various formats)
// Fallback: Parse markdown-style links if no citations found
if (sources.length === 0) {
const urlRegex = /\[([^\]]+)\]\(([^)]+)\)/g;
let match: RegExpExecArray | null;
@@ -319,39 +367,20 @@ export class AnthropicProvider extends MultiModalModel {
snippet: ''
});
}
-// Also look for plain URLs
-const plainUrlRegex = /https?:\/\/[^\s\)]+/g;
-const plainUrls = answer.match(plainUrlRegex) || [];
-for (const url of plainUrls) {
-// Check if this URL is already in sources
-if (!sources.some(s => s.url === url)) {
-sources.push({
-title: new URL(url).hostname,
-url: url,
-snippet: ''
-});
-}
-}
}
-// Extract tool use information if available
-if ('tool_use' in result && Array.isArray(result.tool_use)) {
-for (const toolUse of result.tool_use) {
-if (toolUse.name === 'web_search' && toolUse.input?.query) {
-searchQueries.push(toolUse.input.query);
-}
-}
-}
// Check if web search was used based on usage info
const webSearchCount = result.usage?.server_tool_use?.web_search_requests || 0;
return {
answer,
sources,
searchQueries: searchQueries.length > 0 ? searchQueries : undefined,
metadata: {
-model: 'claude-3-opus-20240229',
model: 'claude-sonnet-4-5-20250929',
searchDepth: optionsArg.searchDepth || 'basic',
-tokensUsed: result.usage?.output_tokens
tokensUsed: result.usage?.output_tokens,
webSearchesPerformed: webSearchCount
}
};
} catch (error) {
@@ -359,4 +388,18 @@ export class AnthropicProvider extends MultiModalModel {
throw new Error(`Failed to perform research: ${error.message}`);
}
}
/**
* Image generation is not supported by Anthropic
*/
public async imageGenerate(optionsArg: ImageGenerateOptions): Promise<ImageResponse> {
throw new Error('Image generation is not supported by Anthropic. Claude can only analyze images, not generate them. Please use OpenAI provider for image generation.');
}
/**
* Image editing is not supported by Anthropic
*/
public async imageEdit(optionsArg: ImageEditOptions): Promise<ImageResponse> {
throw new Error('Image editing is not supported by Anthropic. Claude can only analyze images, not edit them. Please use OpenAI provider for image editing.');
}
}
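A short usage sketch for the reworked web-search research path; option names mirror the fields referenced in this diff (`enableWebSearch`, `searchDomainAllowList`, `maxSources`), while the token and query are placeholders:

```typescript
import * as smartai from '@push.rocks/smartai';

// Sketch: Anthropic research with web search restricted to an allow-list of domains.
const anthropic = new smartai.AnthropicProvider({
  anthropicToken: '...your-anthropic-token...',
  enableWebSearch: true,
  searchDomainAllowList: ['wikipedia.org', 'docs.microsoft.com'],
});
await anthropic.start();

const research = await anthropic.research({
  query: 'What is TypeScript and when was it first released?',
  searchDepth: 'advanced',
  includeWebSearch: true,
  maxSources: 5, // forwarded as web_search.max_uses in the request
});

console.log(research.answer);
console.log(research.sources);                        // from citations and web_search_tool_result blocks
console.log(research.metadata?.webSearchesPerformed); // from usage.server_tool_use.web_search_requests
await anthropic.stop();
```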

ts/provider.elevenlabs.ts (new file, 117 lines)

@@ -0,0 +1,117 @@
import * as plugins from './plugins.js';
import { MultiModalModel } from './abstract.classes.multimodal.js';
import type {
ChatOptions,
ChatResponse,
ResearchOptions,
ResearchResponse,
ImageGenerateOptions,
ImageEditOptions,
ImageResponse
} from './abstract.classes.multimodal.js';
export interface IElevenLabsProviderOptions {
elevenlabsToken: string;
defaultVoiceId?: string;
defaultModelId?: string;
}
export interface IElevenLabsVoiceSettings {
stability?: number;
similarity_boost?: number;
style?: number;
use_speaker_boost?: boolean;
}
export class ElevenLabsProvider extends MultiModalModel {
private options: IElevenLabsProviderOptions;
private baseUrl: string = 'https://api.elevenlabs.io/v1';
constructor(optionsArg: IElevenLabsProviderOptions) {
super();
this.options = optionsArg;
}
public async start() {
await super.start();
}
public async stop() {
await super.stop();
}
public async chat(optionsArg: ChatOptions): Promise<ChatResponse> {
throw new Error('ElevenLabs does not support chat functionality. This provider is specialized for text-to-speech only.');
}
public async chatStream(input: ReadableStream<Uint8Array>): Promise<ReadableStream<string>> {
throw new Error('ElevenLabs does not support chat streaming functionality. This provider is specialized for text-to-speech only.');
}
public async audio(optionsArg: {
message: string;
voiceId?: string;
modelId?: string;
voiceSettings?: IElevenLabsVoiceSettings;
}): Promise<NodeJS.ReadableStream> {
const voiceId = optionsArg.voiceId || this.options.defaultVoiceId;
if (!voiceId) {
throw new Error('Voice ID is required for ElevenLabs TTS. Please provide voiceId in the method call or set defaultVoiceId in provider options.');
}
const modelId = optionsArg.modelId || this.options.defaultModelId || 'eleven_v3';
const url = `${this.baseUrl}/text-to-speech/${voiceId}`;
const requestBody: any = {
text: optionsArg.message,
model_id: modelId,
};
if (optionsArg.voiceSettings) {
requestBody.voice_settings = optionsArg.voiceSettings;
}
const response = await plugins.smartrequest.SmartRequest.create()
.url(url)
.header('xi-api-key', this.options.elevenlabsToken)
.json(requestBody)
.autoDrain(false)
.post();
if (!response.ok) {
const errorText = await response.text();
throw new Error(`ElevenLabs API error: ${response.status} ${response.statusText} - ${errorText}`);
}
const nodeStream = response.streamNode();
return nodeStream;
}
public async vision(optionsArg: { image: Buffer; prompt: string }): Promise<string> {
throw new Error('ElevenLabs does not support vision functionality. This provider is specialized for text-to-speech only.');
}
public async document(optionsArg: {
systemMessage: string;
userMessage: string;
pdfDocuments: Uint8Array[];
messageHistory: any[];
}): Promise<{ message: any }> {
throw new Error('ElevenLabs does not support document processing. This provider is specialized for text-to-speech only.');
}
public async research(optionsArg: ResearchOptions): Promise<ResearchResponse> {
throw new Error('ElevenLabs does not support research capabilities. This provider is specialized for text-to-speech only.');
}
public async imageGenerate(optionsArg: ImageGenerateOptions): Promise<ImageResponse> {
throw new Error('ElevenLabs does not support image generation. This provider is specialized for text-to-speech only.');
}
public async imageEdit(optionsArg: ImageEditOptions): Promise<ImageResponse> {
throw new Error('ElevenLabs does not support image editing. This provider is specialized for text-to-speech only.');
}
}
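And a minimal sketch of calling the new provider directly; the voice ID and token are placeholders, and `eleven_v3` is the default model this release configures:

```typescript
import { createWriteStream } from 'fs';
import * as smartai from '@push.rocks/smartai';

// Sketch: synthesize speech and pipe the returned Node stream into an mp3 file.
const tts = new smartai.ElevenLabsProvider({
  elevenlabsToken: '...your-elevenlabs-token...',
  defaultVoiceId: 'your-voice-id', // placeholder; required unless passed per call
});
await tts.start();

const audioStream = await tts.audio({
  message: 'Hello from the SmartAI ElevenLabs provider.',
  voiceSettings: { stability: 0.5, similarity_boost: 0.75 },
});

audioStream.pipe(createWriteStream('./hello.mp3'));
```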


@@ -1,7 +1,16 @@
import * as plugins from './plugins.js';
import * as paths from './paths.js';
import { MultiModalModel } from './abstract.classes.multimodal.js';
-import type { ChatOptions, ChatResponse, ChatMessage, ResearchOptions, ResearchResponse } from './abstract.classes.multimodal.js';
import type {
ChatOptions,
ChatResponse,
ChatMessage,
ResearchOptions,
ResearchResponse,
ImageGenerateOptions,
ImageEditOptions,
ImageResponse
} from './abstract.classes.multimodal.js';
import type { ChatCompletionMessageParam } from 'openai/resources/chat/completions';
export interface IExoProviderOptions {
@@ -129,4 +138,18 @@ export class ExoProvider extends MultiModalModel {
public async research(optionsArg: ResearchOptions): Promise<ResearchResponse> {
throw new Error('Research capabilities are not yet supported by Exo provider.');
}
/**
* Image generation is not supported by Exo
*/
public async imageGenerate(optionsArg: ImageGenerateOptions): Promise<ImageResponse> {
throw new Error('Image generation is not supported by Exo. Please use OpenAI provider for image generation.');
}
/**
* Image editing is not supported by Exo
*/
public async imageEdit(optionsArg: ImageEditOptions): Promise<ImageResponse> {
throw new Error('Image editing is not supported by Exo. Please use OpenAI provider for image editing.');
}
}


@@ -1,7 +1,16 @@
import * as plugins from './plugins.js';
import * as paths from './paths.js';
import { MultiModalModel } from './abstract.classes.multimodal.js';
-import type { ChatOptions, ChatResponse, ChatMessage, ResearchOptions, ResearchResponse } from './abstract.classes.multimodal.js';
import type {
ChatOptions,
ChatResponse,
ChatMessage,
ResearchOptions,
ResearchResponse,
ImageGenerateOptions,
ImageEditOptions,
ImageResponse
} from './abstract.classes.multimodal.js';
export interface IGroqProviderOptions {
groqToken: string;
@@ -193,4 +202,18 @@ export class GroqProvider extends MultiModalModel {
public async research(optionsArg: ResearchOptions): Promise<ResearchResponse> {
throw new Error('Research capabilities are not yet supported by Groq provider.');
}
/**
* Image generation is not supported by Groq
*/
public async imageGenerate(optionsArg: ImageGenerateOptions): Promise<ImageResponse> {
throw new Error('Image generation is not supported by Groq. Please use OpenAI provider for image generation.');
}
/**
* Image editing is not supported by Groq
*/
public async imageEdit(optionsArg: ImageEditOptions): Promise<ImageResponse> {
throw new Error('Image editing is not supported by Groq. Please use OpenAI provider for image editing.');
}
}


@@ -1,7 +1,16 @@
import * as plugins from './plugins.js';
import * as paths from './paths.js';
import { MultiModalModel } from './abstract.classes.multimodal.js';
-import type { ChatOptions, ChatResponse, ChatMessage, ResearchOptions, ResearchResponse } from './abstract.classes.multimodal.js';
import type {
ChatOptions,
ChatResponse,
ChatMessage,
ResearchOptions,
ResearchResponse,
ImageGenerateOptions,
ImageEditOptions,
ImageResponse
} from './abstract.classes.multimodal.js';
export interface IOllamaProviderOptions {
baseUrl?: string;
@@ -255,4 +264,18 @@ export class OllamaProvider extends MultiModalModel {
public async research(optionsArg: ResearchOptions): Promise<ResearchResponse> {
throw new Error('Research capabilities are not yet supported by Ollama provider.');
}
/**
* Image generation is not supported by Ollama
*/
public async imageGenerate(optionsArg: ImageGenerateOptions): Promise<ImageResponse> {
throw new Error('Image generation is not supported by Ollama. Please use OpenAI provider for image generation.');
}
/**
* Image editing is not supported by Ollama
*/
public async imageEdit(optionsArg: ImageEditOptions): Promise<ImageResponse> {
throw new Error('Image editing is not supported by Ollama. Please use OpenAI provider for image editing.');
}
}


@@ -9,7 +9,13 @@ export type TChatCompletionRequestMessage = {
};
import { MultiModalModel } from './abstract.classes.multimodal.js';
-import type { ResearchOptions, ResearchResponse } from './abstract.classes.multimodal.js';
import type {
ResearchOptions,
ResearchResponse,
ImageGenerateOptions,
ImageEditOptions,
ImageResponse
} from './abstract.classes.multimodal.js';
export interface IOpenaiProviderOptions {
openaiToken: string;
@@ -17,6 +23,7 @@ export interface IOpenaiProviderOptions {
audioModel?: string;
visionModel?: string;
researchModel?: string;
imageModel?: string;
enableWebSearch?: boolean;
}
@@ -233,52 +240,37 @@ export class OpenAiProvider extends MultiModalModel {
}
public async research(optionsArg: ResearchOptions): Promise<ResearchResponse> {
-// Determine which model to use based on search depth
// Determine which model to use - Deep Research API requires specific models
let model: string;
if (optionsArg.searchDepth === 'deep') {
model = this.options.researchModel || 'o4-mini-deep-research-2025-06-26';
} else {
-// For basic/advanced, still use deep research models if web search is needed
-if (optionsArg.includeWebSearch) {
-model = this.options.researchModel || 'o4-mini-deep-research-2025-06-26';
-} else {
model = this.options.chatModel || 'gpt-5-mini';
}
-}
-// Prepare the request parameters
const systemMessage = 'You are a research assistant. Provide comprehensive answers with citations and sources when available.';
// Prepare request parameters using Deep Research API format
const requestParams: any = {
model,
-messages: [
-{
-role: 'system',
-content: 'You are a research assistant. Provide comprehensive answers with citations and sources when available.'
-},
-{
-role: 'user',
-content: optionsArg.query
-}
-],
-temperature: 0.7
instructions: systemMessage,
input: optionsArg.query
};
-// Add web search tools if requested
// Add web search tool if requested
if (optionsArg.includeWebSearch || optionsArg.searchDepth === 'deep') {
requestParams.tools = [
{
-type: 'function',
-function: {
-name: 'web_search',
-description: 'Search the web for information',
-parameters: {
-type: 'object',
-properties: {
-query: {
-type: 'string',
-description: 'The search query'
-}
-},
-required: ['query']
-}
-}
type: 'web_search_preview',
search_context_size: optionsArg.searchDepth === 'deep' ? 'high' :
optionsArg.searchDepth === 'advanced' ? 'medium' : 'low'
}
];
-requestParams.tool_choice = 'auto';
}
// Add background flag for deep research
@@ -287,14 +279,36 @@ export class OpenAiProvider extends MultiModalModel {
}
try {
-// Execute the research request
// Execute the research request using Deep Research API
-const result = await this.openAiApiClient.chat.completions.create(requestParams);
const result = await this.openAiApiClient.responses.create(requestParams);
-// Extract the answer
// Extract the answer from output items
-const answer = result.choices[0].message.content || '';
let answer = '';
-// Parse sources from the response (OpenAI often includes URLs in markdown format)
const sources: Array<{ url: string; title: string; snippet: string }> = [];
const searchQueries: string[] = [];
// Process output items
for (const item of result.output || []) {
// Extract message content
if (item.type === 'message' && 'content' in item) {
const messageItem = item as any;
for (const contentItem of messageItem.content || []) {
if (contentItem.type === 'output_text' && 'text' in contentItem) {
answer += contentItem.text;
}
}
}
// Extract web search queries
if (item.type === 'web_search_call' && 'action' in item) {
const searchItem = item as any;
if (searchItem.action && searchItem.action.type === 'search' && 'query' in searchItem.action) {
searchQueries.push(searchItem.action.query);
}
}
}
// Parse sources from markdown links in the answer
const urlRegex = /\[([^\]]+)\]\(([^)]+)\)/g;
let match: RegExpExecArray | null;
@@ -302,27 +316,10 @@ export class OpenAiProvider extends MultiModalModel {
sources.push({
title: match[1],
url: match[2],
-snippet: '' // OpenAI doesn't provide snippets in standard responses
snippet: ''
});
}
-// Extract search queries if tools were used
-const searchQueries: string[] = [];
-if (result.choices[0].message.tool_calls) {
-for (const toolCall of result.choices[0].message.tool_calls) {
-if ('function' in toolCall && toolCall.function.name === 'web_search') {
-try {
-const args = JSON.parse(toolCall.function.arguments);
-if (args.query) {
-searchQueries.push(args.query);
-}
-} catch (e) {
-// Ignore parsing errors
-}
-}
-}
-}
return {
answer,
sources,
@@ -338,4 +335,121 @@ export class OpenAiProvider extends MultiModalModel {
throw new Error(`Failed to perform research: ${error.message}`);
}
}
/**
* Image generation using OpenAI's gpt-image-1 or DALL-E models
*/
public async imageGenerate(optionsArg: ImageGenerateOptions): Promise<ImageResponse> {
const model = optionsArg.model || this.options.imageModel || 'gpt-image-1';
try {
const requestParams: any = {
model,
prompt: optionsArg.prompt,
n: optionsArg.n || 1,
};
// Add gpt-image-1 specific parameters
if (model === 'gpt-image-1') {
if (optionsArg.quality) requestParams.quality = optionsArg.quality;
if (optionsArg.size) requestParams.size = optionsArg.size;
if (optionsArg.background) requestParams.background = optionsArg.background;
if (optionsArg.outputFormat) requestParams.output_format = optionsArg.outputFormat;
if (optionsArg.outputCompression !== undefined) requestParams.output_compression = optionsArg.outputCompression;
if (optionsArg.moderation) requestParams.moderation = optionsArg.moderation;
if (optionsArg.stream !== undefined) requestParams.stream = optionsArg.stream;
if (optionsArg.partialImages !== undefined) requestParams.partial_images = optionsArg.partialImages;
} else if (model === 'dall-e-3') {
// DALL-E 3 specific parameters
if (optionsArg.quality) requestParams.quality = optionsArg.quality;
if (optionsArg.size) requestParams.size = optionsArg.size;
if (optionsArg.style) requestParams.style = optionsArg.style;
requestParams.response_format = 'b64_json'; // Always use base64 for consistency
} else if (model === 'dall-e-2') {
// DALL-E 2 specific parameters
if (optionsArg.size) requestParams.size = optionsArg.size;
requestParams.response_format = 'b64_json';
}
const result = await this.openAiApiClient.images.generate(requestParams);
const images = (result.data || []).map(img => ({
b64_json: img.b64_json,
url: img.url,
revisedPrompt: img.revised_prompt
}));
return {
images,
metadata: {
model,
quality: result.quality,
size: result.size,
outputFormat: result.output_format,
tokensUsed: result.usage?.total_tokens
}
};
} catch (error) {
console.error('Image generation error:', error);
throw new Error(`Failed to generate image: ${error.message}`);
}
}
/**
* Image editing using OpenAI's gpt-image-1 or DALL-E 2 models
*/
public async imageEdit(optionsArg: ImageEditOptions): Promise<ImageResponse> {
const model = optionsArg.model || this.options.imageModel || 'gpt-image-1';
try {
const requestParams: any = {
model,
image: optionsArg.image,
prompt: optionsArg.prompt,
n: optionsArg.n || 1,
};
// Add mask if provided
if (optionsArg.mask) {
requestParams.mask = optionsArg.mask;
}
// Add gpt-image-1 specific parameters
if (model === 'gpt-image-1') {
if (optionsArg.quality) requestParams.quality = optionsArg.quality;
if (optionsArg.size) requestParams.size = optionsArg.size;
if (optionsArg.background) requestParams.background = optionsArg.background;
if (optionsArg.outputFormat) requestParams.output_format = optionsArg.outputFormat;
if (optionsArg.outputCompression !== undefined) requestParams.output_compression = optionsArg.outputCompression;
if (optionsArg.stream !== undefined) requestParams.stream = optionsArg.stream;
if (optionsArg.partialImages !== undefined) requestParams.partial_images = optionsArg.partialImages;
} else if (model === 'dall-e-2') {
// DALL-E 2 specific parameters
if (optionsArg.size) requestParams.size = optionsArg.size;
requestParams.response_format = 'b64_json';
}
const result = await this.openAiApiClient.images.edit(requestParams);
const images = (result.data || []).map(img => ({
b64_json: img.b64_json,
url: img.url,
revisedPrompt: img.revised_prompt
}));
return {
images,
metadata: {
model,
quality: result.quality,
size: result.size,
outputFormat: result.output_format,
tokensUsed: result.usage?.total_tokens
}
};
} catch (error) {
console.error('Image edit error:', error);
throw new Error(`Failed to edit image: ${error.message}`);
}
}
}
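A usage sketch for the new OpenAI image endpoints; the token, prompts, and file names are placeholders:

```typescript
import { promises as fs } from 'fs';
import * as smartai from '@push.rocks/smartai';

// Sketch: generate an image with gpt-image-1, write the base64 payload to disk,
// then run an edit pass over the result.
const openai = new smartai.OpenAiProvider({ openaiToken: 'sk-...your-openai-token...' });
await openai.start();

const generated = await openai.imageGenerate({
  prompt: 'A watercolor fox reading a book under a desk lamp',
  model: 'gpt-image-1',
  size: '1024x1024',
  quality: 'high',
  outputFormat: 'png',
});
if (generated.images[0]?.b64_json) {
  await fs.writeFile('./fox.png', Buffer.from(generated.images[0].b64_json, 'base64'));
}

const edited = await openai.imageEdit({
  image: await fs.readFile('./fox.png'),
  prompt: 'Add a steaming cup of tea next to the fox',
  model: 'gpt-image-1',
});
console.log(edited.metadata);
```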


@@ -1,7 +1,16 @@
import * as plugins from './plugins.js';
import * as paths from './paths.js';
import { MultiModalModel } from './abstract.classes.multimodal.js';
-import type { ChatOptions, ChatResponse, ChatMessage, ResearchOptions, ResearchResponse } from './abstract.classes.multimodal.js';
import type {
ChatOptions,
ChatResponse,
ChatMessage,
ResearchOptions,
ResearchResponse,
ImageGenerateOptions,
ImageEditOptions,
ImageResponse
} from './abstract.classes.multimodal.js';
export interface IPerplexityProviderOptions {
perplexityToken: string;
@@ -233,4 +242,18 @@ export class PerplexityProvider extends MultiModalModel {
throw new Error(`Failed to perform research: ${error.message}`);
}
}
/**
* Image generation is not supported by Perplexity
*/
public async imageGenerate(optionsArg: ImageGenerateOptions): Promise<ImageResponse> {
throw new Error('Image generation is not supported by Perplexity. Please use OpenAI provider for image generation.');
}
/**
* Image editing is not supported by Perplexity
*/
public async imageEdit(optionsArg: ImageEditOptions): Promise<ImageResponse> {
throw new Error('Image editing is not supported by Perplexity. Please use OpenAI provider for image editing.');
}
}


@@ -1,7 +1,16 @@
import * as plugins from './plugins.js';
import * as paths from './paths.js';
import { MultiModalModel } from './abstract.classes.multimodal.js';
-import type { ChatOptions, ChatResponse, ChatMessage, ResearchOptions, ResearchResponse } from './abstract.classes.multimodal.js';
import type {
ChatOptions,
ChatResponse,
ChatMessage,
ResearchOptions,
ResearchResponse,
ImageGenerateOptions,
ImageEditOptions,
ImageResponse
} from './abstract.classes.multimodal.js';
import type { ChatCompletionMessageParam } from 'openai/resources/chat/completions';
export interface IXAIProviderOptions {
@@ -185,4 +194,18 @@ export class XAIProvider extends MultiModalModel {
public async research(optionsArg: ResearchOptions): Promise<ResearchResponse> {
throw new Error('Research capabilities are not yet supported by xAI provider.');
}
/**
* Image generation is not supported by xAI
*/
public async imageGenerate(optionsArg: ImageGenerateOptions): Promise<ImageResponse> {
throw new Error('Image generation is not supported by xAI. Please use OpenAI provider for image generation.');
}
/**
* Image editing is not supported by xAI
*/
public async imageEdit(optionsArg: ImageEditOptions): Promise<ImageResponse> {
throw new Error('Image editing is not supported by xAI. Please use OpenAI provider for image editing.');
}
}