fix(docs): update documentation: clarify provider capabilities, add provider capabilities summary, polish examples and formatting, and remove Serena project config

2026-01-20 01:27:52 +00:00
parent ae8d3ccf33
commit 2040b3c629
6 changed files with 169 additions and 408 deletions

readme.md

[![TypeScript](https://img.shields.io/badge/TypeScript-5.x-blue.svg)](https://www.typescriptlang.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
SmartAI unifies the world's leading AI providers (OpenAI, Anthropic, Mistral, Perplexity, Ollama, Groq, XAI, Exo, and ElevenLabs) under a single, elegant TypeScript interface. Build AI applications at lightning speed without vendor lock-in.
## Issue Reporting and Security
For reporting bugs, issues, or security vulnerabilities, please visit [community
## 🎯 Why SmartAI?
- **🔌 Universal Interface**: Write once, run with any AI provider. Switch between GPT-5, Claude, Llama, or Grok with a single line change.
- **🛡️ Type-Safe**: Full TypeScript support with comprehensive type definitions for all operations.
- **🌊 Streaming First**: Built for real-time applications with native streaming support.
- **🎨 Multi-Modal**: Seamlessly work with text, images, audio, and documents.
- **🏠 Local & Cloud**: Support for both cloud providers and local models via Ollama/Exo.
- **⚡ Zero Lock-In**: Your code remains portable across all AI providers.
## 📦 Installation
```bash
npm install @push.rocks/smartai
# or
pnpm install @push.rocks/smartai
```
## 🚀 Quick Start
```typescript
import { SmartAi } from '@push.rocks/smartai';
const ai = new SmartAi({
  openaiToken: 'sk-...',
});

await ai.start();

const response = await ai.openaiProvider.chat({
  systemMessage: 'You are a helpful assistant.',
  userMessage: 'Explain quantum computing in simple terms',
  messageHistory: [],
});

console.log(response.message);
```
## 📊 Provider Capabilities Matrix
Choose the right provider for your use case:
| Provider | Chat | Streaming | TTS | Vision | Documents | Research | Images | Highlights |
| -------------- | :--: | :-------: | :-: | :----: | :-------: | :------: | :----: | --------------------------------------------------------------- |
| **OpenAI** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | gpt-image-1 • DALL-E 3 • Deep Research API |
| **Anthropic** | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | Claude Sonnet 4.5 • Extended Thinking • Web Search API |
| **Mistral** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | Native PDF OCR • mistral-large • Fast inference |
| **ElevenLabs** | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | Premium TTS • 70+ languages • v3 model |
| **Ollama** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | 100% local • Privacy-first • No API costs |
| **XAI** | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | Grok 2 • Real-time data |
| **Perplexity** | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | Web-aware • Research-focused • Sonar Pro |
| **Groq** | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 10x faster • LPU inference • Llama 3.3 |
| **Exo** | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | Distributed • P2P compute • Decentralized |
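For programmatic routing, the matrix can also be expressed as data. A minimal sketch (the capability flags below simply restate the table; they are not part of the SmartAI API):

```typescript
type Capability = 'chat' | 'streaming' | 'tts' | 'vision' | 'documents' | 'research' | 'images';

// Capability matrix as data, mirroring the table above
const capabilities: Record<string, Capability[]> = {
  openai: ['chat', 'streaming', 'tts', 'vision', 'documents', 'research', 'images'],
  anthropic: ['chat', 'streaming', 'vision', 'documents', 'research'],
  mistral: ['chat', 'streaming', 'vision', 'documents'],
  elevenlabs: ['tts'],
  ollama: ['chat', 'streaming', 'vision', 'documents'],
  xai: ['chat', 'streaming', 'documents'],
  perplexity: ['chat', 'streaming', 'research'],
  groq: ['chat', 'streaming'],
  exo: ['chat', 'streaming'],
};

// Return every provider that supports all requested capabilities
function providersFor(...required: Capability[]): string[] {
  return Object.keys(capabilities).filter((p) =>
    required.every((c) => capabilities[p].includes(c))
  );
}

console.log(providersFor('tts')); // → [ 'openai', 'elevenlabs' ]
```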
## 🎮 Core Features
Works identically across all providers:
```typescript
// Use GPT-5 for complex reasoning
const gptResponse = await ai.openaiProvider.chat({
  systemMessage: 'You are an expert physicist.',
  userMessage: 'Explain the implications of quantum entanglement',
  messageHistory: [],
});
```

```typescript
// OpenAI text-to-speech
const audioStream = await ai.openaiProvider.audio({
  message: 'Hello from SmartAI!', // illustrative message
});

// ElevenLabs premium TTS
const elevenLabsAudio = await ai.elevenlabsProvider.audio({
  message: 'Experience the most lifelike text to speech technology.',
  voiceId: '19STyYD15bswVz51nqLf', // Optional: Samara voice
  modelId: 'eleven_v3', // Optional: defaults to eleven_v3 (70+ languages)
  voiceSettings: {
    // Optional: fine-tune voice characteristics
    stability: 0.5, // 0-1: Speech consistency
    similarity_boost: 0.8, // 0-1: Voice similarity to original
    style: 0.0, // 0-1: Expressiveness
    use_speaker_boost: true, // Enhanced clarity
  },
});

// Stream directly to speakers or save to file
audioStream.pipe(fs.createWriteStream('welcome.mp3'));
```
```typescript
const image = fs.readFileSync('photo.jpg');

// OpenAI: General-purpose vision
const gptVision = await ai.openaiProvider.vision({
  image,
  prompt: 'Describe this product and suggest marketing angles',
});

// Anthropic: Detailed analysis with extended thinking
const claudeVision = await ai.anthropicProvider.vision({
  image,
  prompt: 'Identify any safety concerns or defects',
});
```
Extract insights from PDFs with AI:
```typescript
const contract = fs.readFileSync('contract.pdf');
const invoice = fs.readFileSync('invoice.pdf');

// Analyze documents with OpenAI
const analysis = await ai.openaiProvider.document({
  systemMessage: 'You are a legal expert.',
  userMessage: 'Compare these documents and highlight key differences',
  pdfDocuments: [contract, invoice],
});

// Multi-document analysis with Anthropic
const taxDocs = [form1099, w2, receipts];
const taxAnalysis = await ai.anthropicProvider.document({
  systemMessage: 'You are a tax advisor.',
  /* ... */
});
```
```typescript
console.log(deepResearch.answer);
console.log('Sources:', deepResearch.sources);

// Anthropic Web Search - Domain-filtered research
import { AnthropicProvider } from '@push.rocks/smartai';

const anthropic = new AnthropicProvider({
  anthropicToken: 'sk-ant-...',
  enableWebSearch: true,
});

// Perplexity: Sonar-based research
const perplexityResearch = await ai.perplexityProvider.research({
  /* ... */
});
```
**Research Options:**
- `searchDepth`: `'basic'` | `'advanced'` | `'deep'`
- `maxSources`: Number of sources to include
- `includeWebSearch`: Enable web search (OpenAI)
- `background`: Run as background task (OpenAI)
**Supported Providers:**
- **OpenAI**: Deep Research API with specialized models (`o3-deep-research-*`, `o4-mini-deep-research-*`)
- **Anthropic**: Web Search API with domain filtering
- **Perplexity**: Sonar and Sonar Pro models with built-in citations
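The OpenAI deep-research model names follow a recognizable prefix pattern, so a small guard can distinguish them from ordinary chat models (the regex is illustrative, derived from the names listed above):

```typescript
// Matches OpenAI deep-research model identifiers such as
// 'o3-deep-research-2025-06-26' or 'o4-mini-deep-research-2025-06-26'
function isDeepResearchModel(model: string): boolean {
  return /^o\d(-mini)?-deep-research-/.test(model);
}

console.log(isDeepResearchModel('o3-deep-research-2025-06-26')); // → true
console.log(isDeepResearchModel('gpt-image-1')); // → false
```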
```typescript
const response = await anthropic.chat({
  /* ... */
});
```
**Thinking Modes:**
| Mode       | Budget Tokens | Use Case                                        |
| ---------- | ------------- | ----------------------------------------------- |
| `'quick'`  | 2,048         | Lightweight reasoning for simple queries        |
| `'normal'` | 8,000         | **Default**: Balanced reasoning for most tasks  |
| `'deep'`   | 16,000        | Complex reasoning for difficult problems        |
| `'off'`    | 0             | Disable extended thinking                       |
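The budgets above can be encoded directly as a lookup. A minimal sketch (the mode names and token counts come from the table; the complexity heuristic is invented for illustration):

```typescript
type ThinkingMode = 'quick' | 'normal' | 'deep' | 'off';

// Budget tokens per mode, as listed in the table above
const thinkingBudget: Record<ThinkingMode, number> = {
  quick: 2_048,
  normal: 8_000,
  deep: 16_000,
  off: 0,
};

// Pick a mode from a rough complexity score in [0, 1]; thresholds are illustrative
function modeFor(complexity: number): ThinkingMode {
  if (complexity > 0.8) return 'deep';
  if (complexity > 0.3) return 'normal';
  return 'quick';
}

console.log(thinkingBudget[modeFor(0.9)]); // → 16000
```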
**Best Practices:**
### 📑 Native PDF OCR (Mistral)
Mistral provides native PDF document processing via their OCR API, with no image conversion required:
```typescript
import { MistralProvider } from '@push.rocks/smartai';

const mistral = new MistralProvider({
  mistralToken: 'your-api-key',
  chatModel: 'mistral-large-latest', // Default
  ocrModel: 'mistral-ocr-latest', // Default
  tableFormat: 'markdown', // 'markdown' | 'html'
});

await mistral.start();
const result = await mistral.document({
  /* ... */
});
```
**Key Advantage**: Unlike other providers that convert PDFs to images first, Mistral's OCR API processes PDFs natively, potentially offering faster and more accurate text extraction for document-heavy workloads.
**Supported Formats:**
- Native PDF processing via Files API
- Image OCR (JPEG, PNG, GIF, WebP) for vision tasks
- Table extraction with markdown or HTML output
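Routing a file to the right Mistral pipeline follows directly from this list; a sketch (the format lists mirror the bullets above, while the router function itself is illustrative, not part of SmartAI):

```typescript
// Route a file to native PDF processing or image OCR based on its extension
function mistralRoute(filename: string): 'pdf' | 'image' | 'unsupported' {
  if (/\.pdf$/i.test(filename)) return 'pdf'; // native PDF processing via Files API
  if (/\.(jpe?g|png|gif|webp)$/i.test(filename)) return 'image'; // image OCR for vision tasks
  return 'unsupported';
}

console.log(mistralRoute('contract.pdf')); // → pdf
console.log(mistralRoute('scan.webp')); // → image
```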
```typescript
const editedImage = await ai.openaiProvider.imageEdit({
  /* ... */
});
```
**Image Generation Options:**
- `model`: `'gpt-image-1'` | `'dall-e-3'` | `'dall-e-2'`
- `quality`: `'low'` | `'medium'` | `'high'` | `'auto'`
- `size`: Multiple aspect ratios up to 4096×4096
- `background`: `'transparent'` | `'opaque'` | `'auto'`
- `outputFormat`: `'png'` | `'jpeg'` | `'webp'`
- `outputCompression`: 0-100 for webp/jpeg
- `moderation`: `'low'` | `'auto'`
- `n`: Number of images (1-10)
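Since several of these options are numeric ranges, it can help to fill defaults and range-check values before calling the API. A sketch (the option names and ranges mirror the list above; the helper itself is not part of SmartAI):

```typescript
interface ImageGenOptions {
  model: 'gpt-image-1' | 'dall-e-3' | 'dall-e-2';
  quality: 'low' | 'medium' | 'high' | 'auto';
  outputFormat: 'png' | 'jpeg' | 'webp';
  outputCompression?: number; // 0-100, webp/jpeg only
  n: number; // 1-10
}

// Fill defaults and validate numeric ranges before hitting the API
function withDefaults(overrides: Partial<ImageGenOptions>): ImageGenOptions {
  const opts: ImageGenOptions = {
    model: 'gpt-image-1',
    quality: 'auto',
    outputFormat: 'png',
    n: 1,
    ...overrides,
  };
  if (opts.n < 1 || opts.n > 10) throw new Error('n must be 1-10');
  if (
    opts.outputCompression !== undefined &&
    (opts.outputCompression < 0 || opts.outputCompression > 100)
  ) {
    throw new Error('outputCompression must be 0-100');
  }
  return opts;
}

console.log(withDefaults({ n: 4 }).model); // → gpt-image-1
```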
**gpt-image-1 Advantages:**
```typescript
const supportBot = new SmartAi({
  anthropicToken: process.env.ANTHROPIC_KEY, // Claude for empathetic responses
});

async function handleCustomerQuery(query: string, history: ChatMessage[]) {
  try {
    const response = await supportBot.anthropicProvider.chat({
      systemMessage: `You are a helpful customer support agent.
        Be empathetic, professional, and solution-oriented.`,
      userMessage: query,
      messageHistory: history,
    });
    return response.message;
  } catch (error) {
    // Fallback to another provider if needed
    return await supportBot.openaiProvider.chat({ /* ... */ });
  }
}
```
```typescript
const codeReviewer = new SmartAi({
  groqToken: 'gsk_...',
});
async function reviewCode(code: string, language: string) {
  const review = await codeReviewer.groqProvider.chat({
    systemMessage: `You are a ${language} expert. Review code for:
      - Security vulnerabilities
      - Performance issues
      - Best practices
      - Potential bugs`,
    userMessage: `Review this code:\n\n${code}`,
    messageHistory: [],
  });

  return review.message;
}
```
```typescript
const researcher = new SmartAi({
  perplexityToken: 'pplx-...',
});
async function research(topic: string) {
  // Perplexity excels at web-aware research
  const findings = await researcher.perplexityProvider.research({
    query: `Research the latest developments in ${topic}`,
    searchDepth: 'deep',
  });

  return {
    answer: findings.answer,
    sources: findings.sources,
  };
}
```
```typescript
class SmartAIRouter {
  constructor(private ai: SmartAi) {}

  async query(
    message: string,
    requirements: {
      speed?: boolean;
      accuracy?: boolean;
      cost?: boolean;
      privacy?: boolean;
    }
  ) {
    if (requirements.privacy) {
      return this.ai.ollamaProvider.chat({ /* ... */ }); // Local only
    }
    if (requirements.speed) {
      return this.ai.groqProvider.chat({ /* ... */ }); // 10x faster
    }
    if (requirements.accuracy) {
      return this.ai.anthropicProvider.chat({ /* ... */ }); // Best reasoning
    }
    // Default fallback
    return this.ai.openaiProvider.chat({ /* ... */ });
  }
}
```
```typescript
// Don't wait for the entire response
async function streamResponse(userQuery: string) {
  const stream = await ai.openaiProvider.chatStream(
    createInputStream(userQuery)
  );
  // Process tokens as they arrive
  /* ... */
}
```
```typescript
// Get the best answer from multiple AIs
async function consensusQuery(question: string) {
  const providers = [
    ai.openaiProvider.chat({ /* ... */ }),
    ai.anthropicProvider.chat({ /* ... */ }),
    ai.perplexityProvider.chat({ /* ... */ }),
  ];

  const responses = await Promise.all(providers);
  /* ... */
}
```
## 🛠️ Advanced Configuration
### Provider-Specific Options
```typescript
const ai = new SmartAi({
  // OpenAI
  openaiToken: 'sk-...',

  // Anthropic with extended thinking
  anthropicToken: 'sk-ant-...',

  // Perplexity for research
  perplexityToken: 'pplx-...',

  // Groq for speed
  groqToken: 'gsk_...',

  // Mistral with OCR settings
  mistralToken: 'your-key',
  mistral: {
    chatModel: 'mistral-large-latest',
    ocrModel: 'mistral-ocr-latest',
    tableFormat: 'markdown',
  },

  // XAI (Grok)
  xaiToken: 'xai-...',

  // ElevenLabs TTS
  elevenlabsToken: 'sk-...',
  elevenlabs: {
    defaultVoiceId: '19STyYD15bswVz51nqLf',
    defaultModelId: 'eleven_v3',
  },

  // Ollama (local)
  ollama: {
    baseUrl: 'http://localhost:11434',
    model: 'llama2',
    visionModel: 'llava',
    defaultOptions: {
      num_ctx: 4096,
      temperature: 0.7,
      top_p: 0.9,
    },
    defaultTimeout: 120000,
  },

  // Exo (distributed)
  exo: {
    baseUrl: 'http://localhost:8080/v1',
    apiKey: 'optional-key',
  },
});
```
### Error Handling & Fallbacks
```typescript
class ResilientAI {
  /* ... */
}
```
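The core of such a wrapper can be sketched without depending on provider internals: try each provider in order and return the first success. `ChatFn` and the stubs below are illustrative, not part of the SmartAI API:

```typescript
type ChatFn = (userMessage: string) => Promise<string>;

// Try each provider in order, returning the first successful answer
async function chatWithFallback(providers: ChatFn[], userMessage: string): Promise<string> {
  let lastError: unknown;
  for (const chat of providers) {
    try {
      return await chat(userMessage);
    } catch (err) {
      lastError = err; // remember the failure and move on to the next provider
    }
  }
  throw lastError;
}

// Usage with stubs; a real entry would be e.g. (m) => ai.groqProvider.chat({ /* ... */ })
const flaky: ChatFn = async () => { throw new Error('rate limited'); };
const stable: ChatFn = async (m) => `echo: ${m}`;
chatWithFallback([flaky, stable], 'hi').then(console.log); // → echo: hi
```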
## 🎯 Choosing the Right Provider
| Use Case | Recommended Provider | Why |
| --------------------- | -------------------- | --------------------------------------------------------- |
| **General Purpose** | OpenAI | Most features, stable, well-documented |
| **Complex Reasoning** | Anthropic | Superior logical thinking, extended thinking, safer outputs |
| **Document OCR** | Mistral | Native PDF processing, no image conversion overhead |
| **Research & Facts** | Perplexity | Web-aware, provides citations |
| **Deep Research** | OpenAI | Deep Research API with comprehensive analysis |
| **Premium TTS** | ElevenLabs | Most natural voices, 70+ languages, v3 model |
| **Speed Critical** | Groq | 10x faster inference, sub-second responses |
| **Privacy Critical** | Ollama | 100% local, no data leaves your servers |
| **Real-time Data** | XAI | Grok with access to current information |
| **Cost Sensitive** | Ollama/Exo | Free (local) or distributed compute |
## 📈 Roadmap
- [x] Research & Web Search API
- [x] Image generation support (gpt-image-1, DALL-E 3, DALL-E 2)
- [x] Extended thinking (Anthropic)
- [x] Native PDF OCR (Mistral)
- [ ] Streaming function calls
- [ ] Voice input processing
- [ ] Fine-tuning integration