BREAKING CHANGE(vercel-ai-sdk): migrate to Vercel AI SDK v6 and introduce provider registry (getModel) returning LanguageModelV3

2026-03-05 19:37:29 +00:00
parent 27cef60900
commit c24010c9bc
61 changed files with 4789 additions and 9083 deletions


@@ -1,104 +1,50 @@
# SmartAI Project Hints
## Architecture (v1.0.0 - Vercel AI SDK rewrite)
The package is a **provider registry** built on the Vercel AI SDK (`ai` v6). The core export returns a `LanguageModelV3` from `@ai-sdk/provider`. Specialized capabilities are in subpath exports.
### Core Entry (`ts/`)
- `getModel(options)` → returns `LanguageModelV3` for any supported provider
- Providers: anthropic, openai, google, groq, mistral, xai, perplexity, ollama
- Anthropic prompt caching via `wrapLanguageModel` middleware (enabled by default)
- Custom Ollama provider implementing `LanguageModelV3` directly (for think, num_ctx support)
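The registry idea behind `getModel(options)` can be sketched without any SDK installed. This is a minimal illustration, not the package's implementation: `SketchModel` is a hypothetical stand-in for `LanguageModelV3`, and `getModelSketch`, `SketchModelOptions`, and the `provider`/`model` option names are assumptions made for the example.

```typescript
// Hypothetical stand-in for LanguageModelV3 (the real type lives in @ai-sdk/provider).
interface SketchModel {
  provider: string;
  modelId: string;
}

// Assumed option shape; the real getModel options may differ.
interface SketchModelOptions {
  provider: string;
  model: string;
}

type ModelFactory = (modelId: string) => SketchModel;

// Provider names taken from the list above.
const PROVIDERS = [
  'anthropic', 'openai', 'google', 'groq',
  'mistral', 'xai', 'perplexity', 'ollama',
];

// The registry maps each provider name to a factory that builds a model handle.
const registry = new Map<string, ModelFactory>();
for (const provider of PROVIDERS) {
  registry.set(provider, (modelId) => ({ provider, modelId }));
}

// Look up the factory for the requested provider and build the model.
function getModelSketch(options: SketchModelOptions): SketchModel {
  const factory = registry.get(options.provider);
  if (factory === undefined) {
    throw new Error(`Unsupported provider: ${options.provider}`);
  }
  return factory(options.model);
}
```

A caller would then write something like `getModelSketch({ provider: 'ollama', model: 'qwen3:8b' })` and pass the result on, in the same way the real `LanguageModelV3` is passed to `generateText`.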
### Subpath Exports
- `@push.rocks/smartai/vision` → `analyzeImage()` using `generateText` with image content
- `@push.rocks/smartai/audio` → `textToSpeech()` using OpenAI SDK directly
- `@push.rocks/smartai/image` → `generateImage()`, `editImage()` using OpenAI SDK directly
- `@push.rocks/smartai/document` → `analyzeDocuments()` using SmartPdf + `generateText`
- `@push.rocks/smartai/research` → `research()` using `@anthropic-ai/sdk` web_search tool
## Dependencies
- Uses `@git.zone/tstest` v3.x for testing (import from `@git.zone/tstest/tapbundle`)
- `@push.rocks/smartfs` v1.x for file system operations
- `@anthropic-ai/sdk` v0.71.x with extended thinking support
- `@mistralai/mistralai` v1.x for Mistral OCR and chat capabilities
- `openai` v6.x for OpenAI API integration
- `@push.rocks/smartrequest` v5.x - uses `response.stream()` + `Readable.fromWeb()` for streaming
- `ai` ^6.0.116 — Vercel AI SDK core
- `@ai-sdk/*` — Provider packages (anthropic, openai, google, groq, mistral, xai, perplexity)
- `@ai-sdk/provider` ^3.0.8 — LanguageModelV3 types
- `@anthropic-ai/sdk` ^0.78.0 — Direct SDK for research (web search tool)
- `openai` ^6.25.0 — Direct SDK for audio TTS and image generation/editing
- `@push.rocks/smartpdf` ^4.1.3 — PDF to PNG conversion for document analysis
## Build
- `pnpm build` → `tsbuild tsfolders --allowimplicitany`
- Compiles: ts/, ts_vision/, ts_audio/, ts_image/, ts_document/, ts_research/
## Important Notes
- When extended thinking is enabled, the `temperature` parameter must NOT be set (or must be set to 1)
- The `streamNode()` method was removed in smartrequest v5; use `response.stream()` with `Readable.fromWeb()` instead
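The `Readable.fromWeb()` conversion mentioned above can be sketched with Node built-ins only. The helper names `webStreamFromText` and `readWebStreamAsText` are hypothetical, and the in-memory web stream stands in for whatever `response.stream()` returns in smartrequest v5 (assumed here to be a web `ReadableStream`).

```typescript
import { Readable } from 'node:stream';
import { ReadableStream } from 'node:stream/web';

// Builds a small web ReadableStream; a stand-in for the stream from response.stream().
function webStreamFromText(text: string): ReadableStream<Uint8Array> {
  return new ReadableStream<Uint8Array>({
    start(controller) {
      controller.enqueue(new TextEncoder().encode(text));
      controller.close();
    },
  });
}

// Converts the web stream into a Node.js Readable and collects it as UTF-8 text.
async function readWebStreamAsText(
  webStream: ReadableStream<Uint8Array>,
): Promise<string> {
  const nodeStream = Readable.fromWeb(webStream);
  const chunks: Buffer[] = [];
  for await (const chunk of nodeStream) {
    chunks.push(Buffer.from(chunk));
  }
  return Buffer.concat(chunks).toString('utf8');
}
```

Once converted, the Node stream can also be piped or consumed chunk by chunk instead of being buffered in full.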
## Provider Capabilities Summary
| Provider | Chat | Stream | TTS | Vision | Documents | Research | Images |
|--------------|------|--------|-----|--------|-----------|----------|--------|
| OpenAI | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Anthropic | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ |
| Mistral | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ |
| ElevenLabs | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Ollama | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ |
| XAI | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ |
| Perplexity | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ |
| Groq | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Exo | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
## Mistral Provider Integration
### Overview
The Mistral provider supports:
- **Document AI** via Mistral OCR (December 2025) - native PDF processing without image conversion
- **Chat capabilities** using Mistral's chat models (`mistral-large-latest`, etc.)
### Key Advantage: Native PDF Support
Unlike other providers that require converting PDFs to images (using SmartPdf), Mistral OCR natively accepts PDF documents as base64-encoded data. This makes document processing potentially faster and more accurate for text extraction.
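As a rough illustration of "PDF as base64-encoded data", the sketch below encodes raw PDF bytes and wraps them in a data URL. The function names are hypothetical, and the data-URL wrapping is an assumption about how the encoded document might be handed over, not a statement about the provider's actual request format.

```typescript
// Encodes raw PDF bytes as base64 (no image conversion step involved).
function pdfToBase64(pdfBytes: Uint8Array): string {
  return Buffer.from(pdfBytes).toString('base64');
}

// Assumption: the encoded PDF is passed along as a data URL.
function pdfToDataUrl(pdfBytes: Uint8Array): string {
  return `data:application/pdf;base64,${pdfToBase64(pdfBytes)}`;
}

// Reverses pdfToBase64; handy for checking that the encoding is lossless.
function base64ToBytes(base64: string): Uint8Array {
  return new Uint8Array(Buffer.from(base64, 'base64'));
}
```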
### Configuration
```typescript
import * as smartai from '@push.rocks/smartai';
const provider = new smartai.MistralProvider({
  mistralToken: 'your-token-here',
  chatModel: 'mistral-large-latest', // default
  ocrModel: 'mistral-ocr-latest', // default
  tableFormat: 'markdown', // 'markdown' or 'html'
});
await provider.start();
```
### API Key
Tests require `MISTRAL_API_KEY` in `.nogit/env.json`.
## Anthropic Extended Thinking Feature
### Configuration
Extended thinking is configured at the provider level during instantiation:
```typescript
import * as smartai from '@push.rocks/smartai';
const provider = new smartai.AnthropicProvider({
  anthropicToken: 'your-token-here',
  extendedThinking: 'normal', // Options: 'quick' | 'normal' | 'deep' | 'off'
});
```
### Thinking Modes
| Mode | Budget Tokens | Use Case |
| ---------- | ------------- | ----------------------------------------------- |
| `'quick'` | 2,048 | Lightweight reasoning for simple queries |
| `'normal'` | 8,000 | **Default** - Balanced reasoning for most tasks |
| `'deep'` | 16,000 | Complex reasoning for difficult problems |
| `'off'` | 0 | Disable extended thinking |
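The mode table can be read as a lookup from mode to token budget. The sketch below is one possible shape for that lookup, not the private `getThinkingConfig()` method itself: `thinkingConfigSketch` and `buildRequestParamsSketch` are hypothetical names, and the `{ type: 'enabled', budget_tokens }` object follows the Anthropic Messages API `thinking` parameter. It also shows the temperature rule: when thinking is enabled, `temperature` is left out.

```typescript
type ThinkingMode = 'quick' | 'normal' | 'deep' | 'off';

// Budgets copied from the table above.
const THINKING_BUDGETS: Record<ThinkingMode, number> = {
  quick: 2048,
  normal: 8000,
  deep: 16000,
  off: 0,
};

interface ThinkingConfig {
  type: 'enabled';
  budget_tokens: number;
}

// Returns a thinking config for the mode, or undefined when thinking is off.
function thinkingConfigSketch(mode: ThinkingMode = 'normal'): ThinkingConfig | undefined {
  const budget = THINKING_BUDGETS[mode];
  return budget > 0 ? { type: 'enabled', budget_tokens: budget } : undefined;
}

interface RequestParamsSketch {
  thinking?: ThinkingConfig;
  temperature?: number;
}

// Omits temperature whenever thinking is enabled; otherwise passes it through.
function buildRequestParamsSketch(
  mode: ThinkingMode,
  temperature?: number,
): RequestParamsSketch {
  const thinking = thinkingConfigSketch(mode);
  if (thinking !== undefined) {
    return { thinking };
  }
  return temperature === undefined ? {} : { temperature };
}
```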
### Implementation Details
- Extended thinking is implemented via `getThinkingConfig()` private method
- When thinking is enabled, temperature must NOT be set
- Uses `claude-sonnet-4-5-20250929` model
- LanguageModelV3 uses `unified`/`raw` in FinishReason (not `type`/`rawType`)
- LanguageModelV3 system messages have `content: string` (not array)
- LanguageModelV3 file parts use `mediaType` (not `mimeType`)
- LanguageModelV3FunctionTool uses `inputSchema` (not `parameters`)
- Ollama `think` param goes at request body top level, not inside `options`
- Qwen models get default temperature 0.55 in the custom Ollama provider
- `qenv.getEnvVarOnDemand()` returns a Promise — must be awaited in tests
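The two Ollama notes above can be sketched as a request-body builder. This is a sketch under assumptions rather than the custom provider's code: `buildOllamaBodySketch` and its input shape are hypothetical, and detecting Qwen models by a `qwen` name prefix is an assumption made for illustration.

```typescript
interface OllamaMessageSketch {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface OllamaChatBodySketch {
  model: string;
  messages: OllamaMessageSketch[];
  stream: boolean;
  think?: boolean; // top level of the request body, NOT inside options
  options: { temperature?: number; num_ctx?: number };
}

interface OllamaInputSketch {
  model: string;
  messages: OllamaMessageSketch[];
  think?: boolean;
  temperature?: number;
  numCtx?: number;
}

// Assumption: Qwen models are recognized by their name prefix.
const QWEN_DEFAULT_TEMPERATURE = 0.55;

function buildOllamaBodySketch(input: OllamaInputSketch): OllamaChatBodySketch {
  const isQwen = input.model.toLowerCase().startsWith('qwen');
  // An explicit temperature wins; otherwise Qwen models get the 0.55 default.
  const temperature =
    input.temperature ?? (isQwen ? QWEN_DEFAULT_TEMPERATURE : undefined);

  const options: OllamaChatBodySketch['options'] = {};
  if (temperature !== undefined) options.temperature = temperature;
  if (input.numCtx !== undefined) options.num_ctx = input.numCtx;

  const body: OllamaChatBodySketch = {
    model: input.model,
    messages: input.messages,
    stream: false,
    options,
  };
  if (input.think !== undefined) body.think = input.think;
  return body;
}
```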
## Testing
Run tests with:
```bash
pnpm test
```
Run specific tests:
```bash
npx tstest test/test.something.ts --verbose
pnpm test # all tests
tstest test/test.smartai.ts --verbose # core tests
tstest test/test.ollama.ts --verbose # ollama provider tests (mocked, no API needed)
```