BREAKING CHANGE(vercel-ai-sdk): migrate to Vercel AI SDK v6 and introduce provider registry (getModel) returning LanguageModelV3
# SmartAI Project Hints

## Architecture (v1.0.0 - Vercel AI SDK rewrite)

The package is a **provider registry** built on the Vercel AI SDK (`ai` v6). The core export returns a `LanguageModelV3` from `@ai-sdk/provider`. Specialized capabilities are in subpath exports.

### Core Entry (`ts/`)

- `getModel(options)` → returns `LanguageModelV3` for any supported provider
- Providers: anthropic, openai, google, groq, mistral, xai, perplexity, ollama
- Anthropic prompt caching via `wrapLanguageModel` middleware (enabled by default)
- Custom Ollama provider implementing `LanguageModelV3` directly (for `think`, `num_ctx` support)

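The registry lookup behind `getModel(options)` can be sketched as a plain map from provider name to model factory. This is a self-contained illustration, not the real implementation: the actual factories come from the `@ai-sdk/*` packages (plus the custom Ollama provider), and the `ModelOptions` / `SketchModel` shapes here are assumptions.

```typescript
// Minimal sketch of a provider-registry lookup mirroring getModel(options).
// The factory bodies are placeholders standing in for the real @ai-sdk/*
// model constructors; only the lookup pattern is illustrated.
type ProviderName =
  | 'anthropic' | 'openai' | 'google' | 'groq'
  | 'mistral' | 'xai' | 'perplexity' | 'ollama';

interface ModelOptions {
  provider: ProviderName;
  model: string;
}

// Placeholder for the LanguageModelV3 interface from @ai-sdk/provider.
interface SketchModel {
  provider: ProviderName;
  modelId: string;
}

const factories: Record<ProviderName, (model: string) => SketchModel> = {
  anthropic: (model) => ({ provider: 'anthropic', modelId: model }),
  openai: (model) => ({ provider: 'openai', modelId: model }),
  google: (model) => ({ provider: 'google', modelId: model }),
  groq: (model) => ({ provider: 'groq', modelId: model }),
  mistral: (model) => ({ provider: 'mistral', modelId: model }),
  xai: (model) => ({ provider: 'xai', modelId: model }),
  perplexity: (model) => ({ provider: 'perplexity', modelId: model }),
  ollama: (model) => ({ provider: 'ollama', modelId: model }),
};

function getModelSketch(options: ModelOptions): SketchModel {
  const factory = factories[options.provider];
  if (!factory) throw new Error(`Unknown provider: ${options.provider}`);
  return factory(options.model);
}

const example = getModelSketch({
  provider: 'anthropic',
  model: 'claude-sonnet-4-5-20250929',
});
console.log(example.provider, example.modelId);
```

In the real package the returned value is a `LanguageModelV3`, so it plugs straight into `generateText` / `streamText` from `ai`.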
### Subpath Exports

- `@push.rocks/smartai/vision` — `analyzeImage()` using `generateText` with image content
- `@push.rocks/smartai/audio` — `textToSpeech()` using the OpenAI SDK directly
- `@push.rocks/smartai/image` — `generateImage()`, `editImage()` using the OpenAI SDK directly
- `@push.rocks/smartai/document` — `analyzeDocuments()` using SmartPdf + `generateText`
- `@push.rocks/smartai/research` — `research()` using the `@anthropic-ai/sdk` `web_search` tool

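Subpath exports of this shape are normally wired up through a package.json `exports` map. A hypothetical sketch (the dist folder names are assumptions derived from the compiled `ts_*` folders listed in the Build section, not copied from the package):

```json
{
  "exports": {
    ".": "./dist_ts/index.js",
    "./vision": "./dist_ts_vision/index.js",
    "./audio": "./dist_ts_audio/index.js",
    "./image": "./dist_ts_image/index.js",
    "./document": "./dist_ts_document/index.js",
    "./research": "./dist_ts_research/index.js"
  }
}
```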
## Dependencies

- `@git.zone/tstest` v3.x for testing (import from `@git.zone/tstest/tapbundle`)
- `@push.rocks/smartfs` v1.x for file system operations
- `@mistralai/mistralai` v1.x for Mistral OCR and chat capabilities
- `@push.rocks/smartrequest` v5.x - uses `response.stream()` + `Readable.fromWeb()` for streaming
- `ai` ^6.0.116 — Vercel AI SDK core
- `@ai-sdk/*` — provider packages (anthropic, openai, google, groq, mistral, xai, perplexity)
- `@ai-sdk/provider` ^3.0.8 — `LanguageModelV3` types
- `@anthropic-ai/sdk` ^0.78.0 — direct SDK for research (web search tool), with extended thinking support
- `openai` ^6.25.0 — direct SDK for audio TTS and image generation/editing
- `@push.rocks/smartpdf` ^4.1.3 — PDF to PNG conversion for document analysis

## Build

- `pnpm build` → `tsbuild tsfolders --allowimplicitany`
- Compiles: `ts/`, `ts_vision/`, `ts_audio/`, `ts_image/`, `ts_document/`, `ts_research/`

## Important Notes

- When extended thinking is enabled, the temperature parameter must NOT be set (or must be set to 1)
- The `streamNode()` method was removed in smartrequest v5; use `response.stream()` with `Readable.fromWeb()` instead

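The smartrequest v5 migration can be sketched without smartrequest itself, since the relevant step is converting a web `ReadableStream` to a Node.js `Readable`. The locally constructed web stream below stands in for what `response.stream()` returns:

```typescript
import { Readable } from 'node:stream';

// Stand-in for the web ReadableStream that smartrequest v5's
// response.stream() returns (streamNode() no longer exists).
const webStream = new ReadableStream<Uint8Array>({
  start(controller) {
    controller.enqueue(new TextEncoder().encode('hello '));
    controller.enqueue(new TextEncoder().encode('world'));
    controller.close();
  },
});

// Convert to a Node.js Readable, as the note above describes.
const nodeStream = Readable.fromWeb(webStream as any);

const chunks: Buffer[] = [];
nodeStream.on('data', (chunk: Buffer) => chunks.push(chunk));
nodeStream.on('end', () => {
  console.log(Buffer.concat(chunks).toString()); // "hello world"
});
```

`Readable.fromWeb()` is available in Node.js 17+; the `as any` cast papers over the TypeScript mismatch between the DOM and `node:stream/web` `ReadableStream` types.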
## Provider Capabilities Summary

| Provider   | Chat | Stream | TTS | Vision | Documents | Research | Images |
| ---------- | ---- | ------ | --- | ------ | --------- | -------- | ------ |
| OpenAI     | ✅   | ✅     | ✅  | ✅     | ✅        | ✅       | ✅     |
| Anthropic  | ✅   | ✅     | ❌  | ✅     | ✅        | ✅       | ❌     |
| Mistral    | ✅   | ✅     | ❌  | ✅     | ✅        | ❌       | ❌     |
| ElevenLabs | ❌   | ❌     | ✅  | ❌     | ❌        | ❌       | ❌     |
| Ollama     | ✅   | ✅     | ❌  | ✅     | ✅        | ❌       | ❌     |
| XAI        | ✅   | ✅     | ❌  | ❌     | ✅        | ❌       | ❌     |
| Perplexity | ✅   | ✅     | ❌  | ❌     | ❌        | ✅       | ❌     |
| Groq       | ✅   | ✅     | ❌  | ❌     | ❌        | ❌       | ❌     |
| Exo        | ✅   | ✅     | ❌  | ❌     | ❌        | ❌       | ❌     |

## Mistral Provider Integration

### Overview

The Mistral provider supports:

- **Document AI** via Mistral OCR (December 2025) - native PDF processing without image conversion
- **Chat capabilities** using Mistral's chat models (`mistral-large-latest`, etc.)

### Key Advantage: Native PDF Support

Unlike other providers that require converting PDFs to images (using SmartPdf), Mistral OCR natively accepts PDF documents as base64-encoded data. This makes document processing potentially faster and more accurate for text extraction.

### Configuration

```typescript
import * as smartai from '@push.rocks/smartai';

const provider = new smartai.MistralProvider({
  mistralToken: 'your-token-here',
  chatModel: 'mistral-large-latest', // default
  ocrModel: 'mistral-ocr-latest', // default
  tableFormat: 'markdown', // 'markdown' or 'html'
});

await provider.start();
```

### API Key

Tests require `MISTRAL_API_KEY` in `.nogit/env.json`.

## Anthropic Extended Thinking Feature

### Configuration

Extended thinking is configured at the provider level during instantiation:

```typescript
import * as smartai from '@push.rocks/smartai';

const provider = new smartai.AnthropicProvider({
  anthropicToken: 'your-token-here',
  extendedThinking: 'normal', // Options: 'quick' | 'normal' | 'deep' | 'off'
});
```

### Thinking Modes

| Mode       | Budget Tokens | Use Case                                        |
| ---------- | ------------- | ----------------------------------------------- |
| `'quick'`  | 2,048         | Lightweight reasoning for simple queries        |
| `'normal'` | 8,000         | **Default** - Balanced reasoning for most tasks |
| `'deep'`   | 16,000        | Complex reasoning for difficult problems        |
| `'off'`    | 0             | Disable extended thinking                       |

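The mode-to-budget mapping above can be sketched as a small pure function. The budgets come straight from the table; that this is roughly what the private `getThinkingConfig()` method computes internally is an assumption.

```typescript
// Thinking modes and their token budgets, as listed in the table above.
type ThinkingMode = 'quick' | 'normal' | 'deep' | 'off';

const budgets: Record<ThinkingMode, number> = {
  quick: 2048,
  normal: 8000,
  deep: 16000,
  off: 0,
};

// Hypothetical stand-in for what getThinkingConfig() maps a mode to.
function thinkingBudget(mode: ThinkingMode): number {
  return budgets[mode];
}

console.log(thinkingBudget('normal')); // 8000
```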
### Implementation Details

- Extended thinking is implemented via the private `getThinkingConfig()` method
- When thinking is enabled, temperature must NOT be set
- Uses the `claude-sonnet-4-5-20250929` model
- `LanguageModelV3` uses `unified`/`raw` in `FinishReason` (not `type`/`rawType`)
- `LanguageModelV3` system messages have `content: string` (not an array)
- `LanguageModelV3` file parts use `mediaType` (not `mimeType`)
- `LanguageModelV3FunctionTool` uses `inputSchema` (not `parameters`)
- The Ollama `think` param goes at the top level of the request body, not inside `options`
- Qwen models get a default temperature of 0.55 in the custom Ollama provider
- `qenv.getEnvVarOnDemand()` returns a Promise — it must be awaited in tests

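The renamed `LanguageModelV3` fields called out above can be illustrated with simplified local types. These interfaces mirror only the fields named in the list; the real types in `@ai-sdk/provider` have more members, so treat this as a sketch, not the actual API surface.

```typescript
// Simplified stand-ins for @ai-sdk/provider types, showing only the
// V3 field names highlighted in the implementation details above.
interface SystemMessage {
  role: 'system';
  content: string; // a plain string in V3, not an array of parts
}

interface FilePart {
  type: 'file';
  mediaType: string; // V3 name (was mimeType)
  data: string;
}

interface FunctionTool {
  type: 'function';
  name: string;
  inputSchema: object; // V3 name (was parameters)
}

const system: SystemMessage = { role: 'system', content: 'You are helpful.' };
const pdf: FilePart = {
  type: 'file',
  mediaType: 'application/pdf',
  data: '<base64 data>',
};
const tool: FunctionTool = {
  type: 'function',
  name: 'web_search',
  inputSchema: { type: 'object', properties: { query: { type: 'string' } } },
};

console.log(system.role, pdf.mediaType, 'inputSchema' in tool);
```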
## Testing

Run tests with:

```bash
pnpm test
```

Run specific tests:

```bash
npx tstest test/test.something.ts --verbose
pnpm test                               # all tests
tstest test/test.smartai.ts --verbose   # core tests
tstest test/test.ollama.ts --verbose    # ollama provider tests (mocked, no API needed)
```