BREAKING CHANGE(vercel-ai-sdk): migrate to Vercel AI SDK v6 and introduce provider registry (getModel) returning LanguageModelV3

2026-03-05 19:37:29 +00:00
parent 27cef60900
commit c24010c9bc
61 changed files with 4789 additions and 9083 deletions
@@ -1,104 +1,50 @@
 # SmartAI Project Hints

+## Architecture (v1.0.0 - Vercel AI SDK rewrite)
+
+The package is a **provider registry** built on the Vercel AI SDK (`ai` v6). The core export returns a `LanguageModelV3` from `@ai-sdk/provider`. Specialized capabilities are in subpath exports.
+
+### Core Entry (`ts/`)
+- `getModel(options)` → returns `LanguageModelV3` for any supported provider
+- Providers: anthropic, openai, google, groq, mistral, xai, perplexity, ollama
+- Anthropic prompt caching via `wrapLanguageModel` middleware (enabled by default)
+- Custom Ollama provider implementing `LanguageModelV3` directly (for think, num_ctx support)
+
+### Subpath Exports
+- `@push.rocks/smartai/vision` — `analyzeImage()` using `generateText` with image content
+- `@push.rocks/smartai/audio` — `textToSpeech()` using OpenAI SDK directly
+- `@push.rocks/smartai/image` — `generateImage()`, `editImage()` using OpenAI SDK directly
+- `@push.rocks/smartai/document` — `analyzeDocuments()` using SmartPdf + `generateText`
+- `@push.rocks/smartai/research` — `research()` using `@anthropic-ai/sdk` web_search tool
+
 ## Dependencies

-- Uses `@git.zone/tstest` v3.x for testing (import from `@git.zone/tstest/tapbundle`)
-- `@push.rocks/smartfs` v1.x for file system operations
-- `@anthropic-ai/sdk` v0.71.x with extended thinking support
-- `@mistralai/mistralai` v1.x for Mistral OCR and chat capabilities
-- `openai` v6.x for OpenAI API integration
-- `@push.rocks/smartrequest` v5.x - uses `response.stream()` + `Readable.fromWeb()` for streaming
+- `ai` ^6.0.116 — Vercel AI SDK core
+- `@ai-sdk/*` — Provider packages (anthropic, openai, google, groq, mistral, xai, perplexity)
+- `@ai-sdk/provider` ^3.0.8 — LanguageModelV3 types
+- `@anthropic-ai/sdk` ^0.78.0 — Direct SDK for research (web search tool)
+- `openai` ^6.25.0 — Direct SDK for audio TTS and image generation/editing
+- `@push.rocks/smartpdf` ^4.1.3 — PDF to PNG conversion for document analysis
+
+## Build
+
+- `pnpm build` → `tsbuild tsfolders --allowimplicitany`
+- Compiles: ts/, ts_vision/, ts_audio/, ts_image/, ts_document/, ts_research/

 ## Important Notes

-- When extended thinking is enabled, temperature parameter must NOT be set (or set to 1)
-- The `streamNode()` method was removed in smartrequest v5, use `response.stream()` with `Readable.fromWeb()` instead
-
-## Provider Capabilities Summary
-
-| Provider     | Chat | Stream | TTS | Vision | Documents | Research | Images |
-|--------------|------|--------|-----|--------|-----------|----------|--------|
-| OpenAI       | ✅   | ✅     | ✅  | ✅     | ✅        | ✅       | ✅     |
-| Anthropic    | ✅   | ✅     | ❌  | ✅     | ✅        | ✅       | ❌     |
-| Mistral      | ✅   | ✅     | ❌  | ✅     | ✅        | ❌       | ❌     |
-| ElevenLabs   | ❌   | ❌     | ✅  | ❌     | ❌        | ❌       | ❌     |
-| Ollama       | ✅   | ✅     | ❌  | ✅     | ✅        | ❌       | ❌     |
-| XAI          | ✅   | ✅     | ❌  | ❌     | ✅        | ❌       | ❌     |
-| Perplexity   | ✅   | ✅     | ❌  | ❌     | ❌        | ✅       | ❌     |
-| Groq         | ✅   | ✅     | ❌  | ❌     | ❌        | ❌       | ❌     |
-| Exo          | ✅   | ✅     | ❌  | ❌     | ❌        | ❌       | ❌     |
-
-## Mistral Provider Integration
-
-### Overview
-
-The Mistral provider supports:
-- **Document AI** via Mistral OCR (December 2025) - native PDF processing without image conversion
-- **Chat capabilities** using Mistral's chat models (`mistral-large-latest`, etc.)
-
-### Key Advantage: Native PDF Support
-
-Unlike other providers that require converting PDFs to images (using SmartPdf), Mistral OCR natively accepts PDF documents as base64-encoded data. This makes document processing potentially faster and more accurate for text extraction.
-
-### Configuration
-
-```typescript
-import * as smartai from '@push.rocks/smartai';
-
-const provider = new smartai.MistralProvider({
-  mistralToken: 'your-token-here',
-  chatModel: 'mistral-large-latest',  // default
-  ocrModel: 'mistral-ocr-latest',     // default
-  tableFormat: 'markdown',             // 'markdown' or 'html'
-});
-
-await provider.start();
-```
-
-### API Key
-
-Tests require `MISTRAL_API_KEY` in `.nogit/env.json`.
-
-## Anthropic Extended Thinking Feature
-
-### Configuration
-
-Extended thinking is configured at the provider level during instantiation:
-
-```typescript
-import * as smartai from '@push.rocks/smartai';
-
-const provider = new smartai.AnthropicProvider({
-  anthropicToken: 'your-token-here',
-  extendedThinking: 'normal', // Options: 'quick' | 'normal' | 'deep' | 'off'
-});
-```
-
-### Thinking Modes
-
-| Mode       | Budget Tokens | Use Case                                        |
-| ---------- | ------------- | ----------------------------------------------- |
-| `'quick'`  | 2,048         | Lightweight reasoning for simple queries        |
-| `'normal'` | 8,000         | **Default** - Balanced reasoning for most tasks |
-| `'deep'`   | 16,000        | Complex reasoning for difficult problems        |
-| `'off'`    | 0             | Disable extended thinking                       |
-
-### Implementation Details
-
-- Extended thinking is implemented via `getThinkingConfig()` private method
-- When thinking is enabled, temperature must NOT be set
-- Uses `claude-sonnet-4-5-20250929` model
+- LanguageModelV3 uses `unified`/`raw` in FinishReason (not `type`/`rawType`)
+- LanguageModelV3 system messages have `content: string` (not array)
+- LanguageModelV3 file parts use `mediaType` (not `mimeType`)
+- LanguageModelV3FunctionTool uses `inputSchema` (not `parameters`)
+- Ollama `think` param goes at request body top level, not inside `options`
+- Qwen models get default temperature 0.55 in the custom Ollama provider
+- `qenv.getEnvVarOnDemand()` returns a Promise — must be awaited in tests

 ## Testing

-Run tests with:
-
 ```bash
-pnpm test
-```
-
-Run specific tests:
-
-```bash
-npx tstest test/test.something.ts --verbose
+pnpm test                            # all tests
+tstest test/test.smartai.ts --verbose # core tests
+tstest test/test.ollama.ts --verbose  # ollama provider tests (mocked, no API needed)
 ```
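
The two Ollama notes added under Important Notes (`think` at the top level of the request body, and a default temperature of 0.55 for Qwen models) can be sketched as a small request-body builder. This is only an illustration under assumptions: `buildOllamaChatBody`, `OllamaCallOptions`, `OllamaChatBody`, and the prefix-based Qwen check are hypothetical names and logic, not taken from this commit. The body fields themselves (`model`, `messages`, `stream`, `think`, `options.num_ctx`, `options.temperature`) follow the Ollama chat API.

```typescript
// Hypothetical shapes for illustration; the custom provider's real types
// are not shown in this diff.
interface OllamaCallOptions {
  think?: boolean;
  numCtx?: number;
  temperature?: number;
}

interface OllamaChatBody {
  model: string;
  messages: { role: string; content: string }[];
  stream: boolean;
  think?: boolean; // top level of the body, NOT inside `options`
  options: { num_ctx?: number; temperature?: number };
}

const QWEN_DEFAULT_TEMPERATURE = 0.55;

function buildOllamaChatBody(
  model: string,
  prompt: string,
  opts: OllamaCallOptions = {},
): OllamaChatBody {
  // Assumption: Qwen models are detected by a model-id prefix.
  const isQwen = model.toLowerCase().startsWith('qwen');

  // An explicit temperature wins; otherwise Qwen gets the 0.55 default.
  const temperature =
    opts.temperature ?? (isQwen ? QWEN_DEFAULT_TEMPERATURE : undefined);

  const body: OllamaChatBody = {
    model,
    messages: [{ role: 'user', content: prompt }],
    stream: false,
    options: {},
  };

  // `think` is a sibling of `options`, not a member of it.
  if (opts.think !== undefined) body.think = opts.think;

  // Sampling and context parameters live inside `options`.
  if (opts.numCtx !== undefined) body.options.num_ctx = opts.numCtx;
  if (temperature !== undefined) body.options.temperature = temperature;

  return body;
}

// Usage: a Qwen model with thinking enabled and a 4096-token context.
const example = buildOllamaChatBody('qwen3:8b', 'Hello', {
  think: true,
  numCtx: 4096,
});
console.log(JSON.stringify(example, null, 2));
```

Keeping `think` out of `options` matters because a key placed in the wrong object may be silently ignored rather than rejected, so a mocked test that asserts on the body shape is a cheap way to catch a regression.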