feat(mistral): add Mistral provider with native PDF OCR and chat integration

2026-01-18 22:11:52 +00:00
parent 6f79dc3535
commit e4dc81edc9
12 changed files with 649 additions and 2 deletions


@@ -5,6 +5,7 @@
- Uses `@git.zone/tstest` v3.x for testing (import from `@git.zone/tstest/tapbundle`)
- `@push.rocks/smartfile` is kept at v11 to avoid migration to factory pattern
- `@anthropic-ai/sdk` v0.71.x with extended thinking support
- `@mistralai/mistralai` v1.x for Mistral OCR and chat capabilities
- `@push.rocks/smartrequest` v5.x - uses `response.stream()` + `Readable.fromWeb()` for streaming
## Important Notes
@@ -12,6 +13,68 @@
- When extended thinking is enabled, temperature parameter must NOT be set (or set to 1)
- The `streamNode()` method was removed in smartrequest v5; use `response.stream()` with `Readable.fromWeb()` instead
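
A minimal sketch of the `Readable.fromWeb()` conversion mentioned above. It does not call smartrequest: `makeWebStream` is a hypothetical stand-in for whatever web stream `response.stream()` yields, so only the Node.js side of the pattern is shown.

```typescript
import { Readable } from 'node:stream';
import { ReadableStream } from 'node:stream/web';

// Hypothetical stand-in for the web stream returned by `response.stream()`.
function makeWebStream(chunks: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    start(controller) {
      // Enqueue every chunk up front, then signal end of stream.
      for (const chunk of chunks) controller.enqueue(encoder.encode(chunk));
      controller.close();
    },
  });
}

// Convert the web stream into a Node.js Readable and consume it with events.
const nodeStream = Readable.fromWeb(makeWebStream(['hello ', 'world']));
nodeStream.on('data', (chunk: Uint8Array) => process.stdout.write(chunk));
```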
## Mistral Provider Integration
### Overview
The Mistral provider supports:
- **Document AI** via Mistral OCR 3 (December 2025) - native PDF processing without image conversion
- **Chat capabilities** using Mistral's chat models (`mistral-large-latest`, etc.)
### Key Advantage: Native PDF Support
Unlike other providers, which require converting PDFs to images (using SmartPdf), Mistral OCR natively accepts PDF documents as base64-encoded data. Skipping the image conversion step can make document processing faster and text extraction more accurate.
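
As an illustration of what "base64-encoded data" can look like, here is a sketch that wraps raw PDF bytes in a data URL. The helper name `pdfToDataUrl` is hypothetical, and the data URL form is an assumption; this diff does not show how the provider actually hands the bytes to the Mistral API.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical helper: wraps raw PDF bytes in a base64 data URL.
function pdfToDataUrl(pdf: Uint8Array): string {
  const base64 = Buffer.from(pdf).toString('base64');
  return `data:application/pdf;base64,${base64}`;
}

// Tiny placeholder payload; a real call would pass the full file contents.
const sample = new TextEncoder().encode('%PDF-1.4');
console.log(pdfToDataUrl(sample)); // data:application/pdf;base64,JVBERi0xLjQ=
```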
### Configuration
```typescript
import * as smartai from '@push.rocks/smartai';
const provider = new smartai.MistralProvider({
  mistralToken: 'your-token-here',
  chatModel: 'mistral-large-latest', // default
  ocrModel: 'mistral-ocr-latest', // default
  tableFormat: 'markdown', // 'markdown' or 'html'
});
await provider.start();
```
### Supported Methods
| Method | Support | Notes |
|--------|---------|-------|
| `chat()` | ✅ | Standard chat completion |
| `chatStream()` | ✅ | Streaming chat responses |
| `document()` | ✅ | Native PDF OCR - no image conversion needed |
| `vision()` | ✅ | Image OCR with optional chat analysis |
| `audio()` | ❌ | Not supported - use ElevenLabs |
| `research()` | ❌ | Not supported - use Perplexity |
| `imageGenerate()` | ❌ | Not supported - use OpenAI |
| `imageEdit()` | ❌ | Not supported - use OpenAI |
### Document Processing
The `document()` method uses Mistral OCR to extract text from PDFs, then uses Mistral chat to process the user's query with the extracted content.
```typescript
const result = await provider.document({
  systemMessage: 'You are a document analyst.',
  userMessage: 'Summarize this document.',
  pdfDocuments: [pdfBuffer],
  messageHistory: [],
});
```
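
The two-step flow described above (OCR first, then chat) could be sketched roughly as follows. This is not the provider's actual implementation: `OcrFn`, `ChatFn`, `buildDocumentPrompt`, and `documentSketch` are hypothetical names, and the OCR and chat calls are injected as functions so the sketch runs without network access.

```typescript
// Hypothetical shapes; the real provider's internals are not shown in this diff.
type OcrFn = (pdf: Uint8Array) => Promise<string>;
type ChatFn = (systemMessage: string, userMessage: string) => Promise<string>;

// Combines the user's question with the text extracted from each PDF.
function buildDocumentPrompt(userMessage: string, extracted: string[]): string {
  const docs = extracted
    .map((text, i) => `--- Document ${i + 1} ---\n${text}`)
    .join('\n\n');
  return `${userMessage}\n\n${docs}`;
}

async function documentSketch(
  opts: { systemMessage: string; userMessage: string; pdfDocuments: Uint8Array[] },
  ocr: OcrFn,
  chat: ChatFn,
): Promise<string> {
  // Step 1: run OCR on every PDF to get its text.
  const extracted = await Promise.all(opts.pdfDocuments.map((pdf) => ocr(pdf)));
  // Step 2: send the question plus the extracted text to the chat model.
  return chat(opts.systemMessage, buildDocumentPrompt(opts.userMessage, extracted));
}
```

Injecting `ocr` and `chat` keeps the orchestration testable with stubs; the real provider presumably calls the Mistral SDK directly at these two points.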
### API Key
Tests require `MISTRAL_API_KEY` in `.nogit/env.json`.
### Pricing (as of December 2025)
- OCR: $2 per 1,000 pages ($1 with Batch API)
- Chat: Varies by model (see Mistral pricing page)
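
For a rough budget check, the OCR rates above translate into a simple per-page calculation. The function name `estimateOcrCostUsd` is hypothetical, the rates are the December 2025 figures quoted above, and chat costs are not included.

```typescript
// Rates taken from the list above (as of December 2025); they may change.
const OCR_USD_PER_1000_PAGES = 2;
const OCR_BATCH_USD_PER_1000_PAGES = 1;

// Hypothetical helper: estimates OCR cost in USD for a given page count.
function estimateOcrCostUsd(pages: number, useBatchApi = false): number {
  const rate = useBatchApi ? OCR_BATCH_USD_PER_1000_PAGES : OCR_USD_PER_1000_PAGES;
  return (pages / 1000) * rate;
}

console.log(estimateOcrCostUsd(2500)); // 5
console.log(estimateOcrCostUsd(2500, true)); // 2.5
```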
## Anthropic Extended Thinking Feature
### Overview