feat(mistral): add Mistral provider with native PDF OCR and chat integration
@@ -5,6 +5,7 @@
- Uses `@git.zone/tstest` v3.x for testing (import from `@git.zone/tstest/tapbundle`)
- `@push.rocks/smartfile` is kept at v11 to avoid migration to factory pattern
- `@anthropic-ai/sdk` v0.71.x with extended thinking support
- `@mistralai/mistralai` v1.x for Mistral OCR and chat capabilities
- `@push.rocks/smartrequest` v5.x - uses `response.stream()` + `Readable.fromWeb()` for streaming

## Important Notes
@@ -12,6 +13,68 @@
- When extended thinking is enabled, temperature parameter must NOT be set (or set to 1)
- The `streamNode()` method was removed in smartrequest v5, use `response.stream()` with `Readable.fromWeb()` instead
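
The `Readable.fromWeb()` conversion mentioned in the note above can be sketched using only Node built-ins. In this sketch, `makeWebStream` is a hypothetical stand-in for whatever web `ReadableStream` `response.stream()` returns; the smartrequest call itself is not shown, and both helper names are assumptions for illustration.

```typescript
import { Readable } from 'node:stream';
import { ReadableStream } from 'node:stream/web';

// Hypothetical stand-in for the web stream returned by `response.stream()`.
function makeWebStream(chunks: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    start(controller) {
      for (const chunk of chunks) {
        controller.enqueue(encoder.encode(chunk));
      }
      controller.close();
    },
  });
}

// Converts a web ReadableStream into a Node.js Readable and collects it as text.
async function readAll(webStream: ReadableStream<Uint8Array>): Promise<string> {
  const nodeStream = Readable.fromWeb(webStream);
  const decoder = new TextDecoder();
  let text = '';
  for await (const chunk of nodeStream) {
    // `stream: true` keeps multi-byte characters intact across chunk boundaries.
    text += decoder.decode(chunk, { stream: true });
  }
  return text + decoder.decode();
}
```

Once converted, the Node stream can be consumed with `for await`, piped, or passed to any API that expects a Node `Readable`.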

## Mistral Provider Integration

### Overview

The Mistral provider supports:
- **Document AI** via Mistral OCR 3 (December 2025) - native PDF processing without image conversion
- **Chat capabilities** using Mistral's chat models (`mistral-large-latest`, etc.)

### Key Advantage: Native PDF Support

Unlike other providers that require converting PDFs to images (using SmartPdf), Mistral OCR natively accepts PDF documents as base64-encoded data. This makes document processing potentially faster and more accurate for text extraction.
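
As a rough illustration of passing a PDF as base64-encoded data, the sketch below wraps raw PDF bytes in a data URL and builds a request-shaped object. Both helper names are hypothetical, and the `document` / `documentUrl` field layout is an assumption about the OCR request shape that should be checked against the `@mistralai/mistralai` version in use; this is not the provider's actual implementation.

```typescript
// Hypothetical helper: wraps raw PDF bytes in a base64 data URL.
function pdfToDataUrl(pdf: Uint8Array): string {
  const base64 = Buffer.from(pdf).toString('base64');
  return `data:application/pdf;base64,${base64}`;
}

// Hypothetical helper: builds an OCR request object.
// The field names below are an assumption, not confirmed by this document.
function buildOcrRequest(pdf: Uint8Array, model = 'mistral-ocr-latest') {
  return {
    model,
    document: {
      type: 'document_url' as const,
      documentUrl: pdfToDataUrl(pdf),
    },
  };
}
```

No page rendering or image conversion happens anywhere in this path; the PDF bytes are only re-encoded.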

### Configuration

```typescript
import * as smartai from '@push.rocks/smartai';

const provider = new smartai.MistralProvider({
  mistralToken: 'your-token-here',
  chatModel: 'mistral-large-latest',  // default
  ocrModel: 'mistral-ocr-latest',     // default
  tableFormat: 'markdown',            // 'markdown' or 'html'
});

await provider.start();
```

### Supported Methods

| Method | Support | Notes |
|--------|---------|-------|
| `chat()` | ✅ | Standard chat completion |
| `chatStream()` | ✅ | Streaming chat responses |
| `document()` | ✅ | Native PDF OCR - no image conversion needed |
| `vision()` | ✅ | Image OCR with optional chat analysis |
| `audio()` | ❌ | Not supported - use ElevenLabs |
| `research()` | ❌ | Not supported - use Perplexity |
| `imageGenerate()` | ❌ | Not supported - use OpenAI |
| `imageEdit()` | ❌ | Not supported - use OpenAI |

### Document Processing

The `document()` method uses Mistral OCR to extract text from PDFs, then uses Mistral chat to process the user's query with the extracted content.

```typescript
const result = await provider.document({
  systemMessage: 'You are a document analyst.',
  userMessage: 'Summarize this document.',
  pdfDocuments: [pdfBuffer],
  messageHistory: [],
});
```
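
The two-stage flow described above (OCR first, then chat over the extracted text) can be sketched with the OCR and chat steps injected as plain functions. All names here (`answerFromPdfs`, `OcrFn`, `ChatFn`, `ChatMessage`) are hypothetical, and the way the extracted text is appended to the user message is an assumption; the provider's real prompt layout may differ.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };
type OcrFn = (pdf: Uint8Array) => Promise<string>;
type ChatFn = (messages: ChatMessage[]) => Promise<string>;

interface DocumentOptions {
  systemMessage: string;
  userMessage: string;
  pdfDocuments: Uint8Array[];
  messageHistory: ChatMessage[];
}

// Hypothetical two-stage pipeline: extract text per PDF, then ask the chat model.
async function answerFromPdfs(
  opts: DocumentOptions,
  ocr: OcrFn,
  chat: ChatFn,
): Promise<string> {
  // Stage 1: run OCR on every PDF and label each extracted text block.
  const texts = await Promise.all(opts.pdfDocuments.map((pdf) => ocr(pdf)));
  const context = texts
    .map((text, index) => `--- Document ${index + 1} ---\n${text}`)
    .join('\n\n');

  // Stage 2: send system prompt, prior history, and the query plus extracted text.
  const messages: ChatMessage[] = [
    { role: 'system', content: opts.systemMessage },
    ...opts.messageHistory,
    { role: 'user', content: `${opts.userMessage}\n\n${context}` },
  ];
  return chat(messages);
}
```

Injecting `ocr` and `chat` keeps the sketch testable with stubs and independent of any particular SDK call.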

### API Key

Tests require `MISTRAL_API_KEY` in `.nogit/env.json`.
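
One way a test helper might resolve the key is sketched below. It assumes `.nogit/env.json` is a flat JSON object with `MISTRAL_API_KEY` as a top-level string, which this document does not confirm; `parseMistralKey` and `loadMistralKey` are hypothetical names, and the project may well use a dedicated env-loading package instead.

```typescript
import { readFileSync } from 'node:fs';

// Hypothetical helper: pulls MISTRAL_API_KEY out of the JSON text of an env file.
// Assumes a flat object such as { "MISTRAL_API_KEY": "..." }.
function parseMistralKey(envJson: string): string | undefined {
  const parsed: unknown = JSON.parse(envJson);
  if (typeof parsed !== 'object' || parsed === null) {
    return undefined;
  }
  const value = (parsed as Record<string, unknown>)['MISTRAL_API_KEY'];
  return typeof value === 'string' && value.length > 0 ? value : undefined;
}

// Hypothetical helper: prefers the process environment, then falls back to the file.
function loadMistralKey(path = '.nogit/env.json'): string | undefined {
  if (process.env.MISTRAL_API_KEY) {
    return process.env.MISTRAL_API_KEY;
  }
  try {
    return parseMistralKey(readFileSync(path, 'utf8'));
  } catch {
    // Missing or malformed file: report "no key" instead of throwing.
    return undefined;
  }
}
```

A test can then skip Mistral-specific cases when `loadMistralKey()` returns `undefined` rather than failing outright.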

### Pricing (as of December 2025)

- OCR: $2 per 1,000 pages ($1 with Batch API)
- Chat: Varies by model (see Mistral pricing page)
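
The OCR rates listed above translate into a simple per-page estimate. The sketch below hard-codes the December 2025 figures from this section; the function and constant names are hypothetical, and chat costs are not modeled.

```typescript
// Rates in USD per 1,000 pages, taken from the list above (December 2025).
const OCR_USD_PER_1000_PAGES = { standard: 2, batch: 1 } as const;

// Hypothetical helper: estimates OCR cost for a page count.
function estimateOcrCostUsd(
  pages: number,
  mode: 'standard' | 'batch' = 'standard',
): number {
  if (!Number.isInteger(pages) || pages < 0) {
    throw new RangeError('pages must be a non-negative integer');
  }
  return (pages / 1000) * OCR_USD_PER_1000_PAGES[mode];
}

console.log(estimateOcrCostUsd(2500));          // 5
console.log(estimateOcrCostUsd(2500, 'batch')); // 2.5
```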

## Anthropic Extended Thinking Feature

### Overview
