# SmartAI Project Hints

## Dependencies

- Uses `@git.zone/tstest` v3.x for testing (import from `@git.zone/tstest/tapbundle`)
- `@push.rocks/smartfs` v1.x for file system operations (replaced smartfile)
- `@anthropic-ai/sdk` v0.71.x with extended thinking support
- `@mistralai/mistralai` v1.x for Mistral OCR and chat capabilities
- `openai` v6.x for OpenAI API integration
- `@push.rocks/smartrequest` v5.x - uses `response.stream()` + `Readable.fromWeb()` for streaming
## Important Notes

- When extended thinking is enabled, the `temperature` parameter must NOT be set (or must be set to 1)
- The `streamNode()` method was removed in smartrequest v5; use `response.stream()` with `Readable.fromWeb()` instead (see the sketch below)
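A minimal sketch of the replacement pattern, assuming `response` is a smartrequest v5 response whose `stream()` method returns a web `ReadableStream` as noted above; `toNodeStream` is a hypothetical helper for illustration, and the request setup itself is omitted:

```typescript
import { Readable } from 'node:stream';
import type { ReadableStream as WebReadableStream } from 'node:stream/web';

// Bridge the web ReadableStream returned by response.stream() to a Node.js Readable.
function toNodeStream(response: { stream: () => WebReadableStream<Uint8Array> }): Readable {
  return Readable.fromWeb(response.stream());
}

// Usage: for await (const chunk of toNodeStream(response)) { /* handle chunk */ }
```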
## Mistral Provider Integration

### Overview
The Mistral provider supports:
- Document AI via Mistral OCR 3 (December 2025) - native PDF processing without image conversion
- Chat capabilities using Mistral's chat models (`mistral-large-latest`, etc.)

### Key Advantage: Native PDF Support
Unlike other providers that require converting PDFs to images (using SmartPdf), Mistral OCR natively accepts PDF documents as base64-encoded data. This makes document processing potentially faster and more accurate for text extraction.
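To make the encoding step concrete, here is a sketch of that step only; `pdfBufferToDataUrl` is a hypothetical helper shown for clarity (the provider performs this encoding internally), and the exact request shape sent to Mistral OCR is not shown:

```typescript
// A PDF Buffer is passed to Mistral OCR as base64-encoded data;
// no PDF-to-image conversion step is involved.
function pdfBufferToDataUrl(pdfBuffer: Buffer): string {
  return `data:application/pdf;base64,${pdfBuffer.toString('base64')}`;
}
```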
### Configuration

```typescript
import * as smartai from '@push.rocks/smartai';

const provider = new smartai.MistralProvider({
  mistralToken: 'your-token-here',
  chatModel: 'mistral-large-latest', // default
  ocrModel: 'mistral-ocr-latest', // default
  tableFormat: 'markdown', // 'markdown' or 'html'
});

await provider.start();
```
### Supported Methods

| Method | Support | Notes |
|---|---|---|
| `chat()` | ✅ | Standard chat completion |
| `chatStream()` | ✅ | Streaming chat responses |
| `document()` | ✅ | Native PDF OCR - no image conversion needed |
| `vision()` | ✅ | Image OCR with optional chat analysis |
| `audio()` | ❌ | Not supported - use ElevenLabs |
| `research()` | ❌ | Not supported - use Perplexity |
| `imageGenerate()` | ❌ | Not supported - use OpenAI |
| `imageEdit()` | ❌ | Not supported - use OpenAI |
### Document Processing

The `document()` method uses Mistral OCR to extract text from PDFs, then uses Mistral chat to process the user's query with the extracted content.

```typescript
const result = await provider.document({
  systemMessage: 'You are a document analyst.',
  userMessage: 'Summarize this document.',
  pdfDocuments: [pdfBuffer],
  messageHistory: [],
});
```
### API Key

Tests require `MISTRAL_API_KEY` in `.nogit/env.json`.

### Pricing (as of December 2025)

- OCR: $2 per 1,000 pages ($1 with Batch API) - see the cost helper sketched below
- Chat: Varies by model (see Mistral pricing page)
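For quick budgeting, a tiny helper based on the OCR prices listed above; `estimateOcrCostUsd` is purely illustrative and not part of the library:

```typescript
// $2 per 1,000 pages, or $1 per 1,000 pages when using the Batch API.
function estimateOcrCostUsd(pages: number, useBatchApi = false): number {
  const pricePerThousandPages = useBatchApi ? 1 : 2;
  return (pages / 1000) * pricePerThousandPages;
}

estimateOcrCostUsd(250); // 0.5 (USD)
```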
## Anthropic Extended Thinking Feature

### Overview
The Anthropic provider now supports extended thinking by default across all methods. Extended thinking enables Claude to spend more time reasoning about complex problems before generating responses, leading to higher quality answers for difficult questions.
### Configuration
Extended thinking is configured at the provider level during instantiation:
```typescript
import * as smartai from '@push.rocks/smartai';

const provider = new smartai.AnthropicProvider({
  anthropicToken: 'your-token-here',
  extendedThinking: 'normal', // Options: 'quick' | 'normal' | 'deep' | 'off'
});
```
### Thinking Modes

The `extendedThinking` parameter accepts four modes:

| Mode | Budget Tokens | Use Case |
|---|---|---|
| `'quick'` | 2,048 | Lightweight reasoning for simple queries |
| `'normal'` | 8,000 | Default - Balanced reasoning for most tasks |
| `'deep'` | 16,000 | Complex reasoning for difficult problems |
| `'off'` | 0 | Disable extended thinking |

**Default Behavior:** If `extendedThinking` is not specified, it defaults to `'normal'` mode (8,000 tokens).
### Supported Methods

Extended thinking is automatically applied to all Anthropic provider methods:

- `chat()` - Synchronous chat
- `chatStream()` - Streaming chat
- `vision()` - Image analysis
- `document()` - PDF document processing
- `research()` - Web research with citations
### Token Budget Constraints

**Important:** The thinking budget must be less than `max_tokens` for the API call. The current `max_tokens` values are (an illustrative check follows this list):

- `chatStream()`: 20,000 tokens (sufficient for all modes ✓)
- `chat()`: 20,000 tokens (sufficient for all modes ✓)
- `vision()`: 10,000 tokens (sufficient for all modes ✓)
- `document()`: 20,000 tokens (sufficient for all modes ✓)
- `research()`: 20,000 tokens for all `searchDepth` levels (sufficient ✓)
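A small illustrative check of that constraint; `maxTokensByMethod` and `assertBudgetFits` are hypothetical names used only for this sketch, with values taken from the list above:

```typescript
// max_tokens per provider method, as documented above.
const maxTokensByMethod = {
  chat: 20_000,
  chatStream: 20_000,
  vision: 10_000,
  document: 20_000,
  research: 20_000,
} as const;

// The thinking budget must be strictly below max_tokens for the API call.
function assertBudgetFits(method: keyof typeof maxTokensByMethod, budgetTokens: number): void {
  if (budgetTokens >= maxTokensByMethod[method]) {
    throw new Error(
      `Thinking budget ${budgetTokens} must be less than max_tokens ${maxTokensByMethod[method]} for ${method}()`,
    );
  }
}

assertBudgetFits('vision', 8_000); // passes: 8,000 < 10,000
```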
### Performance and Cost Implications

**Token Usage:**
- You are charged for the full thinking tokens generated, not just the summary
- Higher thinking budgets may result in more thorough reasoning but increased costs
- The budget is a target, not a strict limit - actual usage may vary
**Response Quality:**

- `'quick'`: Fast responses, basic reasoning
- `'normal'`: Good balance between quality and speed (recommended for most use cases)
- `'deep'`: Highest quality reasoning for complex problems, slower responses
**Recommendations:**

- Start with `'normal'` (default) for general usage
- Use `'deep'` for complex analytical tasks, philosophy, mathematics, or research
- Use `'quick'` for simple factual queries where deep reasoning isn't needed
- Use `'off'` only if you want traditional Claude behavior without extended thinking
### Usage Examples

#### Example 1: Default (Normal Mode)

```typescript
const provider = new smartai.AnthropicProvider({
  anthropicToken: process.env.ANTHROPIC_TOKEN,
  // extendedThinking defaults to 'normal'
});

await provider.start();

const response = await provider.chat({
  systemMessage: 'You are a helpful assistant.',
  userMessage: 'Explain the implications of quantum computing.',
  messageHistory: [],
});
```
#### Example 2: Deep Thinking for Complex Analysis

```typescript
const provider = new smartai.AnthropicProvider({
  anthropicToken: process.env.ANTHROPIC_TOKEN,
  extendedThinking: 'deep', // 16,000 token budget
});

await provider.start();

const response = await provider.chat({
  systemMessage: 'You are a philosopher and ethicist.',
  userMessage: 'Analyze the trolley problem from multiple ethical frameworks.',
  messageHistory: [],
});
```
#### Example 3: Quick Mode for Simple Queries

```typescript
const provider = new smartai.AnthropicProvider({
  anthropicToken: process.env.ANTHROPIC_TOKEN,
  extendedThinking: 'quick', // 2,048 token budget
});

await provider.start();

const response = await provider.chat({
  systemMessage: 'You are a helpful assistant.',
  userMessage: 'What is the capital of France?',
  messageHistory: [],
});
```
#### Example 4: Disable Thinking

```typescript
const provider = new smartai.AnthropicProvider({
  anthropicToken: process.env.ANTHROPIC_TOKEN,
  extendedThinking: 'off', // No extended thinking
});

await provider.start();

const response = await provider.chat({
  systemMessage: 'You are a helpful assistant.',
  userMessage: 'Tell me a joke.',
  messageHistory: [],
});
```
#### Example 5: Extended Thinking with Vision

```typescript
import * as fs from 'node:fs';

const provider = new smartai.AnthropicProvider({
  anthropicToken: process.env.ANTHROPIC_TOKEN,
  extendedThinking: 'normal',
});

await provider.start();

const imageBuffer = await fs.promises.readFile('./image.jpg');

const analysis = await provider.vision({
  image: imageBuffer,
  prompt: 'Analyze this image in detail and explain what you see.',
});
```
### Testing

Comprehensive tests for extended thinking are available in:

- `test/test.thinking.anthropic.ts` - Tests all thinking modes

Run tests with:

```bash
pnpm test
```

Run specific thinking tests:

```bash
npx tstest test/test.thinking.anthropic.ts --verbose
```
### API Reference
According to Anthropic's documentation:
- Extended thinking is supported on Claude Sonnet 4.5, 4, 3.7, Haiku 4.5, and Opus 4.1, 4
- The current model used is `claude-sonnet-4-5-20250929`
- Minimum thinking budget is 1,024 tokens
- Thinking budget must be less than `max_tokens`
### Implementation Details

The extended thinking feature is implemented via:

- **Interface:** `IAnthropicProviderOptions.extendedThinking` property
- **Helper Method:** `getThinkingConfig()` private method that maps modes to token budgets
- **API Parameter:** Adds `thinking: { type: 'enabled', budget_tokens: number }` to all API calls
The thinking configuration is applied automatically to all API calls when the provider is instantiated.
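As a rough sketch of the mapping described above (the actual private method may differ in signature and details; `TExtendedThinkingMode` and `IThinkingConfig` are illustrative type names, with budgets taken from the Thinking Modes table):

```typescript
type TExtendedThinkingMode = 'quick' | 'normal' | 'deep' | 'off';

interface IThinkingConfig {
  type: 'enabled';
  budget_tokens: number;
}

// Maps each thinking mode to its documented token budget and builds the
// `thinking` parameter sent with Anthropic API calls; 'off' yields no config.
function getThinkingConfig(mode: TExtendedThinkingMode = 'normal'): IThinkingConfig | undefined {
  const budgets: Record<TExtendedThinkingMode, number> = {
    quick: 2_048,
    normal: 8_000,
    deep: 16_000,
    off: 0,
  };
  const budget = budgets[mode];
  return budget > 0 ? { type: 'enabled', budget_tokens: budget } : undefined;
}
```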