feat(vision): add Qwen3-VL vision model support with Dockerfile and tests; improve invoice OCR conversion and prompts; simplify extraction flow by removing consensus voting
This commit is contained in:
@@ -1,5 +1,13 @@
|
||||
# Changelog
|
||||
|
||||
## 2026-01-18 - 1.10.0 - feat(vision)
|
||||
add Qwen3-VL vision model support with Dockerfile and tests; improve invoice OCR conversion and prompts; simplify extraction flow by removing consensus voting
|
||||
|
||||
- Add Dockerfile_qwen3vl to provide an Ollama-based image for Qwen3-VL and expose the Ollama API on port 11434
|
||||
- Introduce test/test.invoices.qwen3vl.ts and ensureQwen3Vl() helper to pull and test qwen3-vl:8b
|
||||
- Improve PDF->PNG conversion and prompt in ministral3 tests (higher DPI, max quality, sharpen) and increase num_predict from 512 to 1024
|
||||
- Simplify extraction pipeline: remove consensus voting, log single-pass results, and simplify OCR HTML sanitization/truncation logic
|
||||
|
||||
## 2026-01-18 - 1.9.0 - feat(tests)
|
||||
add Ministral 3 vision tests and improve invoice extraction pipeline to use Ollama chat schema, sanitization, and multi-page support
|
||||
|
||||
|
||||
Reference in New Issue
Block a user