feat(vision): add Qwen3-VL vision model support with Dockerfile and tests; improve invoice OCR conversion and prompts; simplify extraction flow by removing consensus voting

2026-01-18 03:35:05 +00:00
parent d237ad19f4
commit 3780105c6f
6 changed files with 435 additions and 70 deletions
--- a/changelog.md
+++ b/changelog.md
@@ -1,5 +1,13 @@
 # Changelog

+## 2026-01-18 - 1.10.0 - feat(vision)
+add Qwen3-VL vision model support with Dockerfile and tests; improve invoice OCR conversion and prompts; simplify extraction flow by removing consensus voting
+
+- Add Dockerfile_qwen3vl to provide an Ollama-based image for Qwen3-VL and expose the Ollama API on port 11434
+- Introduce test/test.invoices.qwen3vl.ts and ensureQwen3Vl() helper to pull and test qwen3-vl:8b
+- Improve PDF->PNG conversion and prompt in ministral3 tests (higher DPI, max quality, sharpen) and increase num_predict from 512 to 1024
+- Simplify extraction pipeline: remove consensus voting, log single-pass results, and simplify OCR HTML sanitization/truncation logic
+
 ## 2026-01-18 - 1.9.0 - feat(tests)
 add Ministral 3 vision tests and improve invoice extraction pipeline to use Ollama chat schema, sanitization, and multi-page support