feat(tests): use Qwen2.5 (Ollama) for invoice extraction tests and add helpers for model management; normalize dates and coerce numeric fields

2026-01-17 21:50:09 +00:00
parent 311e7a8fd4
commit 30c73b24c1
3 changed files with 165 additions and 34 deletions
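
Note on the date and numeric handling named in this commit: the changelog mentions normalizeDate, numeric coercion of extracted fields, and a tolerance-aware compareInvoice, but the diff shown here does not include their bodies. The following is only a minimal sketch of how such helpers could look. The field names (invoice_number, invoice_date, total_amount), the coerceNumber and parseInvoice helpers, the day-first date assumption, the dot-as-decimal assumption, and the 0.01 tolerance are all hypothetical and not taken from the repository.

```typescript
// Hypothetical sketch: function names follow the changelog where it gives them,
// but every body, field name and default below is an assumption.

interface Invoice {
  invoice_number: string;
  invoice_date: string; // ISO YYYY-MM-DD after normalization
  total_amount: number;
}

/** Normalize a few common date spellings to YYYY-MM-DD; unknown formats are returned trimmed. */
function normalizeDate(raw: string): string {
  const s = raw.trim();
  // Already ISO, optionally followed by a time component.
  const iso = /^(\d{4})-(\d{2})-(\d{2})/.exec(s);
  if (iso) return `${iso[1]}-${iso[2]}-${iso[3]}`;
  // Assumed day-first input such as 17.01.2026 or 17/01/2026.
  const dmy = /^(\d{1,2})[./](\d{1,2})[./](\d{4})$/.exec(s);
  if (dmy) return `${dmy[3]}-${dmy[2].padStart(2, "0")}-${dmy[1].padStart(2, "0")}`;
  return s;
}

/** Coerce a model-returned value to a number, assuming '.' is the decimal separator. */
function coerceNumber(value: unknown): number {
  if (typeof value === "number") return value;
  if (typeof value !== "string") return NaN;
  // Drop thousands separators, then anything that is not a digit, dot or minus sign.
  const cleaned = value.replace(/,/g, "").replace(/[^\d.-]/g, "");
  return Number.parseFloat(cleaned);
}

/** Parse the model's strict-JSON reply and post-process dates and numbers. */
function parseInvoice(modelOutput: string): Invoice {
  const obj = JSON.parse(modelOutput) as Record<string, unknown>;
  return {
    invoice_number: String(obj.invoice_number ?? "").trim(),
    invoice_date: normalizeDate(String(obj.invoice_date ?? "")),
    total_amount: coerceNumber(obj.total_amount),
  };
}

/** Return the names of fields that differ; amounts match within `tolerance`. */
function compareInvoice(actual: Invoice, expected: Invoice, tolerance = 0.01): string[] {
  const mismatches: string[] = [];
  if (actual.invoice_number.trim() !== expected.invoice_number.trim()) {
    mismatches.push("invoice_number");
  }
  if (normalizeDate(actual.invoice_date) !== normalizeDate(expected.invoice_date)) {
    mismatches.push("invoice_date");
  }
  // Written so that NaN (a failed coercion) counts as a mismatch.
  if (!(Math.abs(actual.total_amount - expected.total_amount) <= tolerance)) {
    mismatches.push("total_amount");
  }
  return mismatches;
}
```

Returning the list of mismatched field names, rather than a boolean, is one way to keep test failure messages specific; the actual test may report differences another way.
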
--- a/changelog.md
+++ b/changelog.md
@@ -1,5 +1,15 @@
 # Changelog

+## 2026-01-17 - 1.7.0 - feat(tests)
+use Qwen2.5 (Ollama) for invoice extraction tests and add helpers for model management; normalize dates and coerce numeric fields
+
+- Added ensureOllamaModel and ensureQwen25 test helpers to pull/check Ollama models via localhost:11434
+- Updated invoices test to use qwen2.5:7b instead of MiniCPM and removed image payload from the text-only extraction step
+- Increased Markdown truncation limit from 8000 to 12000 and reduced model num_predict from 2048 to 512
+- Rewrote extraction prompt to require strict JSON output and added post-processing to parse/convert numeric fields
+- Added normalizeDate and improved compareInvoice to normalize dates and handle numeric formatting/tolerance
+- Updated test setup to ensure Qwen2.5 is available and adjusted logging/messages to reflect the Qwen2.5-based workflow
+
 ## 2026-01-17 - 1.6.0 - feat(paddleocr-vl)
 add PaddleOCR-VL full pipeline Docker image and API server, plus integration tests and docker helpers