fix(extraction): improve JSON extraction prompts and model options for invoice and bank statement tests
@@ -1,5 +1,14 @@
# Changelog

## 2026-01-19 - 1.14.1 - fix(extraction)
improve JSON extraction prompts and model options for invoice and bank statement tests

- Refactor JSON extraction prompts to be sent after the document text and add explicit 'WHERE TO FIND DATA' and 'RULES' sections for clearer extraction guidance
- Change chat message flow to: send document, assistant acknowledgement, then the JSON extraction prompt (avoids concatenating large prompts into one message)
- Add model options (num_ctx: 32768, temperature: 0) for a larger context window and more deterministic JSON output
- Simplify logging to avoid printing full prompt contents; log document and prompt lengths instead
- Increase timeouts for large documents to 600000ms (10 minutes) where applicable
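
The message flow, model options, and timeout listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: it assumes an Ollama-style `/api/chat` endpoint, and the names `buildExtractionRequest`, `extractJson`, `baseUrl`, and the acknowledgement wording are hypothetical. Only `num_ctx: 32768`, `temperature: 0`, and the 600000 ms timeout come from the entry itself.

```typescript
// Hypothetical types and names for illustration only.
type ChatMessage = { role: "user" | "assistant"; content: string };

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  stream: boolean;
  options: { num_ctx: number; temperature: number };
}

// 10 minutes, as stated in the changelog entry.
const EXTRACTION_TIMEOUT_MS = 600_000;

// Build the three-step conversation: document, assistant acknowledgement,
// then the JSON extraction prompt as a separate message.
function buildExtractionRequest(
  model: string,
  documentText: string,
  jsonPrompt: string,
): ChatRequest {
  return {
    model,
    messages: [
      { role: "user", content: documentText },
      // Assumed wording; the actual acknowledgement text is not in the changelog.
      { role: "assistant", content: "Document received. What should I extract?" },
      { role: "user", content: jsonPrompt },
    ],
    stream: false,
    options: { num_ctx: 32768, temperature: 0 },
  };
}

// Send the request, logging lengths rather than full contents.
async function extractJson(
  baseUrl: string,
  model: string,
  documentText: string,
  jsonPrompt: string,
): Promise<unknown> {
  const body = buildExtractionRequest(model, documentText, jsonPrompt);
  console.log(
    `document length: ${documentText.length}, prompt length: ${jsonPrompt.length}`,
  );

  // Abort the request if it runs past the timeout.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), EXTRACTION_TIMEOUT_MS);
  try {
    const res = await fetch(`${baseUrl}/api/chat`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
      signal: controller.signal,
    });
    if (!res.ok) throw new Error(`chat request failed: ${res.status}`);
    // Assumes a non-streaming response carrying the reply in message.content.
    const data = (await res.json()) as { message: { content: string } };
    return JSON.parse(data.message.content);
  } finally {
    clearTimeout(timer);
  }
}
```

Keeping the request builder separate from the network call makes the message order easy to check in a test without a running model server.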
## 2026-01-19 - 1.14.0 - feat(docker-images)
add vLLM-based Nanonets-OCR2-3B image, Qwen3-VL Ollama image and refactor build/docs/tests to use new runtime/layout