ht-docker-ai

Author	SHA1	Message	Date
Juergen Kunz	969d21c51a	fix(tests): enable progress events in invoice tests and bump @push.rocks/smartagent devDependency to ^1.5.4	2026-01-20 03:19:58 +00:00
Juergen Kunz	9bc1f74978	feat(test): enable native tool calling for GPT-OSS invoice extraction: update smartai to v0.13.2 (native tool calling support); update smartagent to v1.5.1 (useNativeToolCalling option); enable think: true for GPT-OSS reasoning mode in Ollama config; enable useNativeToolCalling: true in DualAgentOrchestrator; simplify driver system message (native tools don't need XML instructions). Native tool calling uses Ollama's built-in Harmony format parser instead of requiring XML generation, which is more efficient for GPT-OSS models.	2026-01-20 02:51:52 +00:00
Juergen Kunz	77d57e80bd	feat(tests): integrate SmartAi/DualAgentOrchestrator into extraction tests and add JSON self-validation	2026-01-20 01:17:41 +00:00
Juergen Kunz	d8bdb18841	fix(test): add JSON validation and retry logic to invoice extraction: add tryExtractJson function to validate JSON before accepting; use orchestrator.continueTask() to request correction if JSON is invalid; retry up to 2 times for malformed JSON responses; remove duplicate parseJsonToInvoice function	2026-01-20 00:45:30 +00:00
Juergen Kunz	d384c1d79b	feat(tests): integrate smartagent DualAgentOrchestrator with streaming support: update test.invoices.nanonets.ts to use DualAgentOrchestrator for JSON extraction; enable streaming token callback for real-time progress visibility; add markdown caching to avoid re-running Nanonets OCR for cached files; update test.bankstatements.minicpm.ts and test.invoices.minicpm.ts with streaming; update dependencies to @push.rocks/smartai@0.11.1 and @push.rocks/smartagent@1.2.8	2026-01-20 00:39:36 +00:00
Juergen Kunz	09770d3177	fix(extraction): improve JSON extraction prompts and model options for invoice and bank statement tests	2026-01-19 21:19:37 +00:00
Juergen Kunz	08728ada4d	feat(docker-images): add vLLM-based Nanonets-OCR2-3B image, Qwen3-VL Ollama image and refactor build/docs/tests to use new runtime/layout	2026-01-19 21:05:51 +00:00
Juergen Kunz	b58bcabc76	update	2026-01-19 11:51:23 +00:00
Juergen Kunz	ae28a64902	fix(tests): stabilize OCR extraction tests and manage GPU containers	2026-01-18 23:00:24 +00:00
Juergen Kunz	09ea7440e8	update	2026-01-18 15:54:16 +00:00
Juergen Kunz	d91df70fff	feat(tests): revamp tests and remove legacy Dockerfiles: adopt JSON/consensus workflows, switch MiniCPM model, and delete deprecated Docker/test variants	2026-01-18 13:56:46 +00:00
Juergen Kunz	76b21f1f7b	feat(tests): switch vision tests to multi-query extraction (count then per-row/field queries) and add logging/summaries	2026-01-18 11:26:38 +00:00
Juergen Kunz	e76768da55	feat(vision): process pages separately and make Qwen3-VL vision extraction more robust; add per-page parsing, safer JSON handling, reduced token usage, and multi-query invoice extraction	2026-01-18 04:50:57 +00:00
Juergen Kunz	63d72a52c9	update	2026-01-18 04:28:57 +00:00
Juergen Kunz	7c8f10497e	fix(tests): improve Qwen3-VL invoice extraction test by switching to non-stream API, adding model availability/pull checks, simplifying response parsing, and tightening model options	2026-01-18 04:17:30 +00:00
Juergen Kunz	3780105c6f	feat(vision): add Qwen3-VL vision model support with Dockerfile and tests; improve invoice OCR conversion and prompts; simplify extraction flow by removing consensus voting	2026-01-18 03:35:05 +00:00
Juergen Kunz	7652a2df52	feat(tests): add Ministral 3 vision tests and improve invoice extraction pipeline to use Ollama chat schema, sanitization, and multi-page support	2026-01-18 02:53:24 +00:00
Juergen Kunz	f0d88fcbe0	feat(paddleocr-vl): add structured HTML output and table parsing for PaddleOCR-VL, update API, tests, and README	2026-01-18 00:11:17 +00:00
Juergen Kunz	5a311dca2d	fix(docker): standardize Dockerfile and entrypoint filenames; add GPU-specific Dockerfiles and update build and test references	2026-01-17 23:13:47 +00:00
Juergen Kunz	30c73b24c1	feat(tests): use Qwen2.5 (Ollama) for invoice extraction tests and add helpers for model management; normalize dates and coerce numeric fields	2026-01-17 21:50:09 +00:00
Juergen Kunz	80e6866442	feat(paddleocr-vl): add PaddleOCR-VL full pipeline Docker image and API server, plus integration tests and docker helpers	2026-01-17 20:22:23 +00:00
Juergen Kunz	0482c35b69	feat(paddleocr-vl): add PaddleOCR-VL GPU Dockerfile, pin vllm, update CPU image deps, and improve entrypoint and tests	2026-01-17 16:57:26 +00:00
Juergen Kunz	15ac1fcf67	update	2026-01-16 16:21:44 +00:00
Juergen Kunz	82358b2d5d	feat(invoices): add hybrid OCR + vision invoice/document parsing with PaddleOCR, consensus voting, and prompt/test refactors	2026-01-16 14:24:37 +00:00
Juergen Kunz	bec379e9ca	feat(paddleocr): add PaddleOCR OCR service (Docker images, server, tests, docs) and CI workflows	2026-01-16 13:23:01 +00:00
Juergen Kunz	379b5c19eb	feat(ocr): add PaddleOCR GPU Docker image and FastAPI OCR server with entrypoint; implement OCR endpoints and consensus extraction testing	2026-01-16 10:22:15 +00:00
Juergen Kunz	3dc1881d8b	update	2026-01-16 03:58:39 +00:00

27 Commits