| 08728ada4d | feat(docker-images): add vLLM-based Nanonets-OCR2-3B image, Qwen3-VL Ollama image and refactor build/docs/tests to use new runtime/layout | 2026-01-19 21:05:51 +00:00 |
| ae28a64902 | fix(tests): stabilize OCR extraction tests and manage GPU containers | 2026-01-18 23:00:24 +00:00 |
| 17ea7717eb | fix(image_support_files): remove PaddleOCR-VL server scripts from image_support_files | 2026-01-18 13:58:26 +00:00 |
| d91df70fff | feat(tests): revamp tests and remove legacy Dockerfiles: adopt JSON/consensus workflows, switch MiniCPM model, and delete deprecated Docker/test variants | 2026-01-18 13:56:46 +00:00 |
| 76b21f1f7b | feat(tests): switch vision tests to multi-query extraction (count then per-row/field queries) and add logging/summaries | 2026-01-18 11:26:38 +00:00 |
| e76768da55 | feat(vision): process pages separately and make Qwen3-VL vision extraction more robust; add per-page parsing, safer JSON handling, reduced token usage, and multi-query invoice extraction | 2026-01-18 04:50:57 +00:00 |
| 7c8f10497e | fix(tests): improve Qwen3-VL invoice extraction test by switching to non-stream API, adding model availability/pull checks, simplifying response parsing, and tightening model options | 2026-01-18 04:17:30 +00:00 |
| 3780105c6f | feat(vision): add Qwen3-VL vision model support with Dockerfile and tests; improve invoice OCR conversion and prompts; simplify extraction flow by removing consensus voting | 2026-01-18 03:35:05 +00:00 |
| 7652a2df52 | feat(tests): add Ministral 3 vision tests and improve invoice extraction pipeline to use Ollama chat schema, sanitization, and multi-page support | 2026-01-18 02:53:24 +00:00 |
| f0d88fcbe0 | feat(paddleocr-vl): add structured HTML output and table parsing for PaddleOCR-VL, update API, tests, and README | 2026-01-18 00:11:17 +00:00 |
| 5a311dca2d | fix(docker): standardize Dockerfile and entrypoint filenames; add GPU-specific Dockerfiles and update build and test references | 2026-01-17 23:13:47 +00:00 |
| 30c73b24c1 | feat(tests): use Qwen2.5 (Ollama) for invoice extraction tests and add helpers for model management; normalize dates and coerce numeric fields | 2026-01-17 21:50:09 +00:00 |
| 80e6866442 | feat(paddleocr-vl): add PaddleOCR-VL full pipeline Docker image and API server, plus integration tests and docker helpers | 2026-01-17 20:22:23 +00:00 |
| 0482c35b69 | feat(paddleocr-vl): add PaddleOCR-VL GPU Dockerfile, pin vllm, update CPU image deps, and improve entrypoint and tests | 2026-01-17 16:57:26 +00:00 |
| 82358b2d5d | feat(invoices): add hybrid OCR + vision invoice/document parsing with PaddleOCR, consensus voting, and prompt/test refactors | 2026-01-16 14:24:37 +00:00 |
| bec379e9ca | feat(paddleocr): add PaddleOCR OCR service (Docker images, server, tests, docs) and CI workflows | 2026-01-16 13:23:01 +00:00 |
| ae4bb26931 | feat(paddleocr): add PaddleOCR support: Docker images, FastAPI server, entrypoint and tests | 2026-01-16 10:23:32 +00:00 |
| 379b5c19eb | feat(ocr): add PaddleOCR GPU Docker image and FastAPI OCR server with entrypoint; implement OCR endpoints and consensus extraction testing | 2026-01-16 10:22:15 +00:00 |