9 Commits

SHA1 Message Date
f0d88fcbe0 feat(paddleocr-vl): add structured HTML output and table parsing for PaddleOCR-VL, update API, tests, and README 2026-01-18 00:11:17 +00:00
5a311dca2d fix(docker): standardize Dockerfile and entrypoint filenames; add GPU-specific Dockerfiles and update build and test references 2026-01-17 23:13:47 +00:00
30c73b24c1 feat(tests): use Qwen2.5 (Ollama) for invoice extraction tests and add helpers for model management; normalize dates and coerce numeric fields 2026-01-17 21:50:09 +00:00
80e6866442 feat(paddleocr-vl): add PaddleOCR-VL full pipeline Docker image and API server, plus integration tests and docker helpers 2026-01-17 20:22:23 +00:00
0482c35b69 feat(paddleocr-vl): add PaddleOCR-VL GPU Dockerfile, pin vllm, update CPU image deps, and improve entrypoint and tests 2026-01-17 16:57:26 +00:00
82358b2d5d feat(invoices): add hybrid OCR + vision invoice/document parsing with PaddleOCR, consensus voting, and prompt/test refactors 2026-01-16 14:24:37 +00:00
bec379e9ca feat(paddleocr): add PaddleOCR OCR service (Docker images, server, tests, docs) and CI workflows 2026-01-16 13:23:01 +00:00
ae4bb26931 feat(paddleocr): add PaddleOCR support: Docker images, FastAPI server, entrypoint and tests 2026-01-16 10:23:32 +00:00
379b5c19eb feat(ocr): add PaddleOCR GPU Docker image and FastAPI OCR server with entrypoint; implement OCR endpoints and consensus extraction testing 2026-01-16 10:22:15 +00:00