# Changelog

## 2026-01-16 - 1.4.0 - feat(invoices)
add hybrid OCR + vision invoice/document parsing with PaddleOCR, consensus voting, and prompt/test refactors

- Add hybrid pipeline documentation and examples (PaddleOCR + MiniCPM-V) and architecture diagram in recipes/document.md
- Integrate PaddleOCR: new OCR extraction functions and OCR-only prompt flow in test/test.node.ts
- Add consensus voting and parallel-pass optimization to improve reliability (multiple passes, hashing, and majority voting)
- Refactor prompts and tests: introduce /nothink token, OCR truncation limits, separate visual and OCR-only prompts, and improved prompt building in test/test.invoices.ts
- Update image conversion defaults (200 DPI, filename change) and add TypeScript helper functions for extraction and consensus handling

## 2026-01-16 - 1.3.0 - feat(paddleocr)
add PaddleOCR OCR service (Docker images, server, tests, docs) and CI workflows

- Add GPU and CPU PaddleOCR Dockerfiles; pin paddlepaddle/paddle and paddleocr to stable 2.x and install libgomp1 for CPU builds
- Avoid pre-downloading OCR models at build-time to prevent build-time segfaults; models are downloaded on first run
- Refactor PaddleOCR FastAPI server: respect CUDA_VISIBLE_DEVICES, support per-request language, cache default language instance and create temporary instances for other languages
- Add comprehensive tests (test.paddleocr.ts) and improve invoice extraction tests (parallelize passes, JSON OCR API usage, prioritize certain test cases)
- Add Gitea CI workflows for tag and non-tag Docker runs and release pipeline (docker build/push, metadata trigger)
- Update documentation (readme.hints.md) with PaddleOCR usage and add docker registry entry to npmextra.json

## 2026-01-16 - 1.2.0 - feat(paddleocr)
add PaddleOCR support: Docker images, FastAPI server, entrypoint and tests

- Add PaddleOCR FastAPI server implementation at image_support_files/paddleocr_server.py
- Remove old image_support_files/paddleocr-server.py and update entrypoint to import paddleocr_server:app
- Extend build-images.sh to build paddleocr (GPU) and paddleocr-cpu images and list them
- Extend test-images.sh to add paddleocr health/OCR tests, new test_paddleocr_image function, port config, and cleanup; rename test_image -> test_minicpm_image

## 2026-01-16 - 1.1.0 - feat(ocr)
add PaddleOCR GPU Docker image and FastAPI OCR server with entrypoint; implement OCR endpoints and consensus extraction testing

- Add Dockerfile_paddleocr for GPU-accelerated PaddleOCR image (pre-downloads PP-OCRv4 models, exposes port 5000, healthcheck, entrypoint)
- Add image_support_files/paddleocr-server.py: FastAPI app providing /ocr (base64), /ocr/upload (file), and /health endpoints; model warm-up on startup; structured JSON responses and error handling
- Add image_support_files/paddleocr-entrypoint.sh to configure environment, detect GPU/CPU mode, and launch uvicorn
- Update test/test.node.ts to replace streaming extraction with a consensus-based extraction flow (multiple passes, hashing of results, majority voting) and improve logging/prompt text
- Add test/test.invoices.ts: integration tests for invoice extraction that call PaddleOCR, build prompts with optional OCR text, run consensus extraction, and produce a summary report

## 2026-01-16 - 1.0.0 - initial release
Initial project files added with two small follow-up updates.

- initial: base project commit.
- update: two minor follow-up updates refining the initial commit.
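
## Appendix: consensus extraction sketch

The 1.1.0 and 1.4.0 entries describe the consensus flow only as "multiple passes, hashing of results, majority voting", run in parallel. A minimal sketch of that idea follows. The names `InvoiceFields`, `hashResult`, and `consensusExtract`, the two example fields, the choice of SHA-256, and the strict-majority threshold are all assumptions made for illustration, not the code in test/test.node.ts.

```typescript
import { createHash } from "node:crypto";

// Hypothetical result shape; the changelog does not list the real fields.
interface InvoiceFields {
  invoiceNumber: string;
  total: number;
}

// Hash a flat result with its keys sorted, so key order does not change the hash.
function hashResult(result: InvoiceFields): string {
  const canonical = JSON.stringify(result, Object.keys(result).sort());
  return createHash("sha256").update(canonical).digest("hex");
}

// Run `passes` extraction passes in parallel and return the result that more
// than half of them agree on, or null when there is no strict majority.
async function consensusExtract(
  extractOnce: (pass: number) => Promise<InvoiceFields>,
  passes = 3,
): Promise<InvoiceFields | null> {
  const results = await Promise.all(
    Array.from({ length: passes }, (_, i) => extractOnce(i)),
  );

  // Group identical results by hash and count the votes for each group.
  const votes = new Map<string, { result: InvoiceFields; count: number }>();
  for (const result of results) {
    const key = hashResult(result);
    const entry = votes.get(key) ?? { result, count: 0 };
    entry.count += 1;
    votes.set(key, entry);
  }

  for (const { result, count } of Array.from(votes.values())) {
    if (count > passes / 2) return result;
  }
  return null;
}
```

In this sketch `extractOnce` stands in for whatever performs a single model pass. Returning `null` when no majority exists leaves the caller free to retry or report a failure.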