feat(paddleocr): add PaddleOCR OCR service (Docker images, server, tests, docs) and CI workflows

This commit is contained in:
2026-01-16 13:23:01 +00:00
parent 67c38eeb67
commit bec379e9ca
10 changed files with 624 additions and 71 deletions

View File

@@ -1,5 +1,15 @@
# Changelog
## 2026-01-16 - 1.3.0 - feat(paddleocr)
add PaddleOCR OCR service (Docker images, server, tests, docs) and CI workflows
- Add GPU and CPU PaddleOCR Dockerfiles; pin paddlepaddle/paddle and paddleocr to stable 2.x and install libgomp1 for CPU builds
- Avoid pre-downloading OCR models at build-time to prevent build-time segfaults; models are downloaded on first run
- Refactor PaddleOCR FastAPI server: respect CUDA_VISIBLE_DEVICES, support per-request language, cache default language instance and create temporary instances for other languages
- Add comprehensive tests (test.paddleocr.ts) and improve invoice extraction tests (parallelize passes, JSON OCR API usage, prioritize certain test cases)
- Add Gitea CI workflows for tag and non-tag Docker runs and release pipeline (docker build/push, metadata trigger)
- Update documentation (readme.hints.md) with PaddleOCR usage and add docker registry entry to npmextra.json
## 2026-01-16 - 1.2.0 - feat(paddleocr)
add PaddleOCR support: Docker images, FastAPI server, entrypoint and tests