
@host.today/ht-docker-ai

Docker images for AI vision-language models, starting with MiniCPM-V 4.5.

Overview

This project provides ready-to-use Docker images for running state-of-the-art AI vision-language models. The images are built on Ollama, which handles model management and exposes a consistent REST API.

Available Images

Tag             Description                      Requirements
minicpm45v      MiniCPM-V 4.5 with GPU support   NVIDIA GPU, 9-18 GB VRAM
minicpm45v-cpu  MiniCPM-V 4.5, CPU-only          8 GB+ RAM
latest          Alias for minicpm45v             NVIDIA GPU

Quick Start

docker run -d \
  --name minicpm \
  --gpus all \
  -p 11434:11434 \
  -v ollama-data:/root/.ollama \
  code.foss.global/host.today/ht-docker-ai:minicpm45v

CPU Only

docker run -d \
  --name minicpm \
  -p 11434:11434 \
  -v ollama-data:/root/.ollama \
  code.foss.global/host.today/ht-docker-ai:minicpm45v-cpu

API Usage

The container exposes the Ollama API on port 11434.

List Available Models

curl http://localhost:11434/api/tags

Generate Text from Image

curl http://localhost:11434/api/generate -d '{
  "model": "minicpm-v",
  "prompt": "What do you see in this image?",
  "images": ["<base64-encoded-image>"]
}'
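The images field expects raw base64 (no data: URI prefix and no line wrapping). One way to encode a local file and post it, sketched here with the placeholder filename photo.jpg:

```shell
#!/usr/bin/env sh
# Encode a local image as single-line base64 and send it to the
# generate endpoint. "photo.jpg" is a placeholder filename.
IMG=$(base64 < photo.jpg | tr -d '\n')
curl http://localhost:11434/api/generate -d "{
  \"model\": \"minicpm-v\",
  \"prompt\": \"What do you see in this image?\",
  \"images\": [\"$IMG\"]
}"
```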

Chat with Vision

curl http://localhost:11434/api/chat -d '{
  "model": "minicpm-v",
  "messages": [
    {
      "role": "user",
      "content": "Describe this image in detail",
      "images": ["<base64-encoded-image>"]
    }
  ]
}'

Environment Variables

Variable        Default    Description
MODEL_NAME      minicpm-v  Model to pull on startup
OLLAMA_HOST     0.0.0.0    Interface the API binds to
OLLAMA_ORIGINS  *          Allowed CORS origins
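These can be overridden with -e at container start. For example, to pull a different Ollama model tag on startup (the tag is illustrative only; any model available in the Ollama registry should work):

```shell
docker run -d \
  --name minicpm \
  --gpus all \
  -p 11434:11434 \
  -e MODEL_NAME=minicpm-v \
  -v ollama-data:/root/.ollama \
  code.foss.global/host.today/ht-docker-ai:minicpm45v
```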

Hardware Requirements

GPU Variant (minicpm45v)

  • NVIDIA GPU with CUDA support
  • Minimum 9GB VRAM (int4 quantized)
  • Recommended 18GB VRAM (full precision)
  • NVIDIA Container Toolkit installed

CPU Variant (minicpm45v-cpu)

  • Minimum 8GB RAM
  • Recommended 16GB+ RAM for better performance
  • No GPU required

Model Information

MiniCPM-V 4.5 is a GPT-4o level multimodal large language model developed by OpenBMB.

  • Parameters: 8B (Qwen3-8B + SigLIP2-400M)
  • Capabilities: Image understanding, OCR, multi-image analysis
  • Languages: 30+ languages including English, Chinese, French, Spanish

Docker Compose Example

version: '3.8'
services:
  minicpm:
    image: code.foss.global/host.today/ht-docker-ai:minicpm45v
    container_name: minicpm
    ports:
      - "11434:11434"
    volumes:
      - ollama-data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  ollama-data:
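With that file saved as docker-compose.yml, bringing the stack up and checking it might look like:

```shell
docker compose up -d
docker compose logs -f minicpm        # watch the model pull progress
curl http://localhost:11434/api/tags  # confirm the API is reachable
```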

Building Locally

# Clone the repository
git clone https://code.foss.global/host.today/ht-docker-ai.git
cd ht-docker-ai

# Build all images
./build-images.sh

# Run tests
./test-images.sh

License

MIT - Task Venture Capital GmbH
