# @host.today/ht-docker-ai

Docker images for AI vision-language models, starting with MiniCPM-V 4.5.

## Overview
This project provides ready-to-use Docker containers for running state-of-the-art AI vision-language models. Built on Ollama for simplified model management and a consistent REST API.
## Available Images

| Tag | Description | Requirements |
|---|---|---|
| `minicpm45v` | MiniCPM-V 4.5 with GPU support | NVIDIA GPU, 9-18GB VRAM |
| `minicpm45v-cpu` | MiniCPM-V 4.5, CPU-only | 8GB+ RAM |
| `latest` | Alias for `minicpm45v` | NVIDIA GPU |
## Quick Start

### GPU (Recommended)

```bash
docker run -d \
  --name minicpm \
  --gpus all \
  -p 11434:11434 \
  -v ollama-data:/root/.ollama \
  code.foss.global/host.today/ht-docker-ai:minicpm45v
```
### CPU Only

```bash
docker run -d \
  --name minicpm \
  -p 11434:11434 \
  -v ollama-data:/root/.ollama \
  code.foss.global/host.today/ht-docker-ai:minicpm45v-cpu
```
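On first start the container pulls the configured model (see `MODEL_NAME` below), which can take a few minutes. A quick way to follow progress and confirm the API is reachable:

```bash
# Watch the startup logs until the model pull completes
docker logs -f minicpm

# In another shell, confirm the API responds
curl http://localhost:11434/api/tags
```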
## API Usage

The container exposes the Ollama API on port 11434.

### List Available Models

```bash
curl http://localhost:11434/api/tags
```
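If you have `jq` installed, you can extract just the model names from the response:

```bash
# List only the names of locally available models
curl -s http://localhost:11434/api/tags | jq '.models[].name'
```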
### Generate Text from Image

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "minicpm-v",
  "prompt": "What do you see in this image?",
  "images": ["<base64-encoded-image>"]
}'
```
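The `images` field expects raw base64 data (no `data:` URI prefix). As a sketch, assuming a local file named `photo.jpg`, you can encode it with the standard `base64` tool and splice it into the request:

```bash
# Encode a local image (hypothetical file name) and send it in one request
IMG=$(base64 -w0 photo.jpg)   # on macOS use: base64 -i photo.jpg
curl http://localhost:11434/api/generate -d '{
  "model": "minicpm-v",
  "prompt": "What do you see in this image?",
  "images": ["'"$IMG"'"]
}'
```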
### Chat with Vision

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "minicpm-v",
  "messages": [
    {
      "role": "user",
      "content": "Describe this image in detail",
      "images": ["<base64-encoded-image>"]
    }
  ]
}'
```
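By default the Ollama chat endpoint streams its reply as newline-delimited JSON chunks. If you prefer a single JSON object, set `"stream": false`; for example, using `jq` to pull out just the reply text:

```bash
# Ask for a single, non-streaming response and print only the assistant text
curl -s http://localhost:11434/api/chat -d '{
  "model": "minicpm-v",
  "stream": false,
  "messages": [
    {"role": "user", "content": "Describe this image in detail", "images": ["<base64-encoded-image>"]}
  ]
}' | jq -r '.message.content'
```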
## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `MODEL_NAME` | `minicpm-v` | Model to pull on startup |
| `OLLAMA_HOST` | `0.0.0.0` | Host address for the API |
| `OLLAMA_ORIGINS` | `*` | Allowed CORS origins |
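These can be overridden with `-e` flags on `docker run`. A minimal sketch, restricting CORS to a single hypothetical origin:

```bash
docker run -d \
  --name minicpm \
  --gpus all \
  -p 11434:11434 \
  -e OLLAMA_ORIGINS="https://app.example.com" \
  -v ollama-data:/root/.ollama \
  code.foss.global/host.today/ht-docker-ai:minicpm45v
```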
## Hardware Requirements

### GPU Variant (`minicpm45v`)

- NVIDIA GPU with CUDA support
- Minimum 9GB VRAM (int4 quantized)
- Recommended 18GB VRAM (full precision)
- NVIDIA Container Toolkit installed (see the check below)
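If you are unsure whether the NVIDIA Container Toolkit is set up correctly, a common sanity check (assuming a CUDA base image is available from Docker Hub) is to run `nvidia-smi` inside a throwaway container:

```bash
# Should print your GPU(s); if it fails, Docker cannot see the GPU
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```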
### CPU Variant (`minicpm45v-cpu`)

- Minimum 8GB RAM
- Recommended 16GB+ RAM for better performance
- No GPU required
## Model Information

MiniCPM-V 4.5 is a GPT-4o-level multimodal large language model developed by OpenBMB.

- Parameters: 8B (Qwen3-8B + SigLIP2-400M)
- Capabilities: Image understanding, OCR, multi-image analysis
- Languages: 30+ languages including English, Chinese, French, Spanish
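For instance, since OCR is one of the listed capabilities, a request of the same shape as the generate example above (with a hypothetical base64-encoded scan) asks the model to transcribe text from an image:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "minicpm-v",
  "prompt": "Transcribe all text visible in this image.",
  "images": ["<base64-encoded-image>"]
}'
```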
## Docker Compose Example

```yaml
version: '3.8'

services:
  minicpm:
    image: code.foss.global/host.today/ht-docker-ai:minicpm45v
    container_name: minicpm
    ports:
      - "11434:11434"
    volumes:
      - ollama-data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  ollama-data:
```
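Save this as `docker-compose.yml` and bring the service up in the background:

```bash
docker compose up -d
```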
## Building Locally

```bash
# Clone the repository
git clone https://code.foss.global/host.today/ht-docker-ai.git
cd ht-docker-ai

# Build all images
./build-images.sh

# Run tests
./test-images.sh
```
## License

MIT - Task Venture Capital GmbH