# @host.today/ht-docker-ai
Docker images for AI vision-language models, starting with MiniCPM-V 4.5.
## Overview
This project provides ready-to-use Docker containers for running state-of-the-art AI vision-language models. The images are built on Ollama, which provides simplified model management and a consistent REST API.
## Available Images
| Tag | Description | Requirements |
|-----|-------------|--------------|
| `minicpm45v` | MiniCPM-V 4.5 with GPU support | NVIDIA GPU, 9-18GB VRAM |
| `minicpm45v-cpu` | MiniCPM-V 4.5 CPU-only | 8GB+ RAM |
| `latest` | Alias for `minicpm45v` | NVIDIA GPU |
## Quick Start
### GPU (Recommended)
```bash
docker run -d \
  --name minicpm \
  --gpus all \
  -p 11434:11434 \
  -v ollama-data:/root/.ollama \
  code.foss.global/host.today/ht-docker-ai:minicpm45v
```
### CPU Only
```bash
docker run -d \
  --name minicpm \
  -p 11434:11434 \
  -v ollama-data:/root/.ollama \
  code.foss.global/host.today/ht-docker-ai:minicpm45v-cpu
```
## API Usage
The container exposes the Ollama API on port 11434.
### List Available Models
```bash
curl http://localhost:11434/api/tags
```
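To check only which models are present (for example, to confirm the startup pull finished), the response can be filtered with `jq` — a convenience sketch; `jq` is assumed to be installed on the host:

```bash
# Print just the model names from the /api/tags response.
curl -s http://localhost:11434/api/tags | jq -r '.models[].name'
```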
### Generate Text from Image
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "minicpm-v",
  "prompt": "What do you see in this image?",
  "images": ["<base64-encoded-image>"]
}'
```
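The `images` field expects raw base64 with no `data:` URI prefix. A minimal sketch for encoding a local file and sending it in one request, assuming a hypothetical `photo.jpg` and GNU coreutils `base64`:

```bash
# photo.jpg is a placeholder for your own image file.
# -w0 disables line wrapping so the output is a single base64 string (GNU coreutils).
IMG_B64=$(base64 -w0 photo.jpg)

curl http://localhost:11434/api/generate -d "{
  \"model\": \"minicpm-v\",
  \"prompt\": \"What do you see in this image?\",
  \"images\": [\"$IMG_B64\"]
}"
```

On macOS (BSD `base64`), use `base64 -i photo.jpg | tr -d '\n'` instead.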
### Chat with Vision
```bash
curl http://localhost:11434/api/chat -d '{
  "model": "minicpm-v",
  "messages": [
    {
      "role": "user",
      "content": "Describe this image in detail",
      "images": ["<base64-encoded-image>"]
    }
  ]
}'
```
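By default the Ollama API streams the reply as newline-delimited JSON objects; setting `"stream": false` returns a single object instead. A sketch that prints only the assistant's text, assuming `jq` on the host:

```bash
# Non-streaming chat request; extract just the reply text.
curl -s http://localhost:11434/api/chat -d '{
  "model": "minicpm-v",
  "stream": false,
  "messages": [
    {
      "role": "user",
      "content": "Describe this image in detail",
      "images": ["<base64-encoded-image>"]
    }
  ]
}' | jq -r '.message.content'
```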
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `MODEL_NAME` | `minicpm-v` | Model to pull on startup |
| `OLLAMA_HOST` | `0.0.0.0` | Host address for API |
| `OLLAMA_ORIGINS` | `*` | Allowed CORS origins |
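Each of these can be overridden with `-e` at run time. For example, to restrict CORS to a single origin (the origin URL below is a placeholder):

```bash
docker run -d \
  --name minicpm \
  --gpus all \
  -p 11434:11434 \
  -e OLLAMA_ORIGINS="https://app.example.com" \
  -v ollama-data:/root/.ollama \
  code.foss.global/host.today/ht-docker-ai:minicpm45v
```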
## Hardware Requirements
### GPU Variant (`minicpm45v`)
- NVIDIA GPU with CUDA support
- Minimum 9GB VRAM (int4 quantized)
- Recommended 18GB VRAM (full precision)
- NVIDIA Container Toolkit installed
### CPU Variant (`minicpm45v-cpu`)
- Minimum 8GB RAM
- Recommended 16GB+ RAM for better performance
- No GPU required
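A quick way to confirm the NVIDIA Container Toolkit is set up before pulling this image is to run `nvidia-smi` from inside a throwaway container (the CUDA base image tag here is illustrative):

```bash
# Should print the host's GPU table from inside the container.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```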
## Model Information
**MiniCPM-V 4.5** is a GPT-4o level multimodal large language model developed by OpenBMB.
- **Parameters**: 8B (Qwen3-8B + SigLIP2-400M)
- **Capabilities**: Image understanding, OCR, multi-image analysis
- **Languages**: 30+ languages including English, Chinese, French, Spanish
## Docker Compose Example
```yaml
version: '3.8'

services:
  minicpm:
    image: code.foss.global/host.today/ht-docker-ai:minicpm45v
    container_name: minicpm
    ports:
      - "11434:11434"
    volumes:
      - ollama-data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  ollama-data:
```
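With the file saved as `docker-compose.yml`, start the service and follow its logs with:

```bash
docker compose up -d
docker compose logs -f minicpm
```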
## Building Locally
```bash
# Clone the repository
git clone https://code.foss.global/host.today/ht-docker-ai.git
cd ht-docker-ai

# Build all images
./build-images.sh

# Run tests
./test-images.sh
```
## License
MIT - Task Venture Capital GmbH