# @host.today/ht-docker-ai

Docker images for AI vision-language models, starting with MiniCPM-V 4.5.

## Overview

This project provides ready-to-use Docker containers for running state-of-the-art AI vision-language models. The images are built on Ollama, which provides simplified model management and a consistent REST API.

## Available Images

| Tag | Description | Requirements |
|-----|-------------|--------------|
| `minicpm45v` | MiniCPM-V 4.5 with GPU support | NVIDIA GPU, 9-18 GB VRAM |
| `minicpm45v-cpu` | MiniCPM-V 4.5, CPU-only | 8 GB+ RAM |
| `latest` | Alias for `minicpm45v` | NVIDIA GPU |

## Quick Start

### GPU (Recommended)

```bash
docker run -d \
  --name minicpm \
  --gpus all \
  -p 11434:11434 \
  -v ollama-data:/root/.ollama \
  code.foss.global/host.today/ht-docker-ai:minicpm45v
```

### CPU Only

```bash
docker run -d \
  --name minicpm \
  -p 11434:11434 \
  -v ollama-data:/root/.ollama \
  code.foss.global/host.today/ht-docker-ai:minicpm45v-cpu
```

## API Usage

The container exposes the Ollama API on port 11434.

### List Available Models

```bash
curl http://localhost:11434/api/tags
```

### Generate Text from Image

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "minicpm-v",
  "prompt": "What do you see in this image?",
  "images": ["<base64-encoded-image>"]
}'
```

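The `<base64-encoded-image>` placeholder must be replaced with actual base64 data. A minimal sketch of building such a payload in the shell (the image file is fabricated here so the snippet runs standalone; substitute your own photo):

```shell
# Build a generate-request payload with real base64 data in place of the
# "<base64-encoded-image>" placeholder. The image file is fabricated so the
# snippet is self-contained -- replace it with a real image.
printf 'fake-image-bytes' > photo.jpg      # stand-in image (assumption)
IMG_B64=$(base64 photo.jpg | tr -d '\n')   # strip newlines for portability
PAYLOAD="{\"model\":\"minicpm-v\",\"prompt\":\"What do you see in this image?\",\"images\":[\"$IMG_B64\"]}"
echo "$PAYLOAD"
```

With the container running, the payload can then be sent with `curl http://localhost:11434/api/generate -d "$PAYLOAD"`.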
### Chat with Vision

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "minicpm-v",
  "messages": [
    {
      "role": "user",
      "content": "Describe this image in detail",
      "images": ["<base64-encoded-image>"]
    }
  ]
}'
```

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `MODEL_NAME` | `minicpm-v` | Model to pull on startup |
| `OLLAMA_HOST` | `0.0.0.0` | Host address for the API |
| `OLLAMA_ORIGINS` | `*` | Allowed CORS origins |

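These variables can be overridden at run time with `-e` flags on `docker run`, or with an `environment` block in Compose. A sketch of the Compose form (the origin URL is a placeholder, not a real deployment value):

```yaml
services:
  minicpm:
    image: code.foss.global/host.today/ht-docker-ai:minicpm45v
    environment:
      MODEL_NAME: minicpm-v                      # model pulled on startup
      OLLAMA_ORIGINS: "https://app.example.com"  # placeholder CORS origin
```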
## Hardware Requirements

### GPU Variant (`minicpm45v`)

- NVIDIA GPU with CUDA support
- Minimum 9 GB VRAM (int4 quantized)
- Recommended 18 GB VRAM (full precision)
- NVIDIA Container Toolkit installed

### CPU Variant (`minicpm45v-cpu`)

- Minimum 8 GB RAM
- Recommended 16 GB+ RAM for better performance
- No GPU required

## Model Information

**MiniCPM-V 4.5** is a multimodal large language model developed by OpenBMB, with performance that OpenBMB reports as comparable to GPT-4o.

- **Parameters**: 8B (Qwen3-8B + SigLIP2-400M)
- **Capabilities**: Image understanding, OCR, multi-image analysis
- **Languages**: 30+ languages, including English, Chinese, French, and Spanish

## Docker Compose Example

```yaml
version: '3.8'

services:
  minicpm:
    image: code.foss.global/host.today/ht-docker-ai:minicpm45v
    container_name: minicpm
    ports:
      - "11434:11434"
    volumes:
      - ollama-data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  ollama-data:
```

## Building Locally

```bash
# Clone the repository
git clone https://code.foss.global/host.today/ht-docker-ai.git
cd ht-docker-ai

# Build all images
./build-images.sh

# Run tests
./test-images.sh
```

## License

MIT - Task Venture Capital GmbH