# ModelGrid

**GPU infrastructure management daemon with OpenAI-compatible API for AI model containers.**

ModelGrid is a root-level daemon that manages GPU infrastructure, Docker containers, and AI model serving. It provides an OpenAI-compatible API interface for seamless integration with existing tools and applications.

## Features

- **Multi-GPU Support**: Detect and manage NVIDIA (CUDA), AMD (ROCm), and Intel Arc (oneAPI) GPUs
- **Container Management**: Orchestrate Ollama, vLLM, and TGI containers with GPU passthrough
- **OpenAI-Compatible API**: Drop-in replacement API for chat completions, embeddings, and model management
- **Greenlit Models**: Controlled model auto-pulling with remote configuration
- **Systemd Integration**: Run as a system service with automatic startup
- **Cross-Platform**: Pre-compiled binaries for Linux, macOS, and Windows

## Quick Start

### Installation

```bash
# Via npm (recommended)
npm install -g @modelgrid.com/modelgrid

# Via installer script
curl -sSL https://code.foss.global/modelgrid.com/modelgrid/raw/branch/main/install.sh | sudo bash
```

### Initial Setup

```bash
# 1. Check GPU detection
sudo modelgrid gpu list

# 2. Initialize configuration
sudo modelgrid config init

# 3. Enable and start the service
sudo modelgrid service enable
sudo modelgrid service start

# 4. Check status
modelgrid service status
```

### Using the API

Once running, ModelGrid exposes an OpenAI-compatible API:

```bash
# List available models
curl http://localhost:8080/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

# Chat completion
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3:8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
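
Because the API follows the OpenAI wire format, existing OpenAI clients can typically be pointed at ModelGrid just by overriding the base URL. A minimal sketch using the official `openai` npm package under Deno (the package is not bundled with ModelGrid; the base URL and key come from your own configuration):

```ts
// Sketch: call ModelGrid through the standard OpenAI client by overriding baseURL.
import OpenAI from "npm:openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1", // ModelGrid's API endpoint
  apiKey: "YOUR_API_KEY",              // one of the keys from /etc/modelgrid/config.json
});

const response = await client.chat.completions.create({
  model: "llama3:8b",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);
```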

## CLI Commands

### Service Management

```bash
modelgrid service enable     # Install and enable systemd service
modelgrid service disable    # Stop and disable systemd service
modelgrid service start      # Start the service
modelgrid service stop       # Stop the service
modelgrid service status     # Show service status
modelgrid service logs       # Show service logs
```

### GPU Management

```bash
modelgrid gpu list       # List detected GPUs
modelgrid gpu status     # Show GPU utilization
modelgrid gpu drivers    # Check/install GPU drivers
```

### Container Management

```bash
modelgrid container add      # Add a new container
modelgrid container remove   # Remove a container
modelgrid container list     # List all containers
modelgrid container start    # Start a container
modelgrid container stop     # Stop a container
```

### Model Management

```bash
modelgrid model list            # List available/loaded models
modelgrid model pull <name>     # Pull a model
modelgrid model remove <name>   # Remove a model
```

### Configuration

```bash
modelgrid config show    # Display current configuration
modelgrid config init    # Initialize configuration
```

## Configuration

Configuration is stored at `/etc/modelgrid/config.json`:

```json
{
  "version": "1.0",
  "api": {
    "port": 8080,
    "host": "0.0.0.0",
    "apiKeys": ["your-api-key-here"]
  },
  "docker": {
    "networkName": "modelgrid",
    "runtime": "docker"
  },
  "gpus": {
    "autoDetect": true,
    "assignments": {}
  },
  "containers": [],
  "models": {
    "greenlistUrl": "https://code.foss.global/modelgrid.com/model_lists/raw/branch/main/greenlit.json",
    "autoPull": true,
    "defaultContainer": "ollama",
    "autoLoad": []
  },
  "checkInterval": 30000
}
```
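
For orientation when editing the file by hand, the example above corresponds to roughly the following shape, written here as an illustrative TypeScript interface (the authoritative types live in `ts/interfaces/` and may differ):

```ts
// Rough shape of /etc/modelgrid/config.json, inferred from the example above.
// The interface name and the commented assumptions are illustrative only.
interface ModelGridConfig {
  version: string;
  api: {
    port: number;
    host: string;
    apiKeys: string[];
  };
  docker: {
    networkName: string;
    runtime: string;
  };
  gpus: {
    autoDetect: boolean;
    assignments: Record<string, string>; // GPU-to-container assignments (value type assumed)
  };
  containers: unknown[]; // entries added via `modelgrid container add`
  models: {
    greenlistUrl: string;
    autoPull: boolean;
    defaultContainer: string;
    autoLoad: string[];
  };
  checkInterval: number; // 30000 in the example, presumably milliseconds
}
```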

## Supported Container Types

### Ollama

Best for general-purpose model serving with easy model management.

```bash
modelgrid container add --type ollama --gpu gpu-0
```

### vLLM

High-performance serving for large models with tensor parallelism.

```bash
modelgrid container add --type vllm --gpu gpu-0,gpu-1
```

### TGI (Text Generation Inference)

HuggingFace's production-ready inference server.

```bash
modelgrid container add --type tgi --gpu gpu-0
```

## GPU Support

### NVIDIA (CUDA)

Requires NVIDIA drivers and the NVIDIA Container Toolkit:

```bash
# Check driver status
modelgrid gpu drivers

# Install if needed (Ubuntu/Debian)
sudo apt install nvidia-driver-535 nvidia-container-toolkit
```

### AMD (ROCm)

Requires ROCm drivers:

```bash
# Check driver status
modelgrid gpu drivers
```

### Intel Arc (oneAPI)

Requires Intel GPU drivers and the oneAPI toolkit:

```bash
# Check driver status
modelgrid gpu drivers
```

## Greenlit Models

ModelGrid uses a greenlit model system to control which models can be auto-pulled. The greenlist is fetched from a configurable URL and contains the approved models together with their minimum VRAM requirements (in GB):

```json
{
  "version": "1.0",
  "models": [
    { "name": "llama3:8b", "container": "ollama", "minVram": 8 },
    { "name": "mistral:7b", "container": "ollama", "minVram": 8 },
    { "name": "llama3:70b", "container": "vllm", "minVram": 48 }
  ]
}
```

When a request arrives for a model that is not currently loaded, ModelGrid will (see the sketch after this list):

1. Check whether the model is in the greenlist
2. Verify that its VRAM requirement can be met
3. Auto-pull and load the model
4. Serve the request
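
A compact sketch of that flow in TypeScript; every name below is hypothetical and stands in for ModelGrid's real GPU and container layers rather than mirroring the code in `ts/models/`:

```ts
// Illustrative sketch of the greenlit auto-pull flow; names are hypothetical.
interface GreenlitModel {
  name: string;
  container: string;
  minVram: number; // minimum VRAM in GB, as in the greenlist above
}

// Stand-ins for ModelGrid's real GPU and container layers.
async function getFreeVramGb(): Promise<number> {
  return 24; // placeholder value
}
async function pullAndLoad(_container: string, _model: string): Promise<void> {}

async function ensureModelLoaded(modelName: string, greenlistUrl: string): Promise<void> {
  // 1. Check whether the model is in the greenlist
  const greenlist: { models: GreenlitModel[] } = await (await fetch(greenlistUrl)).json();
  const entry = greenlist.models.find((m) => m.name === modelName);
  if (!entry) throw new Error(`${modelName} is not greenlit`);

  // 2. Verify that the VRAM requirement can be met
  if ((await getFreeVramGb()) < entry.minVram) {
    throw new Error(`insufficient VRAM for ${modelName}`);
  }

  // 3. Auto-pull and load the model into its target container
  await pullAndLoad(entry.container, entry.name);

  // 4. The request is then served through the normal completion path
}
```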

## API Reference

### Chat Completions

```
POST /v1/chat/completions
```

OpenAI-compatible chat completion endpoint with streaming support.
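
Streaming is assumed to follow the standard OpenAI server-sent-events convention, so SDK clients can consume it as usual. A sketch with the `openai` npm package under Deno:

```ts
// Sketch: stream a chat completion, assuming OpenAI-style streaming chunks.
import OpenAI from "npm:openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "YOUR_API_KEY",
});

const stream = await client.chat.completions.create({
  model: "llama3:8b",
  messages: [{ role: "user", content: "Tell me a short joke." }],
  stream: true,
});

for await (const chunk of stream) {
  // Each chunk carries a delta with the next piece of the assistant's reply.
  const piece = chunk.choices[0]?.delta?.content ?? "";
  await Deno.stdout.write(new TextEncoder().encode(piece));
}
```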

### Models

```
GET /v1/models
GET /v1/models/:model
```

List available models or get details for a specific model.

### Embeddings

```
POST /v1/embeddings
```

Generate text embeddings using compatible models.
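
Requests use the usual OpenAI embeddings shape. A sketch using plain `fetch`; the model name is only an example and depends on which embedding-capable models your containers serve:

```ts
// Sketch: request embeddings over the OpenAI-compatible endpoint.
const res = await fetch("http://localhost:8080/v1/embeddings", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "nomic-embed-text", // illustrative embedding model name
    input: "ModelGrid manages GPU containers.",
  }),
});

const { data } = await res.json();
console.log(data[0].embedding.length); // dimensionality of the returned vector
```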

## Development

### Building from Source

```bash
# Clone repository
git clone https://code.foss.global/modelgrid.com/modelgrid.git
cd modelgrid

# Run directly with Deno
deno run --allow-all mod.ts help

# Compile for current platform
deno compile --allow-all --output modelgrid mod.ts

# Compile for all platforms
bash scripts/compile-all.sh
```

### Project Structure

```
modelgrid/
├── mod.ts              # Entry point
├── ts/
│   ├── cli.ts          # CLI command routing
│   ├── modelgrid.ts    # Main coordinator class
│   ├── daemon.ts       # Background daemon
│   ├── systemd.ts      # Systemd service management
│   ├── constants.ts    # Configuration constants
│   ├── interfaces/     # TypeScript interfaces
│   ├── hardware/       # GPU detection
│   ├── drivers/        # Driver management
│   ├── docker/         # Docker management
│   ├── containers/     # Container orchestration
│   ├── api/            # OpenAI-compatible API
│   ├── models/         # Model management
│   └── cli/            # CLI handlers
├── test/               # Test files
└── scripts/            # Build scripts
```

## License

MIT License - See [license](./license) for details.

## Links

- Repository: https://code.foss.global/modelgrid.com/modelgrid
- Issues: https://community.foss.global/