# ModelGrid

**GPU infrastructure management daemon with OpenAI-compatible API for AI model containers.**

ModelGrid is a root-level daemon that manages GPU infrastructure, Docker containers, and AI model serving. It provides an OpenAI-compatible API interface for seamless integration with existing tools and applications.

## Features

- **Multi-GPU Support**: Detect and manage NVIDIA (CUDA), AMD (ROCm), and Intel Arc (oneAPI) GPUs
- **Container Management**: Orchestrate Ollama, vLLM, and TGI containers with GPU passthrough
- **OpenAI-Compatible API**: Drop-in replacement API for chat completions, embeddings, and model management
- **Greenlit Models**: Controlled model auto-pulling with remote configuration
- **Systemd Integration**: Run as a system service with automatic startup
- **Cross-Platform**: Pre-compiled binaries for Linux, macOS, and Windows

## Quick Start

### Installation

```bash
# Via npm (recommended)
npm install -g @modelgrid.com/modelgrid

# Via installer script
curl -sSL https://code.foss.global/modelgrid.com/modelgrid/raw/branch/main/install.sh | sudo bash
```

### Initial Setup

```bash
# 1. Check GPU detection
sudo modelgrid gpu list

# 2. Initialize configuration
sudo modelgrid config init

# 3. Enable and start the service
sudo modelgrid service enable
sudo modelgrid service start

# 4. Check status
modelgrid service status
```

### Using the API

Once running, ModelGrid exposes an OpenAI-compatible API:

```bash
# List available models
curl http://localhost:8080/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

# Chat completion
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3:8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

## CLI Commands

### Service Management

```bash
modelgrid service enable    # Install and enable systemd service
modelgrid service disable   # Stop and disable systemd service
modelgrid service start     # Start the service
modelgrid service stop      # Stop the service
modelgrid service status    # Show service status
modelgrid service logs      # Show service logs
```

### GPU Management

```bash
modelgrid gpu list      # List detected GPUs
modelgrid gpu status    # Show GPU utilization
modelgrid gpu drivers   # Check/install GPU drivers
```

### Container Management

```bash
modelgrid container add      # Add a new container
modelgrid container remove   # Remove a container
modelgrid container list     # List all containers
modelgrid container start    # Start a container
modelgrid container stop     # Stop a container
```

### Model Management

```bash
modelgrid model list     # List available/loaded models
modelgrid model pull     # Pull a model
modelgrid model remove   # Remove a model
```

### Configuration

```bash
modelgrid config show   # Display current configuration
modelgrid config init   # Initialize configuration
```

## Configuration

Configuration is stored at `/etc/modelgrid/config.json`:

```json
{
  "version": "1.0",
  "api": {
    "port": 8080,
    "host": "0.0.0.0",
    "apiKeys": ["your-api-key-here"]
  },
  "docker": {
    "networkName": "modelgrid",
    "runtime": "docker"
  },
  "gpus": {
    "autoDetect": true,
    "assignments": {}
  },
  "containers": [],
  "models": {
    "greenlistUrl": "https://code.foss.global/modelgrid.com/model_lists/raw/branch/main/greenlit.json",
    "autoPull": true,
    "defaultContainer": "ollama",
    "autoLoad": []
  },
  "checkInterval": 30000
}
```
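With the configuration above, any OpenAI-compatible SDK can talk to the daemon by overriding its base URL. Below is a minimal TypeScript sketch, assuming the `openai` npm package; the API key and model name are the placeholder values from the examples above, not required names.

```typescript
import OpenAI from "openai"; // in Deno: import OpenAI from "npm:openai";

// Point the standard OpenAI client at the local ModelGrid daemon.
// Base URL and key mirror the example config above (port 8080, apiKeys entry).
const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "your-api-key-here",
});

const response = await client.chat.completions.create({
  model: "llama3:8b", // any model served by a configured container
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);
```

Other OpenAI-compatible client libraries should work the same way, since only the base URL and API key differ from a hosted endpoint.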
## Supported Container Types

### Ollama

Best for general-purpose model serving with easy model management.

```bash
modelgrid container add --type ollama --gpu gpu-0
```

### vLLM

High-performance serving for large models with tensor parallelism.

```bash
modelgrid container add --type vllm --gpu gpu-0,gpu-1
```

### TGI (Text Generation Inference)

HuggingFace's production-ready inference server.

```bash
modelgrid container add --type tgi --gpu gpu-0
```

## GPU Support

### NVIDIA (CUDA)

Requires NVIDIA drivers and the NVIDIA Container Toolkit:

```bash
# Check driver status
modelgrid gpu drivers

# Install if needed (Ubuntu/Debian)
sudo apt install nvidia-driver-535 nvidia-container-toolkit
```

### AMD (ROCm)

Requires ROCm drivers:

```bash
# Check driver status
modelgrid gpu drivers
```

### Intel Arc (oneAPI)

Requires Intel GPU drivers and the oneAPI toolkit:

```bash
# Check driver status
modelgrid gpu drivers
```

## Greenlit Models

ModelGrid uses a greenlit model system to control which models can be auto-pulled. The greenlist is fetched from a configurable URL and contains the approved models along with their VRAM requirements:

```json
{
  "version": "1.0",
  "models": [
    { "name": "llama3:8b", "container": "ollama", "minVram": 8 },
    { "name": "mistral:7b", "container": "ollama", "minVram": 8 },
    { "name": "llama3:70b", "container": "vllm", "minVram": 48 }
  ]
}
```

When a request comes in for a model that is not currently loaded, ModelGrid will:

1. Check if the model is in the greenlist
2. Verify that the VRAM requirements can be met
3. Auto-pull and load the model
4. Serve the request

## API Reference

### Chat Completions

```
POST /v1/chat/completions
```

OpenAI-compatible chat completion endpoint with streaming support.

### Models

```
GET /v1/models
GET /v1/models/:model
```

List available models or get details for a specific model.

### Embeddings

```
POST /v1/embeddings
```

Generate text embeddings using compatible models.

## Development

### Building from Source

```bash
# Clone repository
git clone https://code.foss.global/modelgrid.com/modelgrid.git
cd modelgrid

# Run directly with Deno
deno run --allow-all mod.ts help

# Compile for current platform
deno compile --allow-all --output modelgrid mod.ts

# Compile for all platforms
bash scripts/compile-all.sh
```

### Project Structure

```
modelgrid/
├── mod.ts                 # Entry point
├── ts/
│   ├── cli.ts             # CLI command routing
│   ├── modelgrid.ts       # Main coordinator class
│   ├── daemon.ts          # Background daemon
│   ├── systemd.ts         # Systemd service management
│   ├── constants.ts       # Configuration constants
│   ├── interfaces/        # TypeScript interfaces
│   ├── hardware/          # GPU detection
│   ├── drivers/           # Driver management
│   ├── docker/            # Docker management
│   ├── containers/        # Container orchestration
│   ├── api/               # OpenAI-compatible API
│   ├── models/            # Model management
│   └── cli/               # CLI handlers
├── test/                  # Test files
└── scripts/               # Build scripts
```

## License

MIT License - See [license](./license) for details.

## Links

- Repository: https://code.foss.global/modelgrid.com/modelgrid
- Issues: https://community.foss.global/