# ModelGrid Project Hints

## Project Overview

ModelGrid is a root-level daemon that manages GPU infrastructure, Docker, and AI model containers (Ollama, vLLM, TGI) behind an OpenAI-compatible API.

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                        ModelGrid Daemon                         │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │    CLI      │  │  Hardware   │  │   Container Manager     │  │
│  │  Commands   │  │  Detection  │  │   (Docker/Podman)       │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Driver    │  │   Model     │  │   OpenAI API Gateway    │  │
│  │  Installer  │  │  Registry   │  │   (HTTP Server)         │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
├─────────────────────────────────────────────────────────────────┤
│                         Systemd Service                         │
└─────────────────────────────────────────────────────────────────┘
```

## File Organization

```
ts/
├── index.ts                 # Node.js entry point
├── cli.ts                   # CLI router
├── modelgrid.ts             # Main coordinator (facade)
├── daemon.ts                # Background daemon
├── systemd.ts               # Systemd integration
├── constants.ts             # Configuration constants
├── logger.ts                # Logging utilities
├── colors.ts                # Color themes
├── interfaces/              # TypeScript interfaces
│   ├── config.ts            # IModelGridConfig
│   ├── gpu.ts               # IGpuInfo, IGpuStatus
│   ├── container.ts         # IContainerConfig, IContainerStatus
│   └── api.ts               # OpenAI API types
├── hardware/                # Hardware detection
│   ├── gpu-detector.ts      # Detect GPUs (NVIDIA, AMD, Intel)
│   └── system-info.ts       # CPU, RAM info
├── drivers/                 # Driver management
│   ├── nvidia.ts            # NVIDIA driver + CUDA
│   ├── amd.ts               # AMD driver + ROCm
│   ├── intel.ts             # Intel Arc + oneAPI
│   └── driver-manager.ts    # Driver orchestrator
├── docker/                  # Docker management
│   ├── docker-manager.ts    # Docker setup
│   └── container-runtime.ts # Container lifecycle
├── containers/              # AI container management
│   ├── ollama.ts            # Ollama container
│   ├── vllm.ts              # vLLM container
│   ├── tgi.ts               # TGI container
│   └── container-manager.ts # Orchestrator
├── models/                  # Model management
│   ├── registry.ts          # Greenlit model registry
│   └── loader.ts            # Model loading with VRAM checks
├── api/                     # OpenAI-compatible API
│   ├── server.ts            # HTTP server
│   ├── router.ts            # Request routing
│   ├── handlers/            # API endpoint handlers
│   │   ├── chat.ts          # /v1/chat/completions
│   │   ├── models.ts        # /v1/models
│   │   └── embeddings.ts    # /v1/embeddings
│   └── middleware/          # Request processing
│       ├── auth.ts          # API key validation
│       └── sanity.ts        # Request validation
├── cli/                     # CLI handlers
│   ├── service-handler.ts
│   ├── gpu-handler.ts
│   ├── container-handler.ts
│   ├── model-handler.ts
│   └── config-handler.ts
└── helpers/                 # Utilities
    ├── prompt.ts            # Readline utility
    └── shortid.ts           # ID generation
```

## Key Concepts

### Greenlit Model System

- Only pre-approved models can be auto-pulled, for security
- Greenlist fetched from a remote URL (configurable)
- VRAM requirements checked before loading

### Container Types

- **Ollama**: Easy to use; native API converted to OpenAI format
- **vLLM**: High performance; natively OpenAI-compatible
- **TGI**: HuggingFace Text Generation Inference

### GPU Support

- NVIDIA: nvidia-smi, CUDA, nvidia-docker2
- AMD: rocm-smi, ROCm
- Intel Arc: xpu-smi, oneAPI

## Configuration

Config file: `/etc/modelgrid/config.json`

```typescript
interface IModelGridConfig {
  version: string;
  api: {
    port: number;           // Default: 8080
    host: string;           // Default: '0.0.0.0'
    apiKeys: string[];      // Valid API keys
    cors: boolean;
    corsOrigins: string[];
  };
  docker: {
    networkName: string;    // Default: 'modelgrid'
    runtime: 'docker' | 'podman';
  };
  gpus: {
    autoDetect: boolean;
    // Type arguments restored here; assumed to map GPU ID → container name
    assignments: Record<string, string>;
  };
  containers: IContainerConfig[];
  models: {
    greenlistUrl: string;
    autoPull: boolean;
    defaultContainer: string;
    autoLoad: string[];
  };
  checkInterval: number;
}
```

## CLI Commands

```bash
modelgrid service enable/disable/start/stop/status/logs
modelgrid gpu list/status/drivers/install
modelgrid container list/add/remove/start/stop/logs
modelgrid model list/pull/remove/status/refresh
modelgrid config show/init/apikey
```

## API Endpoints

- `POST /v1/chat/completions` - Chat completion (OpenAI-compatible)
- `GET /v1/models` - List available models
- `POST /v1/embeddings` - Generate embeddings
- `GET /health` - Health check
- `GET /metrics` - Prometheus metrics

## Development Notes

- Prefer async patterns throughout for flexibility
- Use `fs.promises` instead of sync methods
- Containers auto-start on daemon startup
- Models auto-preload if configured
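
A minimal `/etc/modelgrid/config.json` matching the `IModelGridConfig` shape is sketched below. All concrete values here are illustrative assumptions (the key, greenlist URL, and `checkInterval` units are not confirmed by the codebase):

```json
{
  "version": "1.0.0",
  "api": {
    "port": 8080,
    "host": "0.0.0.0",
    "apiKeys": ["sk-local-example"],
    "cors": false,
    "corsOrigins": []
  },
  "docker": {
    "networkName": "modelgrid",
    "runtime": "docker"
  },
  "gpus": {
    "autoDetect": true,
    "assignments": {}
  },
  "containers": [],
  "models": {
    "greenlistUrl": "https://example.com/greenlist.json",
    "autoPull": false,
    "defaultContainer": "ollama",
    "autoLoad": []
  },
  "checkInterval": 30000
}
```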
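
For NVIDIA detection, `hardware/gpu-detector.ts` presumably shells out to `nvidia-smi` and parses its output. A sketch of the parsing half, assuming a CSV query like `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits` (the actual fields queried may differ):

```typescript
// Hypothetical sketch of nvidia-smi CSV parsing; not the real detector code.
interface GpuInfo {
  name: string;
  totalVramMb: number;
}

function parseNvidiaSmiCsv(output: string): GpuInfo[] {
  return output
    .trim()
    .split("\n")
    .filter((line) => line.length > 0)
    .map((line) => {
      // Each line looks like: "NVIDIA GeForce RTX 4090, 24564"
      const [name, mem] = line.split(",").map((s) => s.trim());
      return { name, totalVramMb: Number(mem) };
    });
}
```

Parsing is kept separate from process spawning so it can be unit-tested against captured output without a GPU present.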
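
The VRAM pre-check in `models/loader.ts` could reduce to a test like the following. The interface and unit (MiB) are assumptions for illustration, not the project's actual types:

```typescript
// Hypothetical sketch of the "check VRAM before loading" rule.
interface GpuStatus {
  totalVramMb: number;
  usedVramMb: number;
}

function canLoadModel(requiredVramMb: number, gpus: GpuStatus[]): boolean {
  // Loadable if at least one GPU has enough free VRAM for the model.
  return gpus.some((gpu) => gpu.totalVramMb - gpu.usedVramMb >= requiredVramMb);
}
```

This assumes single-GPU placement; a loader that shards across GPUs would sum free VRAM instead.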
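
API key validation in `api/middleware/auth.ts` likely checks the standard `Authorization: Bearer <key>` header (the OpenAI convention) against `config.api.apiKeys`. A sketch, with names invented for illustration:

```typescript
// Hypothetical sketch of bearer-token validation; not the real middleware.
interface AuthConfig {
  apiKeys: string[];
}

function isAuthorized(
  authHeader: string | undefined,
  config: AuthConfig
): boolean {
  if (!authHeader || !authHeader.startsWith("Bearer ")) return false;
  const key = authHeader.slice("Bearer ".length).trim();
  return config.apiKeys.includes(key);
}
```

The real middleware would wrap this in an HTTP handler that returns 401 on failure.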