# ModelGrid

**GPU infrastructure management daemon with OpenAI-compatible API for AI model containers.**

ModelGrid is a root-level daemon that manages GPU infrastructure, Docker containers, and AI model serving. It provides an OpenAI-compatible API interface for seamless integration with existing tools and applications.

## Features

- **Multi-GPU Support**: Detect and manage NVIDIA (CUDA), AMD (ROCm), and Intel Arc (oneAPI) GPUs
- **Container Management**: Orchestrate Ollama, vLLM, and TGI containers with GPU passthrough
- **OpenAI-Compatible API**: Drop-in replacement API for chat completions, embeddings, and model management
- **Greenlit Models**: Controlled model auto-pulling with remote configuration
- **Systemd Integration**: Run as a system service with automatic startup
- **Cross-Platform**: Pre-compiled binaries for Linux, macOS, and Windows

## Quick Start

### Installation

```bash
# Via npm (recommended)
npm install -g @modelgrid.com/modelgrid

# Via installer script
curl -sSL https://code.foss.global/modelgrid.com/modelgrid/raw/branch/main/install.sh | sudo bash
```

### Initial Setup

```bash
# 1. Check GPU detection
sudo modelgrid gpu list

# 2. Initialize configuration
sudo modelgrid config init

# 3. Enable and start the service
sudo modelgrid service enable
sudo modelgrid service start

# 4. Check status
modelgrid service status
```

### Using the API

Once running, ModelGrid exposes an OpenAI-compatible API:

```bash
# List available models
curl http://localhost:8080/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

# Chat completion
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3:8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
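
Because the API follows the OpenAI wire format, existing OpenAI clients can typically be pointed at ModelGrid just by overriding the base URL. A minimal sketch using the official `openai` npm package under Deno (the package is not bundled with ModelGrid; the base URL and key come from your own configuration):

```ts
// Sketch: call ModelGrid through the standard OpenAI client by overriding baseURL.
import OpenAI from "npm:openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1", // ModelGrid's API endpoint
  apiKey: "YOUR_API_KEY",              // one of the keys from /etc/modelgrid/config.json
});

const response = await client.chat.completions.create({
  model: "llama3:8b",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);
```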

## CLI Commands

### Service Management

```bash
modelgrid service enable     # Install and enable systemd service
modelgrid service disable    # Stop and disable systemd service
modelgrid service start      # Start the service
modelgrid service stop       # Stop the service
modelgrid service status     # Show service status
modelgrid service logs       # Show service logs
```

### GPU Management

```bash
modelgrid gpu list       # List detected GPUs
modelgrid gpu status     # Show GPU utilization
modelgrid gpu drivers    # Check/install GPU drivers
```

### Container Management

```bash
modelgrid container add      # Add a new container
modelgrid container remove   # Remove a container
modelgrid container list     # List all containers
modelgrid container start    # Start a container
modelgrid container stop     # Stop a container
```

### Model Management

```bash
modelgrid model list            # List available/loaded models
modelgrid model pull <name>     # Pull a model
modelgrid model remove <name>   # Remove a model
```

### Configuration

```bash
modelgrid config show    # Display current configuration
modelgrid config init    # Initialize configuration
```

## Configuration

Configuration is stored at `/etc/modelgrid/config.json`:

```json
{
  "version": "1.0",
  "api": {
    "port": 8080,
    "host": "0.0.0.0",
    "apiKeys": ["your-api-key-here"]
  },
  "docker": {
    "networkName": "modelgrid",
    "runtime": "docker"
  },
  "gpus": {
    "autoDetect": true,
    "assignments": {}
  },
  "containers": [],
  "models": {
    "greenlistUrl": "https://code.foss.global/modelgrid.com/model_lists/raw/branch/main/greenlit.json",
    "autoPull": true,
    "defaultContainer": "ollama",
    "autoLoad": []
  },
  "checkInterval": 30000
}
```
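
For orientation when editing the file by hand, the example above corresponds to roughly the following shape, written here as an illustrative TypeScript interface (the authoritative types live in `ts/interfaces/` and may differ):

```ts
// Rough shape of /etc/modelgrid/config.json, inferred from the example above.
// The interface name and the commented assumptions are illustrative only.
interface ModelGridConfig {
  version: string;
  api: {
    port: number;
    host: string;
    apiKeys: string[];
  };
  docker: {
    networkName: string;
    runtime: string;
  };
  gpus: {
    autoDetect: boolean;
    assignments: Record<string, string>; // GPU-to-container assignments (value type assumed)
  };
  containers: unknown[]; // entries added via `modelgrid container add`
  models: {
    greenlistUrl: string;
    autoPull: boolean;
    defaultContainer: string;
    autoLoad: string[];
  };
  checkInterval: number; // 30000 in the example, presumably milliseconds
}
```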

## Supported Container Types

### Ollama

Best for general-purpose model serving with easy model management.

```bash
modelgrid container add --type ollama --gpu gpu-0
```

### vLLM

High-performance serving for large models with tensor parallelism.

```bash
modelgrid container add --type vllm --gpu gpu-0,gpu-1
```

### TGI (Text Generation Inference)

HuggingFace's production-ready inference server.

```bash
modelgrid container add --type tgi --gpu gpu-0
```

## GPU Support

### NVIDIA (CUDA)

Requires NVIDIA drivers and the NVIDIA Container Toolkit:

```bash
# Check driver status
modelgrid gpu drivers

# Install if needed (Ubuntu/Debian)
sudo apt install nvidia-driver-535 nvidia-container-toolkit
```

### AMD (ROCm)

Requires ROCm drivers:

```bash
# Check driver status
modelgrid gpu drivers
```

### Intel Arc (oneAPI)

Requires Intel GPU drivers and the oneAPI toolkit:

```bash
# Check driver status
modelgrid gpu drivers
```

## Greenlit Models

ModelGrid uses a greenlit model system to control which models can be auto-pulled. The greenlist is fetched from a configurable URL and contains the approved models together with their minimum VRAM requirements (in GB):

```json
{
  "version": "1.0",
  "models": [
    { "name": "llama3:8b", "container": "ollama", "minVram": 8 },
    { "name": "mistral:7b", "container": "ollama", "minVram": 8 },
    { "name": "llama3:70b", "container": "vllm", "minVram": 48 }
  ]
}
```

When a request arrives for a model that is not currently loaded, ModelGrid will (see the sketch after this list):

1. Check whether the model is in the greenlist
2. Verify that its VRAM requirement can be met
3. Auto-pull and load the model
4. Serve the request
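
A compact sketch of that flow in TypeScript; every name below is hypothetical and stands in for ModelGrid's real GPU and container layers rather than mirroring the code in `ts/models/`:

```ts
// Illustrative sketch of the greenlit auto-pull flow; names are hypothetical.
interface GreenlitModel {
  name: string;
  container: string;
  minVram: number; // minimum VRAM in GB, as in the greenlist above
}

// Stand-ins for ModelGrid's real GPU and container layers.
async function getFreeVramGb(): Promise<number> {
  return 24; // placeholder value
}
async function pullAndLoad(_container: string, _model: string): Promise<void> {}

async function ensureModelLoaded(modelName: string, greenlistUrl: string): Promise<void> {
  // 1. Check whether the model is in the greenlist
  const greenlist: { models: GreenlitModel[] } = await (await fetch(greenlistUrl)).json();
  const entry = greenlist.models.find((m) => m.name === modelName);
  if (!entry) throw new Error(`${modelName} is not greenlit`);

  // 2. Verify that the VRAM requirement can be met
  if ((await getFreeVramGb()) < entry.minVram) {
    throw new Error(`insufficient VRAM for ${modelName}`);
  }

  // 3. Auto-pull and load the model into its target container
  await pullAndLoad(entry.container, entry.name);

  // 4. The request is then served through the normal completion path
}
```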

## API Reference

### Chat Completions

```
POST /v1/chat/completions
```

OpenAI-compatible chat completion endpoint with streaming support.
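
Streaming is assumed to follow the standard OpenAI server-sent-events convention, so SDK clients can consume it as usual. A sketch with the `openai` npm package under Deno:

```ts
// Sketch: stream a chat completion, assuming OpenAI-style streaming chunks.
import OpenAI from "npm:openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "YOUR_API_KEY",
});

const stream = await client.chat.completions.create({
  model: "llama3:8b",
  messages: [{ role: "user", content: "Tell me a short joke." }],
  stream: true,
});

for await (const chunk of stream) {
  // Each chunk carries a delta with the next piece of the assistant's reply.
  const piece = chunk.choices[0]?.delta?.content ?? "";
  await Deno.stdout.write(new TextEncoder().encode(piece));
}
```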

### Models

```
GET /v1/models
GET /v1/models/:model
```

List available models or get details for a specific model.

### Embeddings

```
POST /v1/embeddings
```

Generate text embeddings using compatible models.
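
Requests use the usual OpenAI embeddings shape. A sketch using plain `fetch`; the model name is only an example and depends on which embedding-capable models your containers serve:

```ts
// Sketch: request embeddings over the OpenAI-compatible endpoint.
const res = await fetch("http://localhost:8080/v1/embeddings", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "nomic-embed-text", // illustrative embedding model name
    input: "ModelGrid manages GPU containers.",
  }),
});

const { data } = await res.json();
console.log(data[0].embedding.length); // dimensionality of the returned vector
```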

## Development

### Building from Source

```bash
# Clone repository
git clone https://code.foss.global/modelgrid.com/modelgrid.git
cd modelgrid

# Run directly with Deno
deno run --allow-all mod.ts help

# Compile for current platform
deno compile --allow-all --output modelgrid mod.ts

# Compile for all platforms
bash scripts/compile-all.sh
```

### Project Structure

```
modelgrid/
├── mod.ts              # Entry point
├── ts/
│   ├── cli.ts          # CLI command routing
│   ├── modelgrid.ts    # Main coordinator class
│   ├── daemon.ts       # Background daemon
│   ├── systemd.ts      # Systemd service management
│   ├── constants.ts    # Configuration constants
│   ├── interfaces/     # TypeScript interfaces
│   ├── hardware/       # GPU detection
│   ├── drivers/        # Driver management
│   ├── docker/         # Docker management
│   ├── containers/     # Container orchestration
│   ├── api/            # OpenAI-compatible API
│   ├── models/         # Model management
│   └── cli/            # CLI handlers
├── test/               # Test files
└── scripts/            # Build scripts
```

## License

MIT License - See [license](./license) for details.

## Links

- Repository: https://code.foss.global/modelgrid.com/modelgrid
- Issues: https://community.foss.global/