# ModelGrid

**GPU infrastructure management daemon with OpenAI-compatible API for AI model containers.**

ModelGrid is a daemon that runs with root privileges and manages GPU infrastructure, Docker containers, and AI model serving. It exposes an OpenAI-compatible API, so existing tools and applications that already target the OpenAI API can connect to it.

## Features

- **Multi-GPU Support**: Detect and manage NVIDIA (CUDA), AMD (ROCm), and Intel Arc (oneAPI) GPUs
- **Container Management**: Orchestrate Ollama, vLLM, and TGI containers with GPU passthrough
- **OpenAI-Compatible API**: Drop-in replacement API for chat completions, embeddings, and model management
- **Greenlit Models**: Controlled model auto-pulling with remote configuration
- **Systemd Integration**: Run as a system service with automatic startup
- **Cross-Platform**: Pre-compiled binaries for Linux, macOS, and Windows

## Quick Start

### Installation

```bash
# Via npm (recommended)
npm install -g @modelgrid.com/modelgrid

# Via installer script
curl -sSL https://code.foss.global/modelgrid.com/modelgrid/raw/branch/main/install.sh | sudo bash
```

### Initial Setup

```bash
# 1. Check GPU detection
sudo modelgrid gpu list

# 2. Initialize configuration
sudo modelgrid config init

# 3. Enable and start the service
sudo modelgrid service enable
sudo modelgrid service start

# 4. Check status
modelgrid service status
```

### Using the API

Once the service is running, ModelGrid exposes an OpenAI-compatible API:

```bash
# List available models
curl http://localhost:8080/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

# Chat completion
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3:8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
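
The same chat completion call can be made from a script. The sketch below uses only the Python standard library. The base URL and API key are the placeholder values from the curl examples above, and `build_chat_request` and `send` are hypothetical helper names, not part of ModelGrid.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # default port used in the curl examples
API_KEY = "YOUR_API_KEY"               # placeholder, replace with a configured key


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Against a running instance, `send(build_chat_request("llama3:8b", "Hello!"))` should return the response as a dictionary. Assuming the response follows the OpenAI chat completion shape, the reply text would be found at `["choices"][0]["message"]["content"]`.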

## CLI Commands

### Service Management

```bash
modelgrid service enable      # Install and enable systemd service
modelgrid service disable     # Stop and disable systemd service
modelgrid service start       # Start the service
modelgrid service stop        # Stop the service
modelgrid service status      # Show service status
modelgrid service logs        # Show service logs
```

### GPU Management

```bash
modelgrid gpu list            # List detected GPUs
modelgrid gpu status          # Show GPU utilization
modelgrid gpu drivers         # Check/install GPU drivers
```

### Container Management

```bash
modelgrid container add       # Add a new container
modelgrid container remove    # Remove a container
modelgrid container list      # List all containers
modelgrid container start     # Start a container
modelgrid container stop      # Stop a container
```

### Model Management

```bash
modelgrid model list          # List available/loaded models
modelgrid model pull <name>   # Pull a model
modelgrid model remove <name> # Remove a model
```

### Configuration

```bash
modelgrid config show         # Display current configuration
modelgrid config init         # Initialize configuration
```

## Configuration

Configuration is stored at `/etc/modelgrid/config.json`:

```json
{
  "version": "1.0",
  "api": {
    "port": 8080,
    "host": "0.0.0.0",
    "apiKeys": ["your-api-key-here"]
  },
  "docker": {
    "networkName": "modelgrid",
    "runtime": "docker"
  },
  "gpus": {
    "autoDetect": true,
    "assignments": {}
  },
  "containers": [],
  "models": {
    "greenlistUrl": "https://code.foss.global/modelgrid.com/model_lists/raw/branch/main/greenlit.json",
    "autoPull": true,
    "defaultContainer": "ollama",
    "autoLoad": []
  },
  "checkInterval": 30000
}
```
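
Before starting the service, it can help to sanity-check this file. The following is a minimal sketch, not ModelGrid's own validation: the `check_config` helper and its rules (port range, placeholder key detection, a note about binding to all interfaces) are assumptions based only on the fields shown above.

```python
import json


def check_config(config: dict) -> list[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    problems = []
    api = config.get("api", {})

    port = api.get("port")
    if not isinstance(port, int) or not 1 <= port <= 65535:
        problems.append(f"api.port must be an integer in 1..65535, got {port!r}")

    keys = api.get("apiKeys", [])
    if not keys:
        problems.append("api.apiKeys is empty")
    if "your-api-key-here" in keys:
        problems.append("api.apiKeys still contains the placeholder key")

    # 0.0.0.0 binds to every network interface, which may be wider than intended.
    if api.get("host") == "0.0.0.0":
        problems.append("api.host is 0.0.0.0, so the API listens on all interfaces")

    return problems


def check_config_file(path: str = "/etc/modelgrid/config.json") -> list[str]:
    """Load the JSON file at `path` and run the checks above on it."""
    with open(path, encoding="utf-8") as handle:
        return check_config(json.load(handle))
```

Run against the example configuration above, this sketch would flag the placeholder API key and the `0.0.0.0` host.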

## Supported Container Types

### Ollama

Best for general-purpose model serving with easy model management.

```bash
modelgrid container add --type ollama --gpu gpu-0
```

### vLLM

High-performance serving for large models with tensor parallelism.

```bash
modelgrid container add --type vllm --gpu gpu-0,gpu-1
```

### TGI (Text Generation Inference)

Hugging Face's production-ready inference server.

```bash
modelgrid container add --type tgi --gpu gpu-0
```

## GPU Support

### NVIDIA (CUDA)

Requires NVIDIA drivers and NVIDIA Container Toolkit:

```bash
# Check driver status
modelgrid gpu drivers

# Install if needed (Ubuntu/Debian)
sudo apt install nvidia-driver-535 nvidia-container-toolkit
```

### AMD (ROCm)

Requires ROCm drivers:

```bash
# Check driver status
modelgrid gpu drivers
```

### Intel Arc (oneAPI)

Requires Intel GPU drivers and oneAPI toolkit:

```bash
# Check driver status
modelgrid gpu drivers
```

## Greenlit Models

ModelGrid uses a greenlit model system to control which models can be auto-pulled. The greenlist is fetched from a configurable URL and contains approved models with VRAM requirements:

```json
{
  "version": "1.0",
  "models": [
    { "name": "llama3:8b", "container": "ollama", "minVram": 8 },
    { "name": "mistral:7b", "container": "ollama", "minVram": 8 },
    { "name": "llama3:70b", "container": "vllm", "minVram": 48 }
  ]
}
```

When a request arrives for a model that is not currently loaded, ModelGrid:
1. Checks whether the model is in the greenlist
2. Verifies that the model's VRAM requirement can be met
3. Auto-pulls and loads the model
4. Serves the request
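
The steps above can be sketched as a small decision function. This is an illustration of the flow, not ModelGrid's implementation: `find_greenlit` and `decide` are hypothetical names, `minVram` is assumed to be in gigabytes, and a model that is missing from the greenlist or does not fit in VRAM is assumed to be rejected.

```python
from typing import Optional


def find_greenlit(greenlist: dict, model_name: str) -> Optional[dict]:
    """Return the greenlist entry for model_name, or None if it is not approved."""
    for entry in greenlist.get("models", []):
        if entry["name"] == model_name:
            return entry
    return None


def decide(greenlist: dict, model_name: str, free_vram_gb: float, loaded: set[str]) -> str:
    """Describe what to do with a request for model_name."""
    if model_name in loaded:
        return "serve"  # already loaded, nothing to pull
    entry = find_greenlit(greenlist, model_name)
    if entry is None:
        return "reject: not greenlit"  # step 1 failed
    if free_vram_gb < entry["minVram"]:
        return "reject: insufficient VRAM"  # step 2 failed
    # Steps 3 and 4: pull into the container type named by the entry, then serve.
    return f"pull into {entry['container']}, then serve"
```

With the example greenlist and 24 GB of free VRAM, `llama3:8b` would be pulled into an Ollama container, while `llama3:70b` would be rejected because it needs 48.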

## API Reference

### Chat Completions

```
POST /v1/chat/completions
```

OpenAI-compatible chat completion endpoint with streaming support.
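
When a request sets `"stream": true`, the OpenAI API sends the reply as server-sent events: one `data: {...}` line per chunk, ending with `data: [DONE]`. The sketch below assumes ModelGrid follows that same format; `iter_stream_text` is a hypothetical helper that extracts the text pieces from already decoded lines.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by OpenAI-style streaming chunks."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

The function takes any iterable of strings, so it can be fed the lines of an HTTP response body after decoding them as UTF-8.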

### Models

```
GET /v1/models
GET /v1/models/:model
```

List available models or get details for a specific model.

### Embeddings

```
POST /v1/embeddings
```

Generate text embeddings using compatible models.
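
A common use of embeddings is comparing two texts by cosine similarity. The sketch below assumes the response follows the OpenAI embeddings shape (a `data` list whose items carry `index` and `embedding`); `extract_vectors` and `cosine_similarity` are hypothetical helpers, not part of ModelGrid.

```python
import math


def extract_vectors(response: dict) -> list[list[float]]:
    """Return the embedding vectors from a response, ordered by input index."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Values close to 1 indicate that the two inputs point in nearly the same direction in embedding space, which is usually read as semantic similarity.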

## Development

### Building from Source

```bash
# Clone repository
git clone https://code.foss.global/modelgrid.com/modelgrid.git
cd modelgrid

# Run directly with Deno
deno run --allow-all mod.ts help

# Compile for current platform
deno compile --allow-all --output modelgrid mod.ts

# Compile for all platforms
bash scripts/compile-all.sh
```

### Project Structure

```
modelgrid/
├── mod.ts                  # Entry point
├── ts/
│   ├── cli.ts              # CLI command routing
│   ├── modelgrid.ts        # Main coordinator class
│   ├── daemon.ts           # Background daemon
│   ├── systemd.ts          # Systemd service management
│   ├── constants.ts        # Configuration constants
│   ├── interfaces/         # TypeScript interfaces
│   ├── hardware/           # GPU detection
│   ├── drivers/            # Driver management
│   ├── docker/             # Docker management
│   ├── containers/         # Container orchestration
│   ├── api/                # OpenAI-compatible API
│   ├── models/             # Model management
│   └── cli/                # CLI handlers
├── test/                   # Test files
└── scripts/                # Build scripts
```

## License

MIT License - See [license](./license) for details.

## Links

- Repository: https://code.foss.global/modelgrid.com/modelgrid
- Issues: https://community.foss.global/