# ModelGrid Implementation Plan

**Goal**: Build a GPU infrastructure management daemon that exposes an OpenAI-compatible API in front of AI model containers.

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                        ModelGrid Daemon                          │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   CLI       │  │  Hardware   │  │   Container Manager     │  │
│  │  Commands   │  │  Detection  │  │  (Docker/Podman)        │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Driver    │  │   Model     │  │   OpenAI API Gateway    │  │
│  │  Installer  │  │  Registry   │  │  (HTTP Server)          │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
├─────────────────────────────────────────────────────────────────┤
│                     Systemd Service                              │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Container Runtime                             │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐        │
│  │ Ollama   │  │  vLLM    │  │   TGI    │  │ Custom   │        │
│  │Container │  │Container │  │Container │  │Container │        │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘        │
└─────────────────────────────────────────────────────────────────┘
```

---

## Implementation Status

### Completed Components

- [x] Project structure and configuration (deno.json, package.json)
- [x] TypeScript interfaces (ts/interfaces/)
- [x] Logger and colors (ts/logger.ts, ts/colors.ts)
- [x] Helper utilities (ts/helpers/)
- [x] Constants (ts/constants.ts)
- [x] Hardware detection (ts/hardware/)
- [x] Driver management (ts/drivers/)
- [x] Docker management (ts/docker/)
- [x] Container orchestration (ts/containers/)
- [x] Model management (ts/models/)
- [x] OpenAI-compatible API (ts/api/)
- [x] CLI router and handlers (ts/cli.ts, ts/cli/)
- [x] Main coordinator (ts/modelgrid.ts)
- [x] Daemon (ts/daemon.ts)
- [x] Systemd integration (ts/systemd.ts)
- [x] Build scripts (scripts/)
- [x] Installation scripts (install.sh, uninstall.sh)
- [x] CI/CD workflows (.gitea/workflows/)
- [x] npm packaging (package.json, bin/, scripts/)

### Pending Tasks

- [ ] Integration testing with real GPUs
- [ ] End-to-end API testing
- [ ] Documentation improvements
- [ ] First release (v1.0.0)

---

## Directory Structure

```
modelgrid/
├── mod.ts                    # Deno entry point
├── ts/
│   ├── index.ts              # Node.js entry point
│   ├── cli.ts                # CLI router
│   ├── modelgrid.ts          # Main coordinator
│   ├── daemon.ts             # Background daemon
│   ├── systemd.ts            # Systemd integration
│   ├── constants.ts          # Configuration constants
│   ├── logger.ts             # Logging utilities
│   ├── colors.ts             # Color themes
│   ├── interfaces/           # TypeScript interfaces
│   │   ├── index.ts
│   │   ├── config.ts         # IModelGridConfig
│   │   ├── gpu.ts            # IGpuInfo, IGpuStatus
│   │   ├── container.ts      # IContainerConfig, IContainerStatus
│   │   └── api.ts            # OpenAI API types
│   ├── hardware/             # Hardware detection
│   │   ├── index.ts
│   │   ├── gpu-detector.ts   # Multi-vendor GPU detection
│   │   └── system-info.ts    # System information
│   ├── drivers/              # Driver management
│   │   ├── index.ts
│   │   ├── nvidia.ts         # NVIDIA/CUDA
│   │   ├── amd.ts            # AMD/ROCm
│   │   ├── intel.ts          # Intel Arc/oneAPI
│   │   └── base-driver.ts    # Abstract driver class
│   ├── docker/               # Docker management
│   │   ├── index.ts
│   │   ├── docker-manager.ts # Docker operations
│   │   └── container-runtime.ts
│   ├── containers/           # Container orchestration
│   │   ├── index.ts
│   │   ├── ollama.ts         # Ollama container
│   │   ├── vllm.ts           # vLLM container
│   │   ├── tgi.ts            # TGI container
│   │   └── base-container.ts # Abstract container class
│   ├── api/                  # OpenAI-compatible API
│   │   ├── index.ts
│   │   ├── server.ts         # HTTP server
│   │   ├── router.ts         # Request routing
│   │   ├── handlers/         # Endpoint handlers
│   │   │   ├── chat.ts       # /v1/chat/completions
│   │   │   ├── models.ts     # /v1/models
│   │   │   └── embeddings.ts # /v1/embeddings
│   │   └── middleware/       # Request processing
│   │       ├── auth.ts       # API key validation
│   │       ├── sanity.ts     # Request validation
│   │       └── proxy.ts      # Container proxy
│   ├── models/               # Model management
│   │   ├── index.ts
│   │   ├── registry.ts       # Model registry
│   │   └── loader.ts         # Model loading
│   └── cli/                  # CLI handlers
│       ├── service-handler.ts
│       ├── gpu-handler.ts
│       ├── container-handler.ts
│       ├── model-handler.ts
│       └── config-handler.ts
├── test/                     # Test files
├── scripts/                  # Build scripts
├── bin/                      # npm wrapper
└── docs/                     # Documentation
```
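
The `base-container.ts` / `ollama.ts` / `vllm.ts` split suggests an abstract class with one subclass per runtime. The following is a minimal sketch of that pattern, not the actual ModelGrid code: the class names, the `buildRunArgs` method, and the chosen host ports are hypothetical, and `--gpus all` assumes an NVIDIA setup with the NVIDIA Container Toolkit installed. The sketch only builds the `docker run` argument list and does not execute anything.

```typescript
// Hypothetical sketch: names are illustrative, not taken from the ModelGrid source.
abstract class BaseContainer {
  readonly name: string;
  readonly hostPort: number;

  // Each runtime supplies its own image and the port it listens on.
  abstract readonly image: string;
  abstract readonly containerPort: number;

  constructor(name: string, hostPort: number) {
    this.name = name;
    this.hostPort = hostPort;
  }

  // Build the argument list for `docker run`; nothing is executed here.
  buildRunArgs(): string[] {
    return [
      "run",
      "-d",
      "--name", this.name,
      "--gpus", "all", // assumption: NVIDIA GPUs; AMD and Intel need different device flags
      "-p", `${this.hostPort}:${this.containerPort}`,
      this.image,
    ];
  }
}

class OllamaContainer extends BaseContainer {
  readonly image = "ollama/ollama";
  readonly containerPort = 11434; // Ollama's default listen port
}

class VllmContainer extends BaseContainer {
  readonly image = "vllm/vllm-openai";
  readonly containerPort = 8000; // default port of vLLM's OpenAI-compatible server
}

const ollama = new OllamaContainer("ollama-main", 11434);
console.log(["docker", ...ollama.buildRunArgs()].join(" "));
```

Keeping the shared flags in the base class means a new runtime only has to declare its image and port.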

---

## CLI Commands

```
modelgrid service enable      # Install systemd service
modelgrid service disable     # Remove systemd service
modelgrid service start       # Start daemon
modelgrid service stop        # Stop daemon
modelgrid service status      # Show status
modelgrid service logs        # Show logs

modelgrid gpu list            # List detected GPUs
modelgrid gpu status          # Show GPU utilization
modelgrid gpu drivers         # Check/install drivers

modelgrid container add       # Add container config
modelgrid container remove    # Remove container
modelgrid container list      # List containers
modelgrid container start     # Start container
modelgrid container stop      # Stop container

modelgrid model list          # List available models
modelgrid model pull <name>   # Pull model
modelgrid model remove <name> # Remove model

modelgrid config show         # Show configuration
modelgrid config init         # Initialize configuration
```
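
The command list above follows a `modelgrid <group> <action> [args]` shape, which a router can validate against a lookup table before dispatching to a handler. The sketch below illustrates one way to do that. `COMMANDS`, `ParsedCommand`, and `parseCommand` are hypothetical names, and the real `ts/cli.ts` may be structured differently.

```typescript
// Hypothetical sketch of group/action validation; not the actual ts/cli.ts.
const COMMANDS = new Map<string, readonly string[]>([
  ["service", ["enable", "disable", "start", "stop", "status", "logs"]],
  ["gpu", ["list", "status", "drivers"]],
  ["container", ["add", "remove", "list", "start", "stop"]],
  ["model", ["list", "pull", "remove"]],
  ["config", ["show", "init"]],
]);

interface ParsedCommand {
  group: string;
  action: string;
  args: string[];
}

// Parse the arguments that follow the `modelgrid` binary name.
function parseCommand(argv: string[]): ParsedCommand {
  const group = argv[0];
  const action = argv[1];
  const args = argv.slice(2);

  const actions = group === undefined ? undefined : COMMANDS.get(group);
  if (group === undefined || actions === undefined) {
    throw new Error(`Unknown command group: ${group ?? "(none)"}`);
  }
  if (action === undefined || !actions.includes(action)) {
    throw new Error(`Unknown action for "${group}": ${action ?? "(none)"}`);
  }
  // `model pull` and `model remove` take a <name> argument in the list above.
  if (group === "model" && action !== "list" && args.length === 0) {
    throw new Error(`"model ${action}" requires a model name`);
  }
  return { group, action, args };
}

console.log(parseCommand(["model", "pull", "llama3:8b"]));
```

Validating in one place keeps the per-group handlers in `ts/cli/` free of argument-shape checks.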

---

## API Endpoints

- `GET /v1/models` - List available models
- `GET /v1/models/:model` - Get model details
- `POST /v1/chat/completions` - Chat completions (streaming supported)
- `POST /v1/embeddings` - Generate embeddings
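
As an illustration of how these four endpoints could be dispatched, the sketch below matches a method and path against a small route table and extracts the `:model` parameter. It is an assumption about the routing approach, not the contents of `ts/api/router.ts`. The handler names and the `matchRoute` function are hypothetical.

```typescript
// Hypothetical route table; handler names are placeholders.
interface Route {
  method: string;
  pattern: string;
  handler: string;
}

interface RouteMatch {
  handler: string;
  params: Record<string, string>;
}

const ROUTES: Route[] = [
  { method: "GET", pattern: "/v1/models", handler: "listModels" },
  { method: "GET", pattern: "/v1/models/:model", handler: "getModel" },
  { method: "POST", pattern: "/v1/chat/completions", handler: "chatCompletions" },
  { method: "POST", pattern: "/v1/embeddings", handler: "embeddings" },
];

// Return the first route whose method and path segments match, or null.
function matchRoute(method: string, path: string): RouteMatch | null {
  const pathParts = path.split("/").filter((part) => part.length > 0);
  for (const route of ROUTES) {
    if (route.method !== method) continue;
    const patternParts = route.pattern.split("/").filter((part) => part.length > 0);
    if (patternParts.length !== pathParts.length) continue;

    const params: Record<string, string> = {};
    const matched = patternParts.every((part, i) => {
      const value = pathParts[i] ?? "";
      if (part.startsWith(":")) {
        // Segments starting with ":" capture a named parameter.
        params[part.slice(1)] = decodeURIComponent(value);
        return true;
      }
      return part === value;
    });
    if (matched) return { handler: route.handler, params };
  }
  return null;
}

console.log(matchRoute("GET", "/v1/models/llama3:8b"));
console.log(matchRoute("POST", "/v1/models"));
```

A request that matches no route would typically be answered with a 404. Note that `decodeURIComponent` throws on malformed percent-encoding, so a production router would need to catch that.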

---

## Greenlit Model System

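Only models listed on a remote greenlist can be pulled, which prevents arbitrary downloads. Each entry names the model, the container type that serves it, and its minimum VRAM requirement (`minVram`):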

```json
{
  "version": "1.0",
  "models": [
    { "name": "llama3:8b", "container": "ollama", "minVram": 8 },
    { "name": "mistral:7b", "container": "ollama", "minVram": 8 },
    { "name": "llama3:70b", "container": "vllm", "minVram": 48 }
  ]
}
```
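
A check against this greenlist could look like the sketch below: a pull is allowed only if the model is listed and the host has at least `minVram` available. The type names and `checkGreenlist` are hypothetical, the list is inlined rather than fetched remotely, and the unit of `minVram` is assumed to be gigabytes because the plan does not state it.

```typescript
// Hypothetical types mirroring the greenlist JSON above.
interface GreenlitModel {
  name: string;
  container: string;
  minVram: number; // assumption: gigabytes; the plan does not state the unit
}

interface Greenlist {
  version: string;
  models: GreenlitModel[];
}

type GreenlistDecision =
  | { allowed: true; model: GreenlitModel }
  | { allowed: false; reason: string };

// Decide whether a model may be pulled on a host with `availableVram`.
function checkGreenlist(
  list: Greenlist,
  modelName: string,
  availableVram: number,
): GreenlistDecision {
  const model = list.models.find((entry) => entry.name === modelName);
  if (model === undefined) {
    return { allowed: false, reason: `model "${modelName}" is not greenlit` };
  }
  if (availableVram < model.minVram) {
    return {
      allowed: false,
      reason: `model "${modelName}" needs minVram ${model.minVram}, only ${availableVram} available`,
    };
  }
  return { allowed: true, model };
}

// Inlined copy of the example greenlist; a real daemon would fetch it remotely.
const greenlist: Greenlist = {
  version: "1.0",
  models: [
    { name: "llama3:8b", container: "ollama", minVram: 8 },
    { name: "mistral:7b", container: "ollama", minVram: 8 },
    { name: "llama3:70b", container: "vllm", minVram: 48 },
  ],
};

console.log(checkGreenlist(greenlist, "llama3:8b", 24));
console.log(checkGreenlist(greenlist, "llama3:70b", 24));
```

Returning a reason string alongside the rejection lets the CLI and the API report why a pull was refused.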

---

## Supported Platforms

- Linux x64 (x86_64)
- Linux ARM64 (aarch64)
- macOS Intel (x86_64)
- macOS Apple Silicon (ARM64)
- Windows x64