modelgrid/readme.plan.md
Juergen Kunz daaf6559e3
initial
2026-01-30 03:16:57 +00:00

ModelGrid Implementation Plan

Goal: Build a daemon that manages GPU infrastructure and serves AI model containers through an OpenAI-compatible API.


Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                        ModelGrid Daemon                          │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   CLI       │  │  Hardware   │  │   Container Manager     │  │
│  │  Commands   │  │  Detection  │  │  (Docker/Podman)        │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Driver    │  │   Model     │  │   OpenAI API Gateway    │  │
│  │  Installer  │  │  Registry   │  │  (HTTP Server)          │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
├─────────────────────────────────────────────────────────────────┤
│                     Systemd Service                              │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Container Runtime                             │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐        │
│  │ Ollama   │  │  vLLM    │  │   TGI    │  │ Custom   │        │
│  │Container │  │Container │  │Container │  │Container │        │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘        │
└─────────────────────────────────────────────────────────────────┘
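
One way to read the diagram: the API gateway receives a request that names a model, then forwards it to whichever container serves that model. The sketch below illustrates that lookup only. All names in it (ContainerKind, ContainerEndpoint, resolveUpstream) and the listen addresses are assumptions for illustration, not the project's actual interfaces.

```typescript
// Hypothetical types; the real ones would live in ts/interfaces/.
type ContainerKind = "ollama" | "vllm" | "tgi" | "custom";

interface ContainerEndpoint {
  kind: ContainerKind;
  baseUrl: string;   // where the container listens (assumed values below)
  models: string[];  // models this container currently serves
}

// Return the first container that serves the requested model,
// or undefined when no container does.
function resolveUpstream(
  model: string,
  endpoints: ContainerEndpoint[],
): ContainerEndpoint | undefined {
  return endpoints.find((e) => e.models.includes(model));
}

// Example table; addresses are placeholders, not project defaults.
const endpoints: ContainerEndpoint[] = [
  {
    kind: "ollama",
    baseUrl: "http://127.0.0.1:11434",
    models: ["llama3:8b", "mistral:7b"],
  },
  {
    kind: "vllm",
    baseUrl: "http://127.0.0.1:8000",
    models: ["llama3:70b"],
  },
];

const upstream = resolveUpstream("llama3:70b", endpoints);
console.log(upstream?.kind ?? "no container serves this model");
```

A flat list with a linear search keeps the sketch short; a daemon with many containers might prefer a map keyed by model name.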

Implementation Status

Completed Components

  • Project structure and configuration (deno.json, package.json)
  • TypeScript interfaces (ts/interfaces/)
  • Logger and colors (ts/logger.ts, ts/colors.ts)
  • Helper utilities (ts/helpers/)
  • Constants (ts/constants.ts)
  • Hardware detection (ts/hardware/)
  • Driver management (ts/drivers/)
  • Docker management (ts/docker/)
  • Container orchestration (ts/containers/)
  • Model management (ts/models/)
  • OpenAI-compatible API (ts/api/)
  • CLI router and handlers (ts/cli.ts, ts/cli/)
  • Main coordinator (ts/modelgrid.ts)
  • Daemon (ts/daemon.ts)
  • Systemd integration (ts/systemd.ts)
  • Build scripts (scripts/)
  • Installation scripts (install.sh, uninstall.sh)
  • CI/CD workflows (.gitea/workflows/)
  • npm packaging (package.json, bin/, scripts/)

Pending Tasks

  • Integration testing with real GPUs
  • End-to-end API testing
  • Documentation improvements
  • First release (v1.0.0)

Directory Structure

modelgrid/
├── mod.ts                    # Deno entry point
├── ts/
│   ├── index.ts              # Node.js entry point
│   ├── cli.ts                # CLI router
│   ├── modelgrid.ts          # Main coordinator
│   ├── daemon.ts             # Background daemon
│   ├── systemd.ts            # Systemd integration
│   ├── constants.ts          # Configuration constants
│   ├── logger.ts             # Logging utilities
│   ├── colors.ts             # Color themes
│   ├── interfaces/           # TypeScript interfaces
│   │   ├── index.ts
│   │   ├── config.ts         # IModelGridConfig
│   │   ├── gpu.ts            # IGpuInfo, IGpuStatus
│   │   ├── container.ts      # IContainerConfig, IContainerStatus
│   │   └── api.ts            # OpenAI API types
│   ├── hardware/             # Hardware detection
│   │   ├── index.ts
│   │   ├── gpu-detector.ts   # Multi-vendor GPU detection
│   │   └── system-info.ts    # System information
│   ├── drivers/              # Driver management
│   │   ├── index.ts
│   │   ├── nvidia.ts         # NVIDIA/CUDA
│   │   ├── amd.ts            # AMD/ROCm
│   │   ├── intel.ts          # Intel Arc/oneAPI
│   │   └── base-driver.ts    # Abstract driver class
│   ├── docker/               # Docker management
│   │   ├── index.ts
│   │   ├── docker-manager.ts # Docker operations
│   │   └── container-runtime.ts
│   ├── containers/           # Container orchestration
│   │   ├── index.ts
│   │   ├── ollama.ts         # Ollama container
│   │   ├── vllm.ts           # vLLM container
│   │   ├── tgi.ts            # TGI container
│   │   └── base-container.ts # Abstract container class
│   ├── api/                  # OpenAI-compatible API
│   │   ├── index.ts
│   │   ├── server.ts         # HTTP server
│   │   ├── router.ts         # Request routing
│   │   ├── handlers/         # Endpoint handlers
│   │   │   ├── chat.ts       # /v1/chat/completions
│   │   │   ├── models.ts     # /v1/models
│   │   │   └── embeddings.ts # /v1/embeddings
│   │   └── middleware/       # Request processing
│   │       ├── auth.ts       # API key validation
│   │       ├── sanity.ts     # Request validation
│   │       └── proxy.ts      # Container proxy
│   ├── models/               # Model management
│   │   ├── index.ts
│   │   ├── registry.ts       # Model registry
│   │   └── loader.ts         # Model loading
│   └── cli/                  # CLI handlers
│       ├── service-handler.ts
│       ├── gpu-handler.ts
│       ├── container-handler.ts
│       ├── model-handler.ts
│       └── config-handler.ts
├── test/                     # Test files
├── scripts/                  # Build scripts
├── bin/                      # npm wrapper
└── docs/                     # Documentation

CLI Commands

modelgrid service enable      # Install systemd service
modelgrid service disable     # Remove systemd service
modelgrid service start       # Start daemon
modelgrid service stop        # Stop daemon
modelgrid service status      # Show status
modelgrid service logs        # Show logs

modelgrid gpu list            # List detected GPUs
modelgrid gpu status          # Show GPU utilization
modelgrid gpu drivers         # Check/install drivers

modelgrid container add       # Add container config
modelgrid container remove    # Remove container
modelgrid container list      # List containers
modelgrid container start     # Start container
modelgrid container stop      # Stop container

modelgrid model list          # List available models
modelgrid model pull <name>   # Pull model
modelgrid model remove <name> # Remove model

modelgrid config show         # Show configuration
modelgrid config init         # Initialize configuration
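
The commands above all follow a `modelgrid <group> <action> [args]` shape, which a router can validate before dispatching to a handler. The following is a minimal sketch of that validation step. ParsedCommand, COMMANDS, and parseCommand are hypothetical names, and the real ts/cli.ts may be organized differently.

```typescript
// Hypothetical result type for a parsed command line.
interface ParsedCommand {
  group: string;
  action: string;
  args: string[];
}

// Command groups and actions, copied from the list above.
const COMMANDS = new Map<string, readonly string[]>([
  ["service", ["enable", "disable", "start", "stop", "status", "logs"]],
  ["gpu", ["list", "status", "drivers"]],
  ["container", ["add", "remove", "list", "start", "stop"]],
  ["model", ["list", "pull", "remove"]],
  ["config", ["show", "init"]],
]);

// Split argv (without the program name) into group, action, and
// remaining arguments; throw on anything not in the table.
function parseCommand(argv: string[]): ParsedCommand {
  const [group, action, ...args] = argv;
  const actions = group === undefined ? undefined : COMMANDS.get(group);
  if (group === undefined || actions === undefined) {
    throw new Error(`Unknown command group: ${group ?? "(none)"}`);
  }
  if (action === undefined || !actions.includes(action)) {
    throw new Error(`Unknown action for "${group}": ${action ?? "(none)"}`);
  }
  return { group, action, args };
}

const parsed = parseCommand(["model", "pull", "llama3:8b"]);
console.log(parsed.group, parsed.action, parsed.args);
```

Under Deno, the input would typically be `Deno.args`; under Node.js, `process.argv.slice(2)`.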

API Endpoints

  • GET /v1/models - List available models
  • GET /v1/models/:model - Get model details
  • POST /v1/chat/completions - Chat completions (streaming supported)
  • POST /v1/embeddings - Generate embeddings
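
As an illustration of calling the chat endpoint, the sketch below builds the request a client might send. The base URL, port, and API key are placeholders. The `Authorization: Bearer` header follows the OpenAI convention and is assumed here rather than confirmed for ModelGrid. buildChatRequest and the types around it are hypothetical helpers.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// URL plus the options object that fetch() accepts as its second argument.
interface PreparedRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

// Build a POST /v1/chat/completions request without sending it.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  stream = false,
): PreparedRequest {
  // Drop trailing slashes so the path is not doubled.
  const url = `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`;
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${apiKey}`, // assumed auth scheme
      },
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}

// Placeholder address and key; substitute the daemon's real values.
const chatRequest = buildChatRequest(
  "http://localhost:8080/",
  "sk-example",
  "llama3:8b",
  [{ role: "user", content: "Hello" }],
);
console.log(chatRequest.url);
// To send it: const res = await fetch(chatRequest.url, chatRequest.init);
```

Separating request construction from the network call makes the payload easy to inspect and test without a running daemon.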

Greenlit Model System

Only models named in a remotely hosted greenlist can be pulled, which prevents arbitrary downloads. An example greenlist:

{
  "version": "1.0",
  "models": [
    { "name": "llama3:8b", "container": "ollama", "minVram": 8 },
    { "name": "mistral:7b", "container": "ollama", "minVram": 8 },
    { "name": "llama3:70b", "container": "vllm", "minVram": 48 }
  ]
}
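
A greenlist check could look roughly like the sketch below, which rejects models that are not listed or that need more VRAM than is available. It assumes `minVram` is expressed in gigabytes, which the example does not state. The type and function names (GreenlitModel, Greenlist, checkModel) are hypothetical.

```typescript
interface GreenlitModel {
  name: string;
  container: string;
  minVram: number; // assumed to be gigabytes
}

interface Greenlist {
  version: string;
  models: GreenlitModel[];
}

type Decision =
  | { allowed: true; container: string }
  | { allowed: false; reason: string };

// Decide whether a model may be pulled on a machine with the given VRAM.
function checkModel(
  list: Greenlist,
  name: string,
  availableVramGb: number,
): Decision {
  const entry = list.models.find((m) => m.name === name);
  if (entry === undefined) {
    return { allowed: false, reason: `"${name}" is not on the greenlist` };
  }
  if (availableVramGb < entry.minVram) {
    return {
      allowed: false,
      reason: `"${name}" needs ${entry.minVram} GB VRAM, ${availableVramGb} GB available`,
    };
  }
  return { allowed: true, container: entry.container };
}

// The example greenlist from above.
const greenlist: Greenlist = {
  version: "1.0",
  models: [
    { name: "llama3:8b", container: "ollama", minVram: 8 },
    { name: "mistral:7b", container: "ollama", minVram: 8 },
    { name: "llama3:70b", container: "vllm", minVram: 48 },
  ],
};

console.log(checkModel(greenlist, "llama3:8b", 24));
console.log(checkModel(greenlist, "llama3:70b", 24));
```

Returning a reason alongside the refusal lets the CLI and the API report why a pull was denied.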

Supported Platforms

  • Linux x64 (x86_64)
  • Linux ARM64 (aarch64)
  • macOS Intel (x86_64)
  • macOS Apple Silicon (ARM64)
  • Windows x64