ModelGrid Implementation Plan
Goal: GPU infrastructure management daemon with OpenAI-compatible API for AI model containers.
Architecture Overview
┌─────────────────────────────────────────────────────────────────┐
│                        ModelGrid Daemon                         │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │     CLI     │  │  Hardware   │  │    Container Manager    │  │
│  │  Commands   │  │  Detection  │  │     (Docker/Podman)     │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Driver    │  │    Model    │  │   OpenAI API Gateway    │  │
│  │  Installer  │  │  Registry   │  │      (HTTP Server)      │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
├─────────────────────────────────────────────────────────────────┤
│                         Systemd Service                         │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                        Container Runtime                        │
│    ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐    │
│    │  Ollama  │   │   vLLM   │   │   TGI    │   │  Custom  │    │
│    │Container │   │Container │   │Container │   │Container │    │
│    └──────────┘   └──────────┘   └──────────┘   └──────────┘    │
└─────────────────────────────────────────────────────────────────┘
Implementation Status
Completed Components
- Project structure and configuration (deno.json, package.json)
- TypeScript interfaces (ts/interfaces/)
- Logger and colors (ts/logger.ts, ts/colors.ts)
- Helper utilities (ts/helpers/)
- Constants (ts/constants.ts)
- Hardware detection (ts/hardware/)
- Driver management (ts/drivers/)
- Docker management (ts/docker/)
- Container orchestration (ts/containers/)
- Model management (ts/models/)
- OpenAI-compatible API (ts/api/)
- CLI router and handlers (ts/cli.ts, ts/cli/)
- Main coordinator (ts/modelgrid.ts)
- Daemon (ts/daemon.ts)
- Systemd integration (ts/systemd.ts)
- Build scripts (scripts/)
- Installation scripts (install.sh, uninstall.sh)
- CI/CD workflows (.gitea/workflows/)
- npm packaging (package.json, bin/, scripts/)
Pending Tasks
- Integration testing with real GPUs
- End-to-end API testing
- Documentation improvements
- First release (v1.0.0)
Directory Structure
modelgrid/
├── mod.ts                      # Deno entry point
├── ts/
│   ├── index.ts                # Node.js entry point
│   ├── cli.ts                  # CLI router
│   ├── modelgrid.ts            # Main coordinator
│   ├── daemon.ts               # Background daemon
│   ├── systemd.ts              # Systemd integration
│   ├── constants.ts            # Configuration constants
│   ├── logger.ts               # Logging utilities
│   ├── colors.ts               # Color themes
│   ├── interfaces/             # TypeScript interfaces
│   │   ├── index.ts
│   │   ├── config.ts           # IModelGridConfig
│   │   ├── gpu.ts              # IGpuInfo, IGpuStatus
│   │   ├── container.ts        # IContainerConfig, IContainerStatus
│   │   └── api.ts              # OpenAI API types
│   ├── hardware/               # Hardware detection
│   │   ├── index.ts
│   │   ├── gpu-detector.ts     # Multi-vendor GPU detection
│   │   └── system-info.ts      # System information
│   ├── drivers/                # Driver management
│   │   ├── index.ts
│   │   ├── nvidia.ts           # NVIDIA/CUDA
│   │   ├── amd.ts              # AMD/ROCm
│   │   ├── intel.ts            # Intel Arc/oneAPI
│   │   └── base-driver.ts      # Abstract driver class
│   ├── docker/                 # Docker management
│   │   ├── index.ts
│   │   ├── docker-manager.ts   # Docker operations
│   │   └── container-runtime.ts
│   ├── containers/             # Container orchestration
│   │   ├── index.ts
│   │   ├── ollama.ts           # Ollama container
│   │   ├── vllm.ts             # vLLM container
│   │   ├── tgi.ts              # TGI container
│   │   └── base-container.ts   # Abstract container class
│   ├── api/                    # OpenAI-compatible API
│   │   ├── index.ts
│   │   ├── server.ts           # HTTP server
│   │   ├── router.ts           # Request routing
│   │   ├── handlers/           # Endpoint handlers
│   │   │   ├── chat.ts         # /v1/chat/completions
│   │   │   ├── models.ts       # /v1/models
│   │   │   └── embeddings.ts   # /v1/embeddings
│   │   └── middleware/         # Request processing
│   │       ├── auth.ts         # API key validation
│   │       ├── sanity.ts       # Request validation
│   │       └── proxy.ts        # Container proxy
│   ├── models/                 # Model management
│   │   ├── index.ts
│   │   ├── registry.ts         # Model registry
│   │   └── loader.ts           # Model loading
│   └── cli/                    # CLI handlers
│       ├── service-handler.ts
│       ├── gpu-handler.ts
│       ├── container-handler.ts
│       ├── model-handler.ts
│       └── config-handler.ts
├── test/                       # Test files
├── scripts/                    # Build scripts
├── bin/                        # npm wrapper
└── docs/                       # Documentation
CLI Commands
modelgrid service enable # Install systemd service
modelgrid service disable # Remove systemd service
modelgrid service start # Start daemon
modelgrid service stop # Stop daemon
modelgrid service status # Show status
modelgrid service logs # Show logs
modelgrid gpu list # List detected GPUs
modelgrid gpu status # Show GPU utilization
modelgrid gpu drivers # Check/install drivers
modelgrid container add # Add container config
modelgrid container remove # Remove container
modelgrid container list # List containers
modelgrid container start # Start container
modelgrid container stop # Stop container
modelgrid model list # List available models
modelgrid model pull <name> # Pull model
modelgrid model remove <name> # Remove model
modelgrid config show # Show configuration
modelgrid config init # Initialize configuration
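The commands above all follow a `modelgrid <group> <action> [args]` shape. As a rough illustration of how a router could validate that shape before dispatching to a handler, here is a minimal sketch. The `COMMANDS` table is copied from the list above; `parseCommand` and `ParsedCommand` are hypothetical names and are not taken from ts/cli.ts.

```typescript
// Command groups and their actions, as listed in the CLI Commands section.
const COMMANDS = {
  service: ["enable", "disable", "start", "stop", "status", "logs"],
  gpu: ["list", "status", "drivers"],
  container: ["add", "remove", "list", "start", "stop"],
  model: ["list", "pull", "remove"],
  config: ["show", "init"],
} as const;

type CommandGroup = keyof typeof COMMANDS;

interface ParsedCommand {
  group: CommandGroup;
  action: string;
  args: string[];
}

// Hypothetical helper: takes the arguments after the `modelgrid` binary name
// and checks them against the table above.
function parseCommand(argv: string[]): ParsedCommand {
  const [group, action, ...args] = argv;
  if (
    group === undefined ||
    !Object.prototype.hasOwnProperty.call(COMMANDS, group)
  ) {
    throw new Error(`Unknown command group: ${group ?? "(none)"}`);
  }
  const actions: readonly string[] = COMMANDS[group as CommandGroup];
  if (action === undefined || !actions.includes(action)) {
    throw new Error(`Unknown action for "${group}": ${action ?? "(none)"}`);
  }
  return { group: group as CommandGroup, action, args };
}

// Example: the arguments of `modelgrid model pull llama3:8b`.
const parsedPull = parseCommand(["model", "pull", "llama3:8b"]);
```

Keeping the table as data means help output and validation could share one source of truth, though the real router may be organized differently.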
API Endpoints
- GET /v1/models - List available models
- GET /v1/models/:model - Get model details
- POST /v1/chat/completions - Chat completions (streaming supported)
- POST /v1/embeddings - Generate embeddings
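To show what a client call against the chat endpoint might look like, here is a minimal sketch that prepares a POST /v1/chat/completions request. Several things are assumptions rather than facts from this plan: the base URL and port, the key value, and that the API key is sent as a `Bearer` token in the `Authorization` header (the usual OpenAI convention). `buildChatRequest` is a hypothetical helper.

```typescript
// Message shape used by OpenAI-style chat completion requests.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the URL and fetch options for the chat endpoint.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  stream = false,
): PreparedRequest {
  return {
    url: new URL("/v1/chat/completions", baseUrl).toString(),
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: Bearer-token auth, as in the OpenAI API.
        "Authorization": `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}

// Placeholder address and key; substitute the values from your own setup.
const prepared = buildChatRequest(
  "http://localhost:8080",
  "example-key",
  "llama3:8b",
  [{ role: "user", content: "Hello" }],
);
// To send it: const response = await fetch(prepared.url, prepared.init);
```

Separating request construction from the `fetch` call keeps the sketch testable without a running daemon.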
Greenlit Model System
Models are controlled via a remote greenlist to prevent arbitrary downloads:
{
"version": "1.0",
"models": [
{ "name": "llama3:8b", "container": "ollama", "minVram": 8 },
{ "name": "mistral:7b", "container": "ollama", "minVram": 8 },
{ "name": "llama3:70b", "container": "vllm", "minVram": 48 }
]
}
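A greenlist in this format could be enforced with a lookup before any download starts. The following is a minimal sketch, not the logic in ts/models/. It assumes `minVram` is expressed in GB (the plan does not state the unit) and that a request is rejected when the model is unlisted or the GPU has too little memory. `checkGreenlist` and the type names are hypothetical.

```typescript
// Shape of the greenlist document shown above.
interface GreenlitModel {
  name: string;
  container: string;
  minVram: number; // Assumption: gigabytes of VRAM.
}

interface Greenlist {
  version: string;
  models: GreenlitModel[];
}

type GreenlistDecision =
  | { allowed: true; container: string }
  | { allowed: false; reason: string };

// Hypothetical helper: decides whether a model may be pulled and, if so,
// which container type should serve it.
function checkGreenlist(
  list: Greenlist,
  modelName: string,
  availableVramGb: number,
): GreenlistDecision {
  const entry = list.models.find((m) => m.name === modelName);
  if (entry === undefined) {
    return { allowed: false, reason: `model "${modelName}" is not greenlit` };
  }
  if (availableVramGb < entry.minVram) {
    return {
      allowed: false,
      reason: `model "${modelName}" needs ${entry.minVram} GB VRAM, ` +
        `only ${availableVramGb} GB available`,
    };
  }
  return { allowed: true, container: entry.container };
}

// The example greenlist from this section.
const exampleGreenlist: Greenlist = {
  version: "1.0",
  models: [
    { name: "llama3:8b", container: "ollama", minVram: 8 },
    { name: "mistral:7b", container: "ollama", minVram: 8 },
    { name: "llama3:70b", container: "vllm", minVram: 48 },
  ],
};

const smallModel = checkGreenlist(exampleGreenlist, "llama3:8b", 24);
const largeModel = checkGreenlist(exampleGreenlist, "llama3:70b", 24);
const unlistedModel = checkGreenlist(exampleGreenlist, "not-listed:1b", 24);
```

With 24 GB available, the first check passes and selects the `ollama` container, while the other two are rejected for insufficient VRAM and for being absent from the list.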
Supported Platforms
- Linux x64 (x86_64)
- Linux ARM64 (aarch64)
- macOS Intel (x86_64)
- macOS Apple Silicon (ARM64)
- Windows x64