Control plane
The control plane is the management hub — it runs K3s, GPU VRAM partitioning, the NUSAPOD backend, and the web console. It does not need a GPU.
- OS: Ubuntu 22.04 / 24.04 or Debian 12 (recommended); RHEL / Rocky / Alma 9 works. x86_64 or arm64.
- Minimum hardware: 2 vCPU / 4 GB RAM / 20 GB disk (more for long Prometheus retention or many pods).
- Access: root or
sudoon the host. - Network: a stable IP or DNS name; clock NTP-synced.
- License: a valid license token is required to install.
Prerequisites
You need a license token before running the installer. Request one — we will send it to you by email.
Network & ports
| Direction | Port | Purpose |
|---|---|---|
| Inbound → control plane | 6443/TCP | GPU nodes join the K3s API |
| Inbound → control plane | 80 / 443 | Console + API; nodes fetch /install.sh |
| Outbound ← control plane | 443 | get.nusapod.app, registry.nusapod.app, api.nusapod.app, get.k3s.io, Helm chart repos |
| Outbound ← GPU node | 443 + 6443 | NVIDIA toolkit repo + K3s API (nodes dial out; no inbound) |
GPU node
GPU nodes supply the VRAM that NUSAPOD slices and schedules. Each node must have an NVIDIA GPU with the driver already installed — verify before adding the node:
bash
nvidia-smiIf nvidia-smi fails, install the kernel driver first (Ubuntu: sudo ubuntu-drivers install) and reboot. The NUSAPOD installer adds nvidia-container-toolkit and sets nvidia as the default containerd runtime automatically.
- Same OS tiers as the control plane (Ubuntu 22.04 / 24.04, Debian 12, RHEL 9).
- Nodes only dial out — no inbound ports need to be opened on the GPU host.
See the GPU node install guide for the full join walkthrough.
License & quota
max_gpusquota is enforced across the fleet — the scheduler will not admit pods that would exceed it. This GPU count is the only usage limit.- Expiry is strict at image pull / activation, with a grace window at runtime.
Limitations & good to know
- Single-operator, role-based(viewer < operator < admin) — there is no multi-tenant / workspace model.
- VRAM partitioning is done in software (per-pod VRAM limits + compute share — not hardware MIG partitioning).
- The control plane is a single K3s server — no built-in HA.
- Models are served with vLLM, exposed as an OpenAI-compatible endpoint; GPUs must be CUDA-capable.
- Internet access is required during install and operation. Air-gapped installs are possible but advanced — contact us.
