Requirements

Control plane

The control plane is the management hub — it runs K3s, GPU VRAM partitioning, the NUSAPOD backend, and the web console. It does not need a GPU.

  • OS: Ubuntu 22.04 / 24.04 or Debian 12 (recommended); RHEL / Rocky / Alma 9 also work. Architecture: x86_64 or arm64.
  • Minimum hardware: 2 vCPU / 4 GB RAM / 20 GB disk (more for long Prometheus retention or many pods).
  • Access: root or sudo on the host.
  • Network: a stable IP address or DNS name, with the system clock synchronized via NTP.
  • License: a valid license token is required to install.
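
The hardware minimums above can be checked before running the installer. The following is a minimal sketch, not part of the NUSAPOD installer: the function names (`meets_minimum`, `check`, `preflight`) are hypothetical, the 3800 MB RAM threshold is an assumption that allows for memory the kernel reserves on a 4 GB machine, and it assumes the 20 GB figure refers to free space on `/`.

```bash
#!/usr/bin/env bash
# Hypothetical preflight sketch for a control plane host (Linux only).

# meets_minimum ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED.
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# check LABEL ACTUAL REQUIRED UNIT: print one result line, fail if below minimum.
check() {
  if meets_minimum "$2" "$3"; then
    echo "ok   $1: $2 $4 (need >= $3)"
  else
    echo "FAIL $1: $2 $4 (need >= $3)"
    return 1
  fi
}

# preflight: read CPU, RAM, and free disk from the host and compare them
# with the documented minimums (2 vCPU / 4 GB RAM / 20 GB disk).
preflight() {
  local cpus mem_mb disk_gb rc=0
  cpus=$(nproc)
  mem_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
  disk_gb=$(df -Pk / | awk 'NR==2 {print int($4 / 1024 / 1024)}')
  check "vCPU" "$cpus" 2 "cores" || rc=1
  check "RAM" "$mem_mb" 3800 "MB" || rc=1   # assumed tolerance below 4096 MB
  check "free disk on /" "$disk_gb" 20 "GB" || rc=1
  return "$rc"
}
```

Sourcing the file and running `preflight` prints one line per check and returns a non-zero status if any minimum is not met.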

Prerequisites

You need a license token before running the installer. Request one — we will send it to you by email.

Network & ports

| Direction | Port | Purpose |
|---|---|---|
| Inbound → control plane | 6443/TCP | GPU nodes join the K3s API |
| Inbound → control plane | 80 / 443 | Console + API; nodes fetch /install.sh |
| Outbound ← control plane | 443 | get.nusapod.app, registry.nusapod.app, api.nusapod.app, get.k3s.io, Helm chart repos |
| Outbound ← GPU node | 443 + 6443 | NVIDIA toolkit repo + K3s API (nodes dial out; no inbound) |
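
The port requirements above can be probed with plain TCP connection attempts. This is a rough sketch under stated assumptions: the function names are hypothetical, it relies on bash's `/dev/tcp` feature and the coreutils `timeout` command, and it only lists the named hosts from the table (the Helm chart repos and the NVIDIA toolkit repo are not named there, so they are left out). A successful TCP connect does not prove that a proxy or TLS inspection will let the installer through.

```bash
#!/usr/bin/env bash
# Hypothetical connectivity sketch based on the ports table.

# outbound_hosts: named hosts the control plane must reach on 443.
outbound_hosts() {
  printf '%s\n' get.nusapod.app registry.nusapod.app api.nusapod.app get.k3s.io
}

# can_connect HOST PORT: succeed if a TCP connection opens within 5 seconds.
can_connect() {
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# report HOST PORT: print one result line for a single probe.
report() {
  if can_connect "$1" "$2"; then
    echo "ok   $1:$2"
  else
    echo "FAIL $1:$2"
    return 1
  fi
}

# check_outbound: run on the control plane.
check_outbound() {
  local host rc=0
  for host in $(outbound_hosts); do
    report "$host" 443 || rc=1
  done
  return "$rc"
}

# check_control_plane HOST: run on a GPU node; HOST is your control plane
# address (a placeholder you supply, not a value from the docs).
check_control_plane() {
  local port rc=0
  for port in 443 6443; do
    report "$1" "$port" || rc=1
  done
  return "$rc"
}
```

On the control plane, run `check_outbound`; on a GPU node, run `check_control_plane` with the control plane's IP or DNS name as its argument.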

GPU node

GPU nodes supply the VRAM that NUSAPOD slices and schedules. Each node must have an NVIDIA GPU with the driver already installed — verify before adding the node:

```bash
nvidia-smi
```

If nvidia-smi fails, install the kernel driver first (Ubuntu: sudo ubuntu-drivers install) and reboot. The NUSAPOD installer adds nvidia-container-toolkit and sets nvidia as the default containerd runtime automatically.
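
For a scripted version of the same check, the sketch below wraps `nvidia-smi` and prints one line per GPU with its name, total memory, and driver version. The `gpu_ready` function and the `NVIDIA_SMI` override variable are hypothetical names used for illustration; the `--query-gpu` and `--format` options are standard `nvidia-smi` flags.

```bash
#!/usr/bin/env bash
# Hypothetical GPU readiness sketch for a node that is about to join.

# Allow the binary to be overridden, e.g. for a non-standard install path.
NVIDIA_SMI=${NVIDIA_SMI:-nvidia-smi}

# gpu_ready: succeed only if nvidia-smi exists and can query the GPUs.
gpu_ready() {
  if ! command -v "$NVIDIA_SMI" >/dev/null 2>&1; then
    echo "nvidia-smi not found: install the NVIDIA driver and reboot" >&2
    return 1
  fi
  if ! "$NVIDIA_SMI" --query-gpu=name,memory.total,driver_version \
       --format=csv,noheader; then
    echo "nvidia-smi failed: the driver may not be loaded" >&2
    return 1
  fi
}
```

Calling `gpu_ready` before starting the join step gives a clear failure message instead of a failed install later on.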

  • Same OS tiers as the control plane (Ubuntu 22.04 / 24.04, Debian 12, RHEL 9).
  • Nodes only dial out — no inbound ports need to be opened on the GPU host.

See the GPU node install guide for the full join walkthrough.

License & quota

  • The max_gpus quota is enforced across the fleet: the scheduler will not admit pods that would exceed it. This GPU count is the only usage limit.
  • License expiry is enforced strictly at image pull and activation, with a grace window at runtime.

Limitations & good to know

  • Single-operator, role-based access (viewer < operator < admin); there is no multi-tenant / workspace model.
  • VRAM partitioning is done in software, using per-pod VRAM limits and a compute share; it is not hardware MIG partitioning.
  • The control plane is a single K3s server — no built-in HA.
  • Models are served with vLLM, exposed as an OpenAI-compatible endpoint; GPUs must be CUDA-capable.
  • Internet access is required during install and operation. Air-gapped installs are possible but advanced — contact us.
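
Because models are exposed through an OpenAI-compatible endpoint, a served model can be exercised with `curl` against the standard `/v1/models` and `/v1/chat/completions` paths. This is only a sketch: the function names are hypothetical, the base URL and model name are placeholders you supply, and it assumes the endpoint accepts unauthenticated requests, which may not match your deployment. The model name is inserted into the JSON body as-is, so it must not contain quotes or backslashes.

```bash
#!/usr/bin/env bash
# Hypothetical client sketch for an OpenAI-compatible endpoint.

# api_url BASE PATH: join a base URL and a path without doubling the slash.
api_url() {
  printf '%s/%s\n' "${1%/}" "${2#/}"
}

# list_models BASE: list the models the endpoint is serving.
list_models() {
  curl -sS --fail "$(api_url "$1" /v1/models)"
}

# chat BASE MODEL: send a fixed one-message chat completion request.
chat() {
  curl -sS --fail "$(api_url "$1" /v1/chat/completions)" \
    -H 'Content-Type: application/json' \
    -d "{\"model\": \"$2\", \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}]}"
}
```

For example, `list_models https://llm.example.internal` (a placeholder host) returns the model IDs that can then be passed as the second argument to `chat`.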