The RBLN NPU Operator automates the deployment and management of all Rebellions software components required for provisioning the RBLN NPU family on Kubernetes and OpenShift clusters. A singleton RBLNClusterPolicy custom resource orchestrates every operand, ensuring that device plugins, metrics, and VFIO-based passthrough stacks stay aligned with the hardware available in each node pool.
- Lifecycle automation: Automatically manages the full NPU exposure flow, from detecting PCI `1eff:*` functions and labeling nodes to deploying the appropriate device plugins per workload type.
- Dual workload types: Container mode publishes resources (for example `rebellions.ai/ATOM`) through the standard device plugin, while sandbox mode rebinds functions to `vfio-pci` and advertises VFIO-backed resources such as `rebellions.ai/ATOM_CA25_PT`.
| Category | Minimum Version | Notes |
|---|---|---|
| Kubernetes | v1.19+ | Validated through v1.32+ |
| OpenShift | 4.19+ | Detected automatically; SCC integration enabled |
| Helm | v3.9 | Required for chart install/upgrade |
| Container Runtime | containerd | Needs hostPath access to /dev, /sys, kubelet plugin dirs |

    RBLNClusterPolicy (Cluster-scoped CR)
    └── Controller Manager (Singleton reconciliation loop)
        ├─ Node labeling (NFD dependency, workload labels)
        ├─ Device Plugin DaemonSet + ConfigMap
        ├─ Sandbox Device Plugin + VFIO Checker
        ├─ VFIO Manager DaemonSet
        ├─ Metrics Exporter DaemonSet
        └─ NPU Feature Discovery DaemonSet

- Enable container mode by keeping `spec.workloadType` (or the Helm value) set to `container`, which is the default.
- Components:
  - Device Plugin publishes `rebellions.ai/ATOM` resources.
  - Metrics Exporter exposes Prometheus-ready telemetry.
  - NPU Feature Discovery labels nodes with RBLN hardware inventory.
  - Leaves native RBLN drivers bound for container passthrough workloads.
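
To illustrate how a workload consumes the resource that the Device Plugin publishes, here is a minimal Pod sketch. The Pod name, namespace, image, and command are placeholders chosen for illustration, not values from the operator; only the `rebellions.ai/ATOM` resource name comes from the text above.

```yaml
# Minimal Pod sketch for container mode (all names below are hypothetical).
apiVersion: v1
kind: Pod
metadata:
  name: rbln-smoke-test              # hypothetical name
  namespace: default
spec:
  restartPolicy: Never
  containers:
    - name: workload
      image: registry.example.com/rbln-workload:latest   # placeholder image, replace with your own
      command: ["sleep", "3600"]
      resources:
        limits:
          rebellions.ai/ATOM: 1      # extended resource published by the device plugin
```

Kubernetes treats `rebellions.ai/ATOM` as an extended resource, so it is requested under `resources.limits` and the scheduler places the Pod only on a node that advertises at least one free device.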
- Enable sandbox mode via Helm values or by setting `spec.workloadType: vm-passthrough`.
- Components:
  - NPU Feature Discovery continues to label VFIO-ready nodes so sandbox DaemonSets pin only to hardware that matches the policy.
  - VFIO Manager rebinding script (`vfio-manage.sh`) detaches vendor devices and binds them to `vfio-pci`.
  - Sandbox Device Plugin advertises `rebellions.ai/ATOM_*_PT` resources.
  - VFIO Checker ensures nodes remain in a ready state for KubeVirt.
- KubeVirt integration:
  - Enable the `HostDevices` feature gate.
  - Populate `permittedHostDevices` with the vendor selector `1eff:XXXX`.
  - Reference `rebellions.ai/ATOM_CA25_PT` inside `VirtualMachine.spec.template.spec.domain.devices.hostDevices`.
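
The three KubeVirt integration steps might look roughly like the fragments below. This is a sketch under assumptions: the KubeVirt CR name and namespace follow a default KubeVirt install, the VM name and device alias are hypothetical, `1eff:XXXX` is kept as the placeholder from the text and must be replaced with the real vendor:device ID, and `externalResourceProvider: true` assumes the Sandbox Device Plugin (rather than KubeVirt's built-in plugin) advertises the resource.

```yaml
# KubeVirt CR fragment: enable the feature gate and permit the passthrough resource.
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt                     # assumption: default install name
  namespace: kubevirt                # assumption: default install namespace
spec:
  configuration:
    developerConfiguration:
      featureGates:
        - HostDevices
    permittedHostDevices:
      pciHostDevices:
        - pciVendorSelector: "1eff:XXXX"        # placeholder from the text, not a real device ID
          resourceName: rebellions.ai/ATOM_CA25_PT
          externalResourceProvider: true        # assumption: resource comes from the Sandbox Device Plugin
---
# VirtualMachine fragment: only the hostDevices path is shown.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: rbln-vm                      # hypothetical name
spec:
  template:
    spec:
      domain:
        devices:
          hostDevices:
            - name: npu0             # arbitrary alias for the device inside this VM spec
              deviceName: rebellions.ai/ATOM_CA25_PT
```

The VirtualMachine fragment is intentionally incomplete: a real VM also needs memory, disks, and a run strategy, which are omitted here because they do not depend on the NPU.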
The RBLN NPU Operator Helm chart automatically discovers RBLN NPUs in your cluster, deploys the required device plugins, and monitors the health of each operand. Follow these steps to get up and running quickly.
- Prerequisites
  - Kubernetes 1.19+ cluster with access to `kubectl` and `helm`
  - A dedicated namespace such as `rbln-system` is recommended
  - Worker nodes equipped with NPUs and Node Feature Discovery installed (set `nfd.enabled=true` if you want Helm to deploy it)
- Install Helm (if needed)

      curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 \
        && chmod 700 get_helm.sh \
        && ./get_helm.sh

- Add the Rebellions Helm repository

      helm repo add rebellions https://rebellions-sw.github.io/rbln-npu-operator
      helm repo update

- Install the NPU Operator

      helm install --wait --generate-name \
        -n rbln-system --create-namespace \
        rebellions/rbln-npu-operator

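
If you prefer a values file over `--set` flags, the `nfd.enabled=true` setting from the prerequisites can be written as nested YAML. This is a small sketch: the file name `values.yaml` is an assumption, and `nfd.enabled` is the only key taken from this document. Other chart values are not shown because their names are not given here.

```yaml
# values.yaml (fragment; file name is an assumption)
nfd:
  enabled: true    # let the chart deploy Node Feature Discovery
```

Pass the file to the install command with Helm's `-f values.yaml` option.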

After Helm reports a successful install, confirm that the RBLNClusterPolicy custom resource is present and reconciled:

    kubectl get rblnclusterpolicies.rebellions.ai -n rbln-system

    NAME                  AGE
    rbln-cluster-policy   8m

Next, inspect the operator namespace to verify the health of the controller and operand pods:


    kubectl get pods -n rbln-system

    NAME                                  READY   STATUS    AGE
    controller-manager-797798d7b8-rjzht   1/1     Running   8m
    rbln-device-plugin-4qgxc              1/1     Running   8m
    rbln-metrics-exporter-jghbg           1/1     Running   8m
    rbln-npu-feature-discovery-zg47r      1/1     Running   8m

- Helm chart & source: RBLN NPU Operator Helm chart
- Issues & feature requests: open a GitHub issue in this repository.
- Kubebuilder reference: kubebuilder.io
- For cluster-specific guidance, contact your Rebellions support representative.