janetkuo/kubecon-demo

KubeCon Demo: Kubernetes AI Conformance

Kubernetes AI Conformance Program: https://github.com/cncf/k8s-ai-conformance

Cluster Setup

  • GKE 1.34 standard cluster with a DRA node pool with L4 GPUs
    • See more details in set up DRA
    • Note: GPUs are usually easier to obtain with Spot VM node pools
Detailed setup steps before running the demo:
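The commands in this section reference several shell variables that the steps do not define. A minimal sketch of setting them follows. Every value is a placeholder and an assumption, not something taken from the demo, and `check_demo_env` is a hypothetical helper. The `-b` zone suffix used for the node pool suggests that `LOCATION` is meant to be a region, so pick one where L4 GPUs are available to your project.

```shell
# Placeholder values (assumptions, not part of the demo); replace with your own.
export PROJECT_ID="my-gcp-project"         # hypothetical project ID
export CLUSTER_NAME="ai-conformance-demo"  # hypothetical cluster name
export LOCATION="us-central1"              # assumed region; the node pool uses ${LOCATION}-b
export HF_TOKEN="hf_replace_me"            # placeholder, not a real token

# Hypothetical helper: fail early if any required variable is unset or empty.
check_demo_env() {
  for name in PROJECT_ID CLUSTER_NAME LOCATION HF_TOKEN; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}

check_demo_env && echo "environment looks complete"
```

Running the check before the `gcloud` commands avoids creating a cluster with an empty name or project.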
gcloud container clusters create ${CLUSTER_NAME} \
    --project=${PROJECT_ID} \
    --location=${LOCATION} \
    --release-channel=rapid \
    --num-nodes=1 \
    --enable-managed-prometheus \
    --cluster-version="1.34.1-gke.2037000" \
    --monitoring=SYSTEM,DCGM

gcloud container node-pools create drapool \
    --project=${PROJECT_ID} \
    --cluster=${CLUSTER_NAME} \
    --location=${LOCATION} \
    --node-locations=${LOCATION}-b \
    --machine-type "g2-standard-24" \
    --accelerator "type=nvidia-l4,count=2,gpu-driver-version=disabled" \
    --spot \
    --num-nodes "1" \
    --node-version="1.34.1-gke.2037000" \
    --node-labels=gke-no-default-nvidia-gpu-device-plugin=true,nvidia.com/gpu.present=true

Create a secret that contains your Hugging Face token so that the vLLM service can download models:

kubectl create secret generic hf-secret \
    --from-literal=hf_api_token=${HF_TOKEN} \
    --dry-run=client -o yaml | kubectl apply -f -

To install the GPU and DRA drivers:

# Install GPU driver
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded-latest.yaml

# Install DRA drivers
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
    && helm repo update

helm install nvidia-dra-driver-gpu nvidia/nvidia-dra-driver-gpu --version="25.8.0" --create-namespace --namespace nvidia-dra-driver-gpu \
    --set nvidiaDriverRoot="/home/kubernetes/bin/nvidia/" \
    --set gpuResourcesEnabledOverride=true \
    --set resources.computeDomains.enabled=false \
    --set kubeletPlugin.priorityClassName="" \
    --set "kubeletPlugin.tolerations[0].key=nvidia.com/gpu" \
    --set "kubeletPlugin.tolerations[0].operator=Exists" \
    --set "kubeletPlugin.tolerations[0].effect=NoSchedule"
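
After the drivers are installed, it can help to confirm that the DRA driver is running and advertising GPUs before starting the demo. The following is a sketch only: `verify_dra_install` is a hypothetical helper name, the namespace is the one passed to `helm install` above, and the exact objects listed depend on the driver version.

```shell
# Hypothetical helper wrapping standard kubectl queries; run it after the helm install.
verify_dra_install() {
  # Driver pods in the namespace used by the helm install above
  kubectl get pods --namespace nvidia-dra-driver-gpu
  # DeviceClass objects registered in the cluster for DRA
  kubectl get deviceclasses
  # ResourceSlice objects published by the driver for the node's GPUs
  kubectl get resourceslices
}
```

Call `verify_dra_install` once the kubelet plugin pods have had time to start. An empty ResourceSlice list usually means the driver has not yet discovered the GPUs.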

To set up the metrics pipeline:

# Install the Custom Metrics Stackdriver Adapter so that custom metrics exported to Cloud Monitoring are visible to the HPA controller
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter_new_resource_model.yaml

# Configure GPU metrics for DRA node pools, which is needed because DRA requires disabling the default device plugin
kubectl apply -f dcgm-exporter-for-hpa.yaml
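
To check that the adapter has registered the custom metrics API with the cluster, one option is to list the API services. This is a sketch under assumptions: `verify_metrics_pipeline` is a hypothetical helper name, and the versions listed depend on the adapter release.

```shell
# Hypothetical helper: list API services belonging to the custom metrics API group.
verify_metrics_pipeline() {
  # grep exits non-zero when no custom.metrics.k8s.io API service is registered
  kubectl get apiservices | grep 'custom.metrics.k8s.io'
}
```

A non-zero exit status from `verify_metrics_pipeline` suggests that the adapter is not installed or not yet ready, in which case the HPA controller cannot read the exported GPU metrics.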

Demo

Start the demo by running ./run-demo.sh, which uses demo-magic to type the commands.