Steps to reproduce
Given a cluster with N GPUs per node:
$ dstack offer --cpu 0.. --memory 0.. --disk 0.. --gpu 1.. --backend kubernetes
 #  BACKEND         RESOURCES                                  INSTANCE TYPE                       PRICE
 1  kubernetes (-)  cpu=127 mem=1574GB disk=82GB H100:80GB:8   computeinstance-e00ks8pzq59e6a4aqg  $0
 2  kubernetes (-)  cpu=127 mem=1574GB disk=82GB H100:80GB:8   computeinstance-e00qhgbdeza93kq1aj  $0
Start a run with a GPU range M..<any> where M < N:
type: dev-environment
name: dev-environment
ide: vscode
resources:
  gpu: 4..8
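The configuration is submitted in the usual way before attaching (a minimal sketch; the file name .dstack.yml is an assumption):
$ dstack apply -f .dstack.yml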
Check nvidia-smi and DSTACK_GPUS_NUM/DSTACK_GPUS_PER_NODE:
$ ssh dev-environment
# nvidia-smi -L
GPU 0: NVIDIA H100 80GB HBM3 (UUID: GPU-f05caec9-a40e-9d3e-7d40-146a037fada0)
GPU 1: NVIDIA H100 80GB HBM3 (UUID: GPU-5c60ccfc-e289-7d91-6fba-56bf56c128b7)
GPU 2: NVIDIA H100 80GB HBM3 (UUID: GPU-1795d012-3487-9e01-9e9a-611913d6d945)
GPU 3: NVIDIA H100 80GB HBM3 (UUID: GPU-8ccd0ca6-b808-93b7-f0bb-bce6e05b5615)
# echo $DSTACK_GPUS_NUM
8
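Counting the devices in the same session makes the mismatch explicit (a quick check derived from the outputs above; DSTACK_GPUS_PER_NODE was not captured in this session):
# nvidia-smi -L | wc -l    # GPUs actually visible inside the run
4
# echo $DSTACK_GPUS_NUM    # value reported by dstack does not match
8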
Actual behaviour
No response
Expected behaviour
No response
dstack version
0.20.3
Server logs
Additional information
No response