Releases: dstackai/dstack
0.19.32
Fleets
Nodes
Maximum number of nodes
The fleet nodes.max property is now respected, which allows limiting the maximum number of instances in a fleet. For example, to allow at most 10 instances in the fleet, you can do:
type: fleet
name: cloud-fleet
nodes: 0..10
A fleet will be considered for a run only if the run can fit into the fleet without violating nodes.max. If you don't need to enforce an upper limit, you can omit it:
type: fleet
name: cloud-fleet
nodes: 0..
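As a rough illustration of the fit check (a hypothetical configuration, not taken from this release), a distributed task that requests more nodes than the 0..10 fleet above allows would not be placed on it:
type: task
name: train-distrib
nodes: 12              # exceeds cloud-fleet's nodes.max of 10, so this fleet is skipped
fleets: [cloud-fleet]
commands:
  - python train.py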
Backends
Nebius
Tags
The nebius backend now supports backend-level and resource-level tags for tagging cloud resources provisioned via dstack:
type: nebius
creds:
  type: service_account
  # ...
tags:
  team: my_team
  user: jake
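Resource-level tags can also be attached in a fleet configuration; a minimal sketch, assuming the tags mapping is accepted there in the same form as in the backend config:
type: fleet
name: nebius-fleet
backends: [nebius]
nodes: 2
tags:
  team: my_team
  user: jake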
Credentials file
It's also possible to configure the nebius backend using a credentials file generated by the nebius CLI:
nebius iam auth-public-key generate \
--service-account-id <service account ID> \
--output ~/.nebius/sa-credentials.json
projects:
- name: main
  backends:
  - type: nebius
    creds:
      type: service_account
      filename: ~/.nebius/sa-credentials.json
Hot Aisle
The Hot Aisle backend now supports multi-GPU VMs, such as 2xMI300X and 4xMI300X.
dstack apply -f .local/.dstack.yml --gpu amd:2
The working_dir is not set — using legacy default "/workflow". Future versions will default to the
image's working directory.
# BACKEND RESOURCES INSTANCE TYPE PRICE
1 hotaisle cpu=26 mem=448GB disk=12288GB 2x MI300X 26x Xeon… $3.98
(us-michigan-1) MI300X:192GB:2
What's changed
- Fix CLI compatibility with server 0.19.11 by @jvstme in #3145
- [Feature]: Nebius switch to using nebius iam auth-public-key generate by @peterschmidt85 in #3147
- [Docs] Move Plugins to Reference | Python API by @peterschmidt85 in #3148
- 404 error on GIT url by @robinnarsinghranabhat in #3149
- Fix idle duration: off and forbid negative durations by @r4victor in #3151
- [Docs]: GCP A4 cluster example by @jvstme in #3152
- Consider multinode replica inactive only if all jobs done by @r4victor in #3157
- Kubernetes: add NVIDIA GPU toleration by @un-def in #3160
- [Nebius] Support tags by @peterschmidt85 in #3158
- [Hot Aisle] Support multi-GPU VMs by @peterschmidt85 in #3154
- feat(docker): upgrade litestream to v0.5.0 by @DragonStuff in #3165
- [Blog] Orchestrating GPU workloads on Kubernetes by @peterschmidt85 in #3161
- Respect fleet nodes.max by @r4victor in #3164
- Fix kubeconfig via data reference by @r4victor in #3170
- [Docs] Fix kubernetes typos by @svanzoest in #3169
New contributors
- @robinnarsinghranabhat made their first contribution in #3149
- @DragonStuff made their first contribution in #3165
- @svanzoest made their first contribution in #3169
Full changelog: 0.19.31...0.19.32
0.19.31
Kubernetes
The kubernetes backend introduces many significant improvements and has now graduated from alpha to beta. It is much more stable and can be reliably used on GPU clusters for all kinds of workloads, including distributed tasks.
Here's what changed:
- Resource allocation now fully respects the user's resources specification. Previously, it ignored certain aspects, especially the proper selection of GPU labels according to the specified gpu spec.
- Distributed tasks now fully work on Kubernetes clusters with fast interconnect enabled. Previously, this caused many issues.
- Added support for the privileged property.
We’ve also published a dedicated guide on how to get started with dstack on Kubernetes, highlighting important nuances.
Warning
Be aware of breaking changes if you used the kubernetes backend before. The following properties in the Kubernetes backend configuration have been renamed:
- networking → proxy_jump
- ssh_host → hostname
- ssh_port → port
Additionally, the "proxy jump" pod and service names now include a dstack- prefix.
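For reference, a kubernetes backend configuration using the new property names could look like this (a sketch; the kubeconfig path, hostname, and port are placeholders):
projects:
- name: main
  backends:
  - type: kubernetes
    kubeconfig:
      filename: ~/.kube/config
    proxy_jump:
      hostname: 203.0.113.10   # previously ssh_host under networking
      port: 32000              # previously ssh_port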
GCP
A4 spot instances with B200 GPUs
The gcp backend now supports A4 spot instances equipped with B200 GPUs. This includes provisioning both standalone A4 instances and A4 clusters with high-performance RoCE networking.
To use A4 clusters with high-performance networking, you must configure multiple VPCs in your backend settings (~/.dstack/server/config.yml):
projects:
- name: main
  backends:
  - type: gcp
    project_id: my-project
    creds:
      type: default
    vpc_name: my-vpc-0     # regular, 1 subnet
    extra_vpcs:
    - my-vpc-1             # regular, 1 subnet
    roce_vpcs:
    - my-vpc-mrdma         # RoCE profile, 8 subnets
Then, provision a cluster using a fleet configuration:
type: fleet
nodes: 2
placement: cluster
availability_zones: [us-west2-c]
backends: [gcp]
spot_policy: spot
resources:
  gpu: B200:8
Each instance in the cluster will have 10 network interfaces: 1 regular interface in the main VPC, 1 regular interface in the extra VPC, and 8 RDMA interfaces in the RoCE VPC.
Note
Currently, the gcp backend only supports A4 spot instances. Support for other options, such as flex and calendar scheduling via Dynamic Workload Scheduler, is coming soon.
CLI
dstack project is now faster
The USER column in dstack project list is now shown only when the --verbose flag is used.
This significantly improves performance for users with many configured projects, reducing execution time from ~20 seconds to as little as 2 seconds in some cases.
What's changed
- [Kubernetes] Request resources according to RequirementsSpec by @un-def in #3127
- [GCP] Support A4 spot instances with the B200 GPU by @jvstme in #3100
- [CLI] Move USER to dstack project list --verbose by @jvstme in #3134
- [Kubernetes] Configure /dev/shm if requested by @un-def in #3135
- [Backward incompatible] Rename properties in Kubernetes backend config by @un-def in #3137
- Support GCP A4 clusters by @jvstme in #3142
- Kubernetes: add multi-node support by @un-def in #3141
- Fix duplicate server log messages by @jvstme in #3143
- [Docs] Improve Kubernetes documentation by @peterschmidt85 in #3138
Full changelog: 0.19.30...0.19.31
0.19.30
Major changes
- [Feature] Update CUDA driver in dstack's default aws, gcp, azure, and oci OS images from 535 to 570 by @jvstme in #3099
Major bug-fixes
- [Bug] dstack CLI logging is broken #3118 by @peterschmidt85 in #3119
- [AWS]: dstack doesn't use the EFA-enabled Docker image for H100:1 on AWS (p5.4xlarge) by @r4victor in #3111
- [Bug] dstack misconfigures Git credentials for private repos by @un-def in #3116
Other changes
- Fix fleet provisioning error message when fleet retried by @r4victor in #3109
- Use fleet-combined idle_duration on run apply by @r4victor in #3110
- Skip runner integration tests on macOS in CI by @r4victor in #3112
- [Bug]: dstack offer CLI grouped by GPU output as JSON fails #3120 by @peterschmidt85 in #3122
Full changelog: 0.19.29...0.19.30
0.19.29
Fleets
Over the last few releases, we’ve been reworking how fleets work to radically simplify management and make it fully declarative.
Previously, you had to specify a fleet explicitly via the fleets property; otherwise, dstack always created a new one. Now, dstack automatically picks an existing fleet if it fits the requirements, creating a new one only when needed.
For more on the fleet roadmap, see this meta issue.
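If you prefer to keep pinning runs to a specific fleet, you can still do so via the fleets property; a minimal sketch (the fleet name is a placeholder):
type: dev-environment
ide: vscode
fleets:
  - my-fleet   # placeholder; omit `fleets` to let dstack pick a matching fleet automatically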
User Interface
Grouping offers by backend
The Offers page in the UI now lets you group available offers by backend, making it easier to compare options across cloud providers.

Breaking changes
- The tensordock backend hasn’t worked for a long time (due to the API it relied on being deprecated) and has now been removed.
What's changed
- [Blog] The state of cloud GPUs in 2025: costs, performance, playbooks by @peterschmidt85 in #3089
- [Docs] Add .dstack/profiles.yml to Reference and Protips by @peterschmidt85 in #3093
- [TensorDock] Remove the tensordock from supported backends #3092 by @peterschmidt85 in #3094
- Unassign scheduled run from fleet on resubmission by @r4victor in #3096
- [UI] Allow to group offers by backend by @olgenn in #3098
- Implement requirements-independent offers cache by @r4victor in #3091
- [Internal] Project config support by @peterschmidt85 in #3097
- Fix long sqlite write transaction when provisioning instances by @r4victor in #3104
- Consider backend offers when choosing optimal fleet by @r4victor in #3101
- [UI] Project wizard by @olgenn in #3103
- Use Cuda 12.0 image for DataCrunch A6000 by @r4victor in #3105
- [UI] Project wizard #323 by @olgenn in #3107
Full changelog: 0.19.28...0.19.29
0.19.28
CLI
Argument Handling
The CLI now properly handles unrecognized arguments and rejects them with clear error messages. The ${{ run.args }} interpolation for tasks and services is still supported but now requires the -- pseudo-argument separator:
dstack apply --reuse -- --some=arg --some-option
This change prevents accidental typos in command arguments from being silently ignored.
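For context, ${{ run.args }} expands to whatever follows the -- separator; a minimal task sketch (the file and script names are placeholders):
type: task
name: train
commands:
  - python train.py ${{ run.args }}   # receives everything after `--`
With the configuration above, running dstack apply -f train.dstack.yml -- --epochs 10 would pass --epochs 10 to the script.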
What's Changed
- [Blog] Orchestrating GPUs on DigitalOcean and AMD Developer Cloud by @peterschmidt85 in #3075
- Forbid deleting projects with active resources by @r4victor in #3079
- Add a script to automatically generate expanded release notes using an LLM by @r4victor in #3080
- [UI] Reworked dstack Sky sign-up page by @peterschmidt85 in #3081
- [UI] Minor styling changes by @peterschmidt85 in #3082
- Generate unique fleet name for autocreated fleets by @r4victor in #3085
- Exclude current_resource.fleet by @r4victor in #3087
- [CLI] Handle unrecognized arguments by @un-def in #3076
- Add a new opt-in job network mode by @un-def in #3043
- Generate CoreModel dynamically when using custom configs by @r4victor in #3083
Full Changelog: 0.19.27...0.19.28
0.19.27
Run configurations
Repo directory
It's now possible to specify the directory in the container where the repo is mounted:
type: dev-environment
ide: vscode
repos:
  - local_path: .
    path: my_repo
  # or using the short syntax:
  # - .:my_repo
The path property can be an absolute path or a relative path (with respect to working_dir). It's available inside the run as the $DSTACK_REPO_DIR environment variable. If path is not set, the /workflow path is used.
Working directory
Previously, the working_dir property had complicated semantics: it defaulted to the repo path (/workflow), but for tasks and services without commands, the image working directory was used. You could also specify a custom working_dir relative to the repo directory. This is now reversed: you specify working_dir as an absolute path, and the repo path can be specified relative to it.
Note
During the transition period, the legacy behavior of using /workflow is preserved if working_dir is not set. In future releases, this will be simplified, and working_dir will always default to the image working directory.
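A minimal sketch of the new semantics (paths and the script name are placeholders): working_dir is an absolute path, and the repo path resolves relative to it:
type: task
working_dir: /app          # absolute path
repos:
  - local_path: .
    path: my_repo          # relative to working_dir, i.e. /app/my_repo
commands:
  - python my_repo/train.py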
Fleet configuration
Nodes, retry, and target
dstack now indefinitely maintains the nodes.min specified for cloud fleets. If instances get terminated for any reason and there are fewer instances than nodes.min, dstack will provision new fleet instances in the background.
There is also a new nodes.target property that specifies the number of instances to provision on fleet apply. Since nodes.min is now always maintained, you may specify a nodes.target different from nodes.min to provision more instances than need to be maintained.
Example:
type: fleet
name: default-fleet
nodes:
  min: 1     # Maintain one instance
  target: 2  # Provision two instances initially
  max: 3
dstack will provision two instances. After deleting one instance, there will be one instance left. Deleting the last instance will trigger dstack to re-create the instance.
Offers
The UI now has a dedicated page showing GPU offers available across all configured backends.

DigitalOcean and AMD Developer Cloud
The release adds native integration with DigitalOcean and AMD Developer Cloud.
A backend configuration example:
projects:
- name: main
  backends:
  - type: amddevcloud
    project_name: TestProject
    creds:
      type: api_key
      api_key: ...
For DigitalOcean, set type to digitalocean.
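A digitalocean backend entry could then look like this (a sketch mirroring the amddevcloud example above):
projects:
- name: main
  backends:
  - type: digitalocean
    creds:
      type: api_key
      api_key: ...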
The digitalocean and amddevcloud backends support NVIDIA and AMD GPU VMs, respectively, and allow you to run dev environments (interactive development), tasks (training, fine-tuning, or other batch jobs), and services (inference).
Security
Important
This update fixes a vulnerability in the cloudrift, cudo, and datacrunch backends. Instances created with earlier dstack versions lack proper firewall rules, potentially exposing internal APIs and allowing unauthorized access.
Users of these backends are advised to update to the latest version and re-create any running instances.
What's changed
- Minor Hot Aisle Cleanup by @Bihan in #2978
- UI for offers #3004 by @olgenn in #3042
- Add repos[].path property by @un-def in #3041
- style(frontend): Add missing final newline by @un-def in #3044
- Implement fleet state-spec consolidation to maintain nodes.min by @r4victor in #3047
- Add digital ocean and amd dev backend by @Bihan in #3030
- test: include amddevcloud and digitalocean in backend types by @Bihan in #3053
- Fix missing digitaloceanbase configurator methods by @Bihan in #3055
- Expose job working dir via environment variable by @un-def in #3049
- [runner] Ensure working_dir exists by @un-def in #3052
- Fix server compatibility with pre-0.19.27 runners by @un-def in #3054
- Bind shim and exposed container ports to localhost by @jvstme in #3057
- Fix client compatibility with pre-0.19.27 servers by @un-def in #3063
- [Docs] Reflect the repo and working directory changes (#3041) by @peterschmidt85 in #3064
- Show a CLI warning when using autocreated fleets by @r4victor in #3060
- Improve UX with private repos by @un-def in #3065
- Set up instance-level firewall on all backends by @jvstme in #3058
- Exclude target when equal to min for responses by @r4victor in #3070
- [Docs] Shorten the default working_dir warning by @peterschmidt85 in #3072
- Do not issue empty update for deleted_fleets_placement_groups by @r4victor in #3071
- Exclude target when equal to min for responses (attempt 2) by @r4victor in #3074
Full changelog: 0.19.26...0.19.27
0.19.26
Repos
Previously, dstack always required running the dstack init command before use. This also meant that dstack would always mount the current folder as a repo.
With this update, repo configuration is now explicit and declarative. If you want to use a repo in your run, you must specify it with the new repos property. The dstack init command is now only used to provide custom Git credentials when working with private repos.
For example, imagine you have a cloned Git repo with an examples subdirectory containing a .dstack.yml file:
type: dev-environment
name: vscode
repos:
  # Mounts the parent directory of `examples` (must be a Git repo)
  # to `/workflow` (the default working directory)
  - ..
ide: vscode
When you run this configuration, dstack fetches the repo on the instance, applies your local changes, and mounts it, so the container always matches your local repo.
Sometimes you may want to mount a Git repo without cloning it locally. In that case, simply provide a URL in repos:
type: dev-environment
name: vscode
repos:
  # Clone the specified repo to `/workflow` (the default working directory)
  - https://github.com/dstackai/dstack
ide: vscode
If the repo is private, dstack will automatically try to use your default Git credentials (from ~/.ssh/config or ~/.config/gh/hosts.yml).
To configure custom Git credentials, use dstack init.
Note
If you previously initialized a repo via dstack init, it will still be mounted. Be sure to migrate to repos, as implicitly configured repos are deprecated and will stop working in future releases.
If you no longer want to use the implicitly configured repo, run dstack init --remove.
Note
Currently, you can configure only one repo per run configuration.
Fleets
Previously, when dstack added new instances to existing fleets, it ignored the fleet configuration and used only the run configuration for which the instance was created. This could result in fleets containing instances that didn’t match their configuration.
This has now been fixed: fleet configurations and run configurations are intersected so that provisioned instances respect both. For example, given a fleet configuration:
type: fleet
name: cloud-fleet
placement: any
nodes: 0..2
backends:
- runpod
and a run configuration:
type: dev-environment
ide: vscode
spot_policy: spot
fleets:
- cloud-fleet
dstack will provision a RunPod spot instance in cloud-fleet.
This change lets you define main provisioning parameters in fleet configurations, while adjusting them in run configurations as needed.
Note
Currently, the run plan does not take fleet configuration into account when showing offers, since the target fleet may not be known beforehand. We plan to improve this by showing offers for all candidate fleets.
Examples
Wan2.2
We've added a new example demonstrating how to use Wan2.2, the new open-source SOTA text-to-video model, to generate videos.
Internals
Pyright integration
We now use pyright for type checking dstack Python code in CI. If you contribute to dstack, we recommend you configure your IDE to use pyright/pylance with the standard type checking mode.
What's changed
- Fix typing issues and add pyright to CI by @r4victor in #3011
- [Internal] Update Ask AI integration ID by @olgenn in #3009
- Make Configurator generic by @r4victor in #3013
- Type check cli.commands by @r4victor in #3014
- [Docs] Improve the docs regarding dstack init and repos to reflect the recent changes by @peterschmidt85 in #3015
- Respect fleet spec when provisioning on run apply by @r4victor in #3022
- Consider elastic busy fleets for provisioning by @r4victor in #3024
- Fix duplicate instance_num by @r4victor in #3025
- Add declarative repo configuration by @un-def in #3023
- Allow gpu.name as string in json schema by @r4victor in #3027
- [Bug]: nebius.aio.service_error.RequestError: Request error DEADLINE_EXCEEDED: Deadline Exceeded #2962 by @peterschmidt85 in #3028
- Fix DataCrunchCompute exception when terminating already removed instance by @r4victor in #3032
- [DataCrunch] Ensure dstack is using fixed pricing #3033 by @peterschmidt85 in #3034
- Document repos by @peterschmidt85 in #3026
- Add Wan2.2 example by @r4victor in #3029
- Automatically remove dangling tasks from shim by @jvstme in #3036
- dstack offer fixes by @peterschmidt85 in #3038
- Remove dstack init from help by @r4victor in #3039
Full changelog: 0.19.25...0.19.26
0.19.25
CLI
dstack offer --group-by
The dstack offer command can now display aggregated information about available offers. For example, to see what GPUs are available in different clouds, use --group-by gpu.
> dstack offer --group-by gpu
# GPU SPOT $/GPU BACKENDS
1 T4:16GB:1..8 spot, on-demand 0.1037..1.3797 gcp, aws
2 L4:24GB:1..8 spot, on-demand 0.1829..2.1183 gcp, aws
3 P100:16GB:1..4 spot, on-demand 0.2115..2.4043 gcp, oci
4 V100:16GB:1..8 spot, on-demand 0.3152..4.234 gcp, aws, oci, lambda
5 A10G:22GB:1..8 spot, on-demand 0.3623..2.5845 aws
6 L40S:44GB:1..8 spot, on-demand 0.6392..4.7095 aws
7 A100:40GB:1..16 spot, on-demand 0.6441..4.0496 gcp, aws, oci, lambda
8 A10:24GB:1..4 on-demand 0.75..2 oci, lambda
9 H100:80GB:1..8 spot, on-demand 1.079..15.7236 gcp, aws, lambda
10 A100:80GB:1..8 spot, on-demand 1.2942..5.7077 gcp, aws, lambda
Refer to the docs for information about the available aggregations.
Deprecations
- Local repos are now deprecated. If you need to deliver a local directory or file to a run, use files instead (see the sketch below). If the run doesn't require a repo, use dstack apply --no-repo. Remote repos remain the recommended way to deliver Git repos to runs.
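A rough sketch of the files alternative (assuming a local:container short syntax similar to repos; check the files reference for the exact format):
type: task
files:
  - .env                  # delivered alongside the run (assumed default mapping)
  - ./configs:configs     # local path mapped into the container (assumed short syntax)
commands:
  - python train.py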
What's changed
- Document Deployment-compatible migrations by @r4victor in #2987
- [Bug]: Server Docker image fails because of Unable to locate package … by @peterschmidt85 in #2983
- Only register service replicas after probes pass by @jvstme in #2986
- [Changelog] Introducing service probes by @peterschmidt85 in #2988
- Deprecate local repos by @un-def in #2984
- Support elastic fleets by @r4victor in #2967
- fix typo config.yml.md by @jspablo in #2991
- Check if kapa.ai can also be integrated into dstack Sky #296 by @olgenn in #2990
- Typo in URLs by @mashcroft3 in #2995
- [shim] Fix DCGMWrapperInterface nil check (bis) by @un-def in #3001
- The logs section is too short in the UI by @olgenn in #2989
- [Feature]: Allow dstack offer to aggregate GPU information by @peterschmidt85 in #2992
- [Internal]: CI refactoring by @jvstme in #3006
- Update examples by @un-def in #3007
- Minor CLI fixes by @peterschmidt85 in #3008
New Contributors
- @mashcroft3 made their first contribution in #2995
Full Changelog: 0.19.24...0.19.25
0.19.24
Migration guide
Warning
This update requires stopping all dstack server replicas before deploying, due to database schema changes.
Make sure replicas from the previous version and the new version do not run at the same time.
What's changed
- [Internal] Replace enums with strings in the DB, JobSubmission.termination_reason, and Run.termination_reason by @r4victor in #2949
- [Internal] Fix macOS build for shim by @un-def in #2958
- [Bug] Increase the secrets max character length by @james-boydell in #2971
- [Internal] Introduce InstanceAvailability.NO_BALANCE (for external integrations) by @peterschmidt85 in #2975
- [Bug]: Cannot manage secrets in UI as project admin by @olgenn in #2972
- [Bug] Fix DCGMWrapperInterface nil check in shim by @un-def in #2980
Full changelog: 0.19.23...0.19.24
0.19.23
Major bug-fixes
- This release resolves an issue introduced in 0.19.22 that caused instance provisioning to fail consistently for certain instance types.
Backends
Nebius
The nebius backend now supports spot instances and the NVIDIA B200 GPU.
> dstack offer -b nebius --spot
# BACKEND RESOURCES PRICE
1 nebius (eu-north1) cpu=16 mem=200GB disk=100GB H100:80GB:1 (spot) $1.25
2 nebius (eu-north1) cpu=16 mem=200GB disk=100GB H200:141GB:1 (spot) $1.45
3 nebius (eu-west1) cpu=16 mem=200GB disk=100GB H200:141GB:1 (spot) $1.45
4 nebius (us-central1) cpu=16 mem=200GB disk=100GB H200:141GB:1 (spot) $1.45
5 nebius (eu-north1) cpu=128 mem=1600GB disk=100GB H100:80GB:8 (spot) $10
6 nebius (eu-north1) cpu=128 mem=1600GB disk=100GB H200:141GB:8 (spot) $11.6
7 nebius (eu-west1) cpu=128 mem=1600GB disk=100GB H200:141GB:8 (spot) $11.6
8 nebius (us-central1) cpu=128 mem=1600GB disk=100GB H200:141GB:8 (spot) $11.6
> dstack offer -b nebius --gpu 8:b200
# BACKEND RESOURCES PRICE
1 nebius (us-central1) cpu=160 mem=1792GB disk=100GB B200:180GB:8 $44
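To target these offers from a run configuration, a minimal sketch (illustrative only, not taken from the release notes):
type: task
backends: [nebius]
spot_policy: spot
resources:
  gpu: H200:8            # or B200:8 for the new B200 offers
commands:
  - nvidia-smi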
What's changed
- Fix dstack-shim release build by @jvstme in #2964
- [Nebius] Support spot instances and B200 by @peterschmidt85 in #2965
Full Changelog: 0.19.22...0.19.23