DragonStuff
Contributor

The SRE team at TableCheck is attempting to deploy dstack for our GPU workloads, but we are currently blocked due to an issue we have identified with EKS Pod Identity (OIDC) support and Litestream.

Support for this was added to Litestream in benbjohnson/litestream#683, but the version of Litestream used in dstack's Docker images is quite old (v0.3.9) and does not include it.

As a result, the Docker image will start with the following error from Litestream:

cannot fetch generations: cannot lookup bucket region: NoCredentialProviders: no valid providers in chain. Deprecated.
	For verbose messaging see aws.Config.CredentialsChainVerboseErrors

This pull request updates the version of Litestream to the latest version available in https://github.com/benbjohnson/litestream/releases.
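
For anyone debugging the same NoCredentialProviders error, the sketch below reports which credential source the container's environment advertises. It assumes the standard AWS SDK variable names (IRSA sets AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN; the EKS Pod Identity agent sets AWS_CONTAINER_CREDENTIALS_FULL_URI). The function name and the labels it prints are made up for illustration, and the check order is not the SDK's actual chain order.

```shell
# Hypothetical helper: inspects the environment only. It does not prove that
# the installed Litestream build can actually use the source it reports.
aws_credential_source() {
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "static-keys"
  elif [ -n "${AWS_WEB_IDENTITY_TOKEN_FILE:-}" ] && [ -n "${AWS_ROLE_ARN:-}" ]; then
    echo "web-identity"            # IRSA (OIDC web identity token)
  elif [ -n "${AWS_CONTAINER_CREDENTIALS_FULL_URI:-}" ]; then
    echo "container-credentials"   # e.g. EKS Pod Identity agent
  else
    echo "none"
  fi
}

# Usage inside the running container:
#   aws_credential_source
```

If this prints "container-credentials" while Litestream still reports no valid providers, the likely cause is an AWS SDK in the Litestream binary that predates that provider, which is the situation described above.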

@peterschmidt85
Contributor

@DragonStuff @sanbyk Thank you very much for the PR.

I will of course review this PR shortly, but I just want to confirm: do you plan to actually use dstack with Litestream?
I'm asking because most dstack users run PostgreSQL for production deployments, and we would strongly encourage you to do the same.

Contributor

@peterschmidt85 peterschmidt85 left a comment

Please see comments.

You can quickly check that the image builds by running, for example:

docker build --build-arg VERSION=0.19.31 --load -f release/Dockerfile .

Run this from docker/server (use -f stgn/Dockerfile for the staging version).
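
A minimal sketch that checks both Dockerfiles in one pass, meant to be run from docker/server. It assumes the staging Dockerfile accepts the same VERSION build argument as the release one. The function name and the DOCKER override are hypothetical, added only so the commands can be previewed without running a build. `--load` is left out here on the assumption that it is a BuildKit/buildx option; add it back when building with buildx.

```shell
# Hypothetical helper: builds release/Dockerfile and stgn/Dockerfile in turn.
# Set DOCKER=echo to print the commands instead of running them.
build_server_images() {
  dstack_version=$1
  for flavor in release stgn; do
    "${DOCKER:-docker}" build --build-arg "VERSION=$dstack_version" \
      -f "$flavor/Dockerfile" . || return 1
  done
}

# Usage (dry run):
#   DOCKER=echo build_server_images 0.19.31
```

The function stops at the first failing build, so a typo in either Dockerfile surfaces immediately.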

DragonStuff and others added 2 commits October 8, 2025 20:50
Co-authored-by: Andrey Cheptsov <[email protected]>
Co-authored-by: Andrey Cheptsov <[email protected]>
@DragonStuff
Contributor Author

@peterschmidt85 Incredible, you are so fast at reviewing! I apologize: I was just about to test, and I think my typo inconvenienced you.

Thank you very much for fixing my PR up!

Looks like it builds successfully, and I'm going to run this image on our infra as well to test.

Details

server % docker build --build-arg VERSION=0.19.31 -f release/Dockerfile .
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            Install the buildx component to build images with BuildKit:
            https://docs.docker.com/go/buildx/

Sending build context to Docker daemon  9.216kB
Step 1/16 : FROM python:3.11-slim
 ---> 1e4c6e8dc37c
Step 2/16 : ARG VERSION
 ---> Using cache
 ---> ce21b21bb9f4
Step 3/16 : ENV VERSION=$VERSION
 ---> Using cache
 ---> 2ea473f9f089
Step 4/16 : ENV PYTHONUNBUFFERED=1
 ---> Using cache
 ---> 3e26a82feab2
Step 5/16 : ENV DSTACK_SERVER_LOG_FORMAT=json
 ---> Using cache
 ---> 311982036833
Step 6/16 : WORKDIR /dstack-server
 ---> Using cache
 ---> d124a78c2879
Step 7/16 : RUN apt-get update && apt-get install -y     curl     git     sqlite3     && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 7c87af73d525
Step 8/16 : RUN if [ $(uname -m) = "aarch64" ]; then ARCH="arm64"; else ARCH="amd64"; fi &&     curl https://github.com/benbjohnson/litestream/releases/download/v0.5.0/litestream-0.5.0-linux-$ARCH.deb -O -L &&     dpkg -i litestream-0.5.0-linux-$ARCH.deb
 ---> Running in 4b2a8006693c
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 10.7M  100 10.7M    0     0  10.4M      0  0:00:01  0:00:01 --:--:-- 14.5M
Selecting previously unselected package litestream.
(Reading database ... 10739 files and directories currently installed.)
Preparing to unpack litestream-0.5.0-linux-arm64.deb ...
Unpacking litestream (0.5.0) ...
Setting up litestream (0.5.0) ...
 ---> Removed intermediate container 4b2a8006693c
 ---> 8644d8b9b506
Step 9/16 : ADD https://astral.sh/uv/install.sh /uv-installer.sh


 ---> 461fe7d19372
Step 10/16 : RUN sh /uv-installer.sh && rm /uv-installer.sh
 ---> Running in b9bd56424f5d
downloading uv 0.9.0 aarch64-unknown-linux-gnu
no checksums to verify
installing to /root/.local/bin
  uv
  uvx
everything's installed!

To add $HOME/.local/bin to your PATH, either restart your shell or run:

    source $HOME/.local/bin/env (sh, bash, zsh)
    source $HOME/.local/bin/env.fish (fish)
 ---> Removed intermediate container b9bd56424f5d
 ---> 4ab131e87fa1
Step 11/16 : ENV PATH="/root/.local/bin/:$PATH"
 ---> Running in c35944de00f6
 ---> Removed intermediate container c35944de00f6
 ---> 1c306e681508
Step 12/16 : RUN uv tool install "dstack[all]==$VERSION"
 ---> Running in 6477fed8b088
Resolved 151 packages in 2.36s
Downloading botocore (13.4MiB)
Downloading pygments (1.2MiB)
Downloading google-api-python-client (13.6MiB)
Downloading cryptography (3.8MiB)
Downloading sqlalchemy (3.1MiB)
Downloading aiohttp (1.7MiB)
Downloading grpcio (6.0MiB)
Downloading kubernetes (1.9MiB)
Downloading uvloop (3.8MiB)
Downloading azure-mgmt-authorization (1.0MiB)
Downloading oci (30.8MiB)
Downloading google-cloud-compute (3.3MiB)
Downloading azure-mgmt-compute (1.5MiB)
Downloading asyncpg (2.9MiB)
Downloading dstack (15.1MiB)
Downloading azure-mgmt-resource (3.4MiB)
   Building cursor==1.3.5
   Building www-authenticate==0.9.2
 Downloading pygments
      Built www-authenticate==0.9.2
      Built cursor==1.3.5
 Downloading aiohttp
 Downloading azure-mgmt-authorization
 Downloading asyncpg
 Downloading azure-mgmt-compute
 Downloading sqlalchemy
 Downloading kubernetes
 Downloading google-cloud-compute
 Downloading uvloop
 Downloading cryptography
 Downloading grpcio
 Downloading azure-mgmt-resource
 Downloading google-api-python-client
 Downloading botocore
 Downloading oci
 Downloading dstack
Prepared 151 packages in 5.78s
Installed 151 packages in 186ms
 + aiocache==0.12.3
 + aiohappyeyeballs==2.6.1
 + aiohttp==3.13.0
 + aiorwlock==1.5.0
 + aiosignal==1.4.0
 + aiosqlite==0.21.0
 + alembic==1.16.5
 + alembic-postgresql-enum==1.8.0
 + anyio==4.11.0
 + apscheduler==3.11.0
 + argcomplete==3.6.2
 + asyncpg==0.30.0
 + attrs==25.4.0
 + azure-common==1.1.28
 + azure-core==1.35.1
 + azure-identity==1.25.1
 + azure-mgmt-authorization==4.0.0
 + azure-mgmt-compute==37.0.0
 + azure-mgmt-core==1.6.0
 + azure-mgmt-msi==7.1.0
 + azure-mgmt-network==27.0.0
 + azure-mgmt-resource==24.0.0
 + azure-mgmt-subscription==3.1.1
 + backports-entry-points-selectable==1.3.0
 + bcrypt==5.0.0
 + boto3==1.40.47
 + botocore==1.40.47
 + cached-classproperty==1.1.0
 + cachetools==6.2.0
 + certifi==2025.10.5
 + cffi==2.0.0
 + charset-normalizer==3.4.3
 + circuitbreaker==2.1.3
 + click==8.3.0
 + cryptography==44.0.3
 + cursor==1.3.5
 + dataclasses-json==0.6.7
 + datacrunch==1.14.0
 + docker==7.1.0
 + dstack==0.19.31
 + durationpy==0.10
 + fastapi==0.118.1
 + filelock==3.19.1
 + frozenlist==1.8.0
 + gitdb==4.0.12
 + gitpython==3.1.45
 + google-api-core==2.25.2
 + google-api-python-client==2.184.0
 + google-auth==2.41.1
 + google-auth-httplib2==0.2.0
 + google-cloud-appengine-logging==1.6.2
 + google-cloud-audit-log==0.3.3
 + google-cloud-billing==1.16.3
 + google-cloud-compute==1.39.0
 + google-cloud-core==2.4.3
 + google-cloud-logging==3.12.1
 + google-cloud-storage==3.4.0
 + google-cloud-tpu==1.23.2
 + google-crc32c==1.7.1
 + google-resumable-media==2.7.2
 + googleapis-common-protos==1.70.0
 + gpuhunt==0.1.8
 + greenlet==3.2.4
 + grpc-google-iam-v1==0.14.2
 + grpcio==1.75.1
 + grpcio-status==1.75.1
 + h11==0.16.0
 + httpcore==1.0.9
 + httplib2==0.31.0
 + httptools==0.6.4
 + httpx==0.28.1
 + idna==3.10
 + ignore-python==0.3.0
 + importlib-metadata==8.7.0
 + invoke==2.2.0
 + isodate==0.7.2
 + jinja2==3.1.6
 + jmespath==1.0.1
 + jsonschema==4.25.1
 + jsonschema-specifications==2025.9.1
 + kubernetes==34.1.0
 + mako==1.3.10
 + markdown-it-py==4.0.0
 + markupsafe==3.0.3
 + marshmallow==3.26.1
 + mdurl==0.1.2
 + msal==1.34.0
 + msal-extensions==1.3.1
 + msrest==0.7.1
 + multidict==6.7.0
 + mypy-extensions==1.1.0
 + nebius==0.2.72
 + oauthlib==3.3.1
 + oci==2.161.0
 + opentelemetry-api==1.37.0
 + orjson==3.11.3
 + packaging==25.0
 + paramiko==4.0.0
 + portalocker==3.2.0
 + prometheus-client==0.23.1
 + propcache==0.4.0
 + proto-plus==1.26.1
 + protobuf==6.32.1
 + psutil==7.1.0
 + pyasn1==0.6.1
 + pyasn1-modules==0.4.2
 + pycparser==2.23
 + pydantic==1.10.24
 + pydantic-duality==1.2.4
 + pygments==2.19.2
 + pyjwt==2.10.1
 + pynacl==1.6.0
 + pyopenssl==24.3.0
 + pyparsing==3.2.5
 + python-dateutil==2.9.0.post0
 + python-dotenv==1.1.1
 + python-dxf==12.1.0
 + python-json-logger==4.0.0
 + python-multipart==0.0.20
 + pytz==2025.2
 + pyyaml==6.0.3
 + referencing==0.36.2
 + requests==2.32.5
 + requests-oauthlib==2.0.0
 + rich==14.1.0
 + rich-argparse==1.7.1
 + rpds-py==0.27.1
 + rsa==4.9.1
 + s3transfer==0.14.0
 + sentry-sdk==2.40.0
 + simple-term-menu==1.6.6
 + six==1.17.0
 + smmap==5.0.2
 + sniffio==1.3.1
 + sqlalchemy==2.0.43
 + sqlalchemy-utils==0.42.0
 + starlette==0.48.0
 + tqdm==4.67.1
 + typing-extensions==4.15.0
 + typing-inspect==0.9.0
 + tzlocal==5.3.1
 + uritemplate==4.2.0
 + urllib3==2.3.0
 + uvicorn==0.37.0
 + uvloop==0.21.0
 + watchfiles==1.1.0
 + websocket-client==1.9.0
 + websockets==15.0.1
 + www-authenticate==0.9.2
 + yarl==1.22.0
 + zipp==3.23.0
Installed 1 executable: dstack
 ---> Removed intermediate container 6477fed8b088
 ---> 25459c0f6e9f
Step 13/16 : COPY entrypoint.sh entrypoint.sh
 ---> 79f0227df939
Step 14/16 : RUN chmod 777 entrypoint.sh
 ---> Running in 949899de5366
 ---> Removed intermediate container 949899de5366
 ---> 9a038c603907
Step 15/16 : EXPOSE 3000
 ---> Running in c3d383c24a40
 ---> Removed intermediate container c3d383c24a40
 ---> 15d28b11bf5c
Step 16/16 : ENTRYPOINT ["./entrypoint.sh"]
 ---> Running in 7799e900076c
 ---> Removed intermediate container 7799e900076c
 ---> 7ab621b98397
Successfully built 7ab621b98397
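
The architecture selection in Step 8 of the log can be sketched as a small function. It reuses the download URL pattern shown in that step; the function name is hypothetical, and, like the Dockerfile, it treats every machine type other than aarch64 as amd64.

```shell
# Hypothetical helper mirroring Step 8: map `uname -m` output to the
# architecture suffix used in the Litestream .deb file name.
litestream_deb_url() {
  ls_version=$1   # e.g. 0.5.0
  machine=$2      # output of `uname -m`
  case "$machine" in
    aarch64) ls_arch=arm64 ;;
    *)       ls_arch=amd64 ;;
  esac
  echo "https://github.com/benbjohnson/litestream/releases/download/v${ls_version}/litestream-${ls_version}-linux-${ls_arch}.deb"
}

# Usage:
#   curl -fL -O "$(litestream_deb_url 0.5.0 "$(uname -m)")"
```

Passing -f to curl makes a wrong URL fail at download time rather than later in dpkg.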

@peterschmidt85
Contributor

peterschmidt85 commented Oct 8, 2025

@DragonStuff

  1. If one tries to run the new image with Litestream enabled, it will hit benbjohnson/litestream#774 ("0.5.0: flag provided but not defined: -if-replica-exists"). I suppose the error was introduced in 0.5.0.
  2. Please see my question above (feat(docker): upgrade litestream to v0.5.0 #3165 (comment)), as it's quite important.

@DragonStuff
Contributor Author

@peterschmidt85 Good catch. Having worked with Litestream on another project, I know it has a few caveats and is definitely still pre-v1. Using PostgreSQL may well be the better route, but I was thinking it would be really nice not to have another PostgreSQL database to manage.

Don't worry, I totally understand the risks of basically having a single server for dstack, and after we trial it internally, we'll move to the PostgreSQL option.

I've written a short bash script that implements the functionality of the -if-replica-exists flag, and it seems to work quite well.
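
The script itself is not shown in this thread. One way such a workaround could look is sketched below; this is not the script from the PR. It assumes the installed Litestream accepts `litestream restore -o <path> <replica-url>`, and the function name and the LITESTREAM_BIN override are hypothetical.

```shell
# Hypothetical sketch: restore only when no local database exists, and keep
# going with an empty database if the restore does not succeed.
restore_if_replica_exists() {
  db_path=$1
  replica_url=$2
  if [ -f "$db_path" ]; then
    echo "database already present, skipping restore"
    return 0
  fi
  if "${LITESTREAM_BIN:-litestream}" restore -o "$db_path" "$replica_url"; then
    echo "database restored from replica"
  else
    echo "restore failed or no replica found, starting with an empty database"
  fi
  return 0
}
```

Note the trade-off: this treats every restore failure as "no replica yet", so a real problem such as bad credentials would also lead to starting from an empty database. A stricter version would need to tell those cases apart.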

[Image]

After deleting the container and restarting, it comes up correctly as well.

[Screenshot, 2025-10-08 22:21:50]

Perhaps a follow-up PR could add a way to set Litestream arguments from environment variables.
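
A rough sketch of what that follow-up might look like. LITESTREAM_EXTRA_ARGS is a hypothetical variable name, not an existing dstack or Litestream setting, and "dstack server" in the usage line only stands in for whatever command the entrypoint actually runs.

```shell
# Hypothetical sketch: build the argument list for `litestream replicate`,
# inserting extra arguments taken from an environment variable.
litestream_replicate_args() {
  # Unquoted on purpose so the variable is split on whitespace. Glob
  # characters in the value would also be expanded, so keep it to plain flags.
  set -- replicate ${LITESTREAM_EXTRA_ARGS:-} "$@"
  printf '%s\n' "$@"   # one argument per line, for inspection
}

# Usage:
#   LITESTREAM_EXTRA_ARGS="-config /etc/litestream.yml"
#   litestream_replicate_args -exec "dstack server"
```

Printing the arguments instead of running them keeps the sketch easy to check; an entrypoint would pass the same list to litestream directly.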

Contributor

@peterschmidt85 peterschmidt85 left a comment

From my side, it looks good. @r4victor please have a look too just in case!

@peterschmidt85 peterschmidt85 merged commit 7c7ed7f into dstackai:master Oct 8, 2025
@peterschmidt85
Contributor

@DragonStuff Merged. Please give it a thorough test, and let us know if all is good!

@DragonStuff
Contributor Author

> @DragonStuff Merged. Please give it a thorough test, and let us know if all is good!

@peterschmidt85, we tested overnight with a service running and the dstack deployment on spot instances, so that it would be interrupted a few times. Everything seems to work as expected.

We will keep this running in our staging environment, and if we encounter any issues, I'll open an issue and a PR to fix them.

Thanks again for your prompt assistance, dstack is awesome! 🥂
