Description
I have Docker running on my DGX Spark.
I tried pulling an image. It failed after downloading most of the layers, so I tried pulling again, and it started over from scratch.
I have confirmed that I had >200 GB of free disk space in addition to the 30 GB download.
I have confirmed that the disk space was released after each failure.
The logs confirm that docker explicitly cleaned up the downloaded layers.
I was not given the option of keeping the partial state and trying again.
After a couple of tries I set { "max-concurrent-downloads": 1 } in /etc/docker/daemon.json. This changed the downloading behaviour but did not solve the problem.
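For reference, the "(1/5)" in the log extract further down appears to correspond to the daemon's max-download-attempts setting (default 5), which lives in the same daemon.json as max-concurrent-downloads. The following is only a minimal Python sketch of merging both keys into an existing config. The helper name merge_daemon_settings and the value 10 are assumptions for illustration, and whether more attempts would have avoided the cleanup in this case is untested.

```python
import json


def merge_daemon_settings(existing_text, overrides):
    """Return daemon.json text with `overrides` merged into the existing settings.

    `existing_text` is the current file content (may be empty).
    This helper is hypothetical; it only manipulates JSON and never touches dockerd.
    """
    config = json.loads(existing_text) if existing_text.strip() else {}
    config.update(overrides)
    return json.dumps(config, indent=2, sort_keys=True)


# The setting already tried in this report, plus an assumed higher retry limit.
current = '{ "max-concurrent-downloads": 1 }'
print(merge_daemon_settings(current, {"max-download-attempts": 10}))
```

The printed text would have to be written to /etc/docker/daemon.json and the daemon reloaded or restarted before it takes effect.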
Conclusion: 4 or 5 tries. ~2h each time. Many, many GB of download bandwidth consumed for nothing.
Suffice to say this is VERY ANNOYING.
In the end ChatGPT helped me work around it with ctr image pull, followed by an export and an import into Docker, but I shouldn't need to faff around like that.
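The exact commands used for that workaround are not recorded here, so the following Python sketch is a reconstruction under assumptions: it pulls with ctr, exports the image to a tar archive, and loads that archive with docker load. The archive name image.tar, the platform default, and the helper names are hypothetical, and ctr typically needs root privileges.

```python
import shlex
import subprocess


def build_workaround_commands(image, archive="image.tar", platform="linux/arm64"):
    """Build the command sequence: ctr pull, ctr export, docker load.

    `archive` and `platform` are illustrative defaults, not values from the report.
    """
    return [
        ["ctr", "images", "pull", "--platform", platform, image],
        ["ctr", "images", "export", "--platform", platform, archive, image],
        ["docker", "load", "--input", archive],
    ]


def run_workaround(image, **kwargs):
    """Run each step in order and stop at the first failure (not called below)."""
    for cmd in build_workaround_commands(image, **kwargs):
        print("+", shlex.join(cmd))
        subprocess.run(cmd, check=True)


# Print the commands only, so they can be reviewed before running anything.
for cmd in build_workaround_commands("ghcr.io/aeon-7/aeon-vllm-ultimate:latest"):
    print(shlex.join(cmd))
```

Calling run_workaround with the same image reference would execute the three steps. This only illustrates the workaround and is not a fix for the pull behaviour itself.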
Reproduce
- I ran docker pull ghcr.io/aeon-7/aeon-vllm-ultimate:latest
- It failed with "unexpected EOF" when it had nearly completed.
- I ran docker pull .. again
- It started downloading everything again.
Expected behavior
It should not delete everything. It should acknowledge the failure and tell me to either retry or run docker image prune or something.
docker version
Client: Docker Engine - Community
 Version: 29.2.1
 API version: 1.53
 Go version: go1.25.6
 Git commit: a5c7197
 Built: Mon Feb 2 17:16:40 2026
 OS/Arch: linux/arm64
 Context: default

Server: Docker Engine - Community
 Engine:
  Version: 29.2.1
  API version: 1.53 (minimum version 1.44)
  Go version: go1.25.6
  Git commit: 6bc6209
  Built: Mon Feb 2 17:16:40 2026
  OS/Arch: linux/arm64
  Experimental: false
 containerd:
  Version: v2.2.1
  GitCommit: dea7da592f5d1d2b7755e3a161be07f43fad8f75
 runc:
  Version: 1.3.4
  GitCommit: v1.3.4-0-gd6d73eb8
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
docker info
Client: Docker Engine - Community
 Version: 29.2.1
 Context: default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
   Version: v0.31.1
   Path: /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
   Version: v5.0.2
   Path: /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 3
  Running: 1
  Paused: 0
  Stopped: 2
 Images: 7
 Server Version: 29.2.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: dea7da592f5d1d2b7755e3a161be07f43fad8f75
 runc version: v1.3.4-0-gd6d73eb8
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.17.0-1021-nvidia
 Operating System: Ubuntu 24.04.4 LTS
 OSType: linux
 Architecture: aarch64
 CPUs: 20
 Total Memory: 121.6GiB
 Name: gx10-d54e
 ID: c69e21f0-503c-474d-9bd7-a31a7473f20d
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false
 Firewall Backend: iptables
Additional Info
Here is an extract of the logs
juin 24 20:51:05 gx10-1234 systemd[1]: Started docker.service - Docker Application Container Engine.
juin 24 21:00:05 gx10-1234 dockerd[2496204]: time="2026-06-24T21:00:05.223700332+02:00" level=info msg="Download failed, retrying (1/5): unexpected EOF"
juin 24 21:15:05 gx10-1234 dockerd[2496204]: time="2026-06-24T21:15:05.931603647+02:00" level=info msg="Download failed, retrying (1/5): unexpected EOF"
juin 24 21:25:02 gx10-1234 dockerd[2496204]: time="2026-06-24T21:25:02.686214714+02:00" level=info msg="Download failed, retrying (2/5): unexpected EOF"
juin 24 21:35:11 gx10-1234 dockerd[2496204]: time="2026-06-24T21:35:11.913484645+02:00" level=info msg="Download failed, retrying (3/5): unexpected EOF"
juin 24 21:50:03 gx10-1234 dockerd[2496204]: time="2026-06-24T21:50:03.282259340+02:00" level=info msg="Download failed, retrying (1/5): unexpected EOF"
juin 24 22:20:04 gx10-1234 dockerd[2496204]: time="2026-06-24T22:20:04.234511850+02:00" level=info msg="Download failed, retrying (1/5): unexpected EOF"
juin 24 22:31:54 gx10-1234 dockerd[2496204]: time="2026-06-24T22:31:54.880492102+02:00" level=info msg="Download failed, retrying (1/5): unexpected EOF"
juin 24 22:40:02 gx10-1234 dockerd[2496204]: time="2026-06-24T22:40:02.151983720+02:00" level=info msg="Download failed, retrying (2/5): unexpected EOF"
juin 24 22:50:02 gx10-1234 dockerd[2496204]: time="2026-06-24T22:50:02.204764694+02:00" level=info msg="Download failed, retrying (3/5): unexpected EOF"
juin 24 23:00:02 gx10-1234 dockerd[2496204]: time="2026-06-24T23:00:02.472899884+02:00" level=info msg="Download failed, retrying (4/5): unexpected EOF"
juin 24 23:10:03 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:03.735025849+02:00" level=error msg="Download failed after 5 attempts: unexpected EOF"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.315774989+02:00" level=info msg="Attempting next endpoint for pull after error: unexpected EOF"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.316426956+02:00" level=info msg="Cleaned up layer sha256:7f8b21540c9b7c5151bd4b6ac89fde3241dcfebfab7c453f8179aa360c9dddc7" chainID="sha256:7f8b21540c9b7c5151bd4b6ac89fde3241dcfebfab7c453f8179aa360c9dddc7"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.317013259+02:00" level=info msg="Cleaned up layer sha256:9b487e5a17f99f6fed92264864938a33d4b3e6b37298053dad8b67db40a2220e" chainID="sha256:9b487e5a17f99f6fed92264864938a33d4b3e6b37298053dad8b67db40a2220e"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.382075704+02:00" level=info msg="Cleaned up layer sha256:fddae301f9a3423f9f6fe4dacf46cadae8af3661f56fde71d295d07bc7822d36" chainID="sha256:fddae301f9a3423f9f6fe4dacf46cadae8af3661f56fde71d295d07bc7822d36"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.382482776+02:00" level=info msg="Cleaned up layer sha256:37bdd1da697d9a0e1053eecba447f274c46cf6bb5e5c8e02bb62273735b592ac" chainID="sha256:37bdd1da697d9a0e1053eecba447f274c46cf6bb5e5c8e02bb62273735b592ac"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.382719943+02:00" level=info msg="Cleaned up layer sha256:18ab77fbd6dce76c09ce2a98d1bf0b47d94c5b03a2ae0fbcff9044eade2b811c" chainID="sha256:18ab77fbd6dce76c09ce2a98d1bf0b47d94c5b03a2ae0fbcff9044eade2b811c"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.511828388+02:00" level=info msg="Cleaned up layer sha256:588120926b3c3923ce2baeb01070c35dac84e7d624459c30d10e6dceafaf55a6" chainID="sha256:588120926b3c3923ce2baeb01070c35dac84e7d624459c30d10e6dceafaf55a6"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.512253699+02:00" level=info msg="Cleaned up layer sha256:5dfecb6e3b11d1ce8b20f71d7be2d1ead9f7cbf27b5e1b199f42bfb6e77228f9" chainID="sha256:5dfecb6e3b11d1ce8b20f71d7be2d1ead9f7cbf27b5e1b199f42bfb6e77228f9"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.526950457+02:00" level=info msg="Cleaned up layer sha256:354bf67f5ac45455fe6ba75f543f54384d6467465882d28a1666f4aede5b6e48" chainID="sha256:354bf67f5ac45455fe6ba75f543f54384d6467465882d28a1666f4aede5b6e48"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.527293273+02:00" level=info msg="Cleaned up layer sha256:f60bc58a16563049217be9dd315f12478f6935383a98dc0fd91ce69dd97c6c8e" chainID="sha256:f60bc58a16563049217be9dd315f12478f6935383a98dc0fd91ce69dd97c6c8e"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.686775040+02:00" level=info msg="Cleaned up layer sha256:3e9dd92cb928a4dd18f03a92a7ead8467e880619c16b72cdc8f3b09ec4a4b469" chainID="sha256:3e9dd92cb928a4dd18f03a92a7ead8467e880619c16b72cdc8f3b09ec4a4b469"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.687188799+02:00" level=info msg="Cleaned up layer sha256:1bca040490000c5ab406be29cb512fbc8360431b9fb5c3f95210293602106e85" chainID="sha256:1bca040490000c5ab406be29cb512fbc8360431b9fb5c3f95210293602106e85"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.688603293+02:00" level=info msg="Cleaned up layer sha256:987f04f7e32e3c103525b73c5eda5d0b62220e399579a7fd375d647a328cc5a0" chainID="sha256:987f04f7e32e3c103525b73c5eda5d0b62220e399579a7fd375d647a328cc5a0"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.695358353+02:00" level=info msg="Cleaned up layer sha256:a5195778d451f979e2c879016720a580d8159db539e4d1823d2c6b0080c8a338" chainID="sha256:a5195778d451f979e2c879016720a580d8159db539e4d1823d2c6b0080c8a338"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.695368769+02:00" level=info msg="Cleaned up layer sha256:379ac51529be0216024142fb2af0a2d6ce57314739261b84c5df0094419a9529" chainID="sha256:379ac51529be0216024142fb2af0a2d6ce57314739261b84c5df0094419a9529"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.695638449+02:00" level=info msg="Cleaned up layer sha256:1a1d70b3b847a87a2a7d6a9d5d8b6f51932257800db2cb81e9bb3de91f26ebef" chainID="sha256:1a1d70b3b847a87a2a7d6a9d5d8b6f51932257800db2cb81e9bb3de91f26ebef"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.789593851+02:00" level=info msg="Cleaned up layer sha256:edde9bd485f2d6e48fc36d331ded706877774afe3f074ef1f3c23cbc696f2751" chainID="sha256:edde9bd485f2d6e48fc36d331ded706877774afe3f074ef1f3c23cbc696f2751"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.789987915+02:00" level=info msg="Cleaned up layer sha256:41dc0ad51bdb529b32b6a052421b565f8cfe3f088ccf8bf29ef42f0b89ed830c" chainID="sha256:41dc0ad51bdb529b32b6a052421b565f8cfe3f088ccf8bf29ef42f0b89ed830c"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.922396113+02:00" level=info msg="Cleaned up layer sha256:95ae8b4b24ad3dc781076cd9e2ba527ef95177369c753f81441a5a471a0f8bcc" chainID="sha256:95ae8b4b24ad3dc781076cd9e2ba527ef95177369c753f81441a5a471a0f8bcc"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.924094686+02:00" level=info msg="Cleaned up layer sha256:36b3968c905ca56bc018d383e86d9335b6b872a1d8763f8e7052aaf1f507f1c5" chainID="sha256:36b3968c905ca56bc018d383e86d9335b6b872a1d8763f8e7052aaf1f507f1c5"
juin 24 23:10:04 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:04.924104862+02:00" level=info msg="Cleaned up layer sha256:982e05793ec0cd5b5c31d24b92c5568e825c43fcee38b20413d4849d56a2d44e" chainID="sha256:982e05793ec0cd5b5c31d24b92c5568e825c43fcee38b20413d4849d56a2d44e"
juin 24 23:10:05 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:05.847137699+02:00" level=info msg="Cleaned up layer sha256:6e41a44df3614217dfd2d2c90049feeec9ca69e8d7c21fad4a8a038b6f387b46" chainID="sha256:6e41a44df3614217dfd2d2c90049feeec9ca69e8d7c21fad4a8a038b6f387b46"
juin 24 23:10:06 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:06.851815747+02:00" level=info msg="Cleaned up layer sha256:9e88a345096e56d7e5e5e99a4e190beb9fe40be62096166bdf1e39786ba755d2" chainID="sha256:9e88a345096e56d7e5e5e99a4e190beb9fe40be62096166bdf1e39786ba755d2"
juin 24 23:10:06 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:06.851838323+02:00" level=info msg="Cleaned up layer sha256:5b59426e3bf42b5d40f63b2d9b6866ae55be2801c39fc693aeae180290f2ef3b" chainID="sha256:5b59426e3bf42b5d40f63b2d9b6866ae55be2801c39fc693aeae180290f2ef3b"
juin 24 23:10:06 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:06.859746325+02:00" level=info msg="Cleaned up layer sha256:bc4717bc8e455239ed880c574d6a3d396793eee6140cfde7c8a0801c6a86f850" chainID="sha256:bc4717bc8e455239ed880c574d6a3d396793eee6140cfde7c8a0801c6a86f850"
juin 24 23:10:06 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:06.859762597+02:00" level=info msg="Cleaned up layer sha256:f051ab59f82595a284805651ae54c6ccfafb58e9ed105ff93fa5b369ebaad947" chainID="sha256:f051ab59f82595a284805651ae54c6ccfafb58e9ed105ff93fa5b369ebaad947"
juin 24 23:10:07 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:07.296838063+02:00" level=info msg="Cleaned up layer sha256:0d431d2c842c9dba3cef770924246b13df1fa2bc633c965131aad7403fbfc8de" chainID="sha256:0d431d2c842c9dba3cef770924246b13df1fa2bc633c965131aad7403fbfc8de"
juin 24 23:10:07 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:07.296863679+02:00" level=info msg="Cleaned up layer sha256:f490582328c6b340cd93e460cf9b5556ef15915d88722487af09d9576725aad3" chainID="sha256:f490582328c6b340cd93e460cf9b5556ef15915d88722487af09d9576725aad3"
juin 24 23:10:07 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:07.296868143+02:00" level=info msg="Cleaned up layer sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92" chainID="sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92"
juin 24 23:10:07 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:07.296872607+02:00" level=info msg="Cleaned up layer sha256:1d70db596aca6afcb88096610ae3d3b2d9222d85447325707570fc8372bcf790" chainID="sha256:1d70db596aca6afcb88096610ae3d3b2d9222d85447325707570fc8372bcf790"
juin 24 23:29:05 gx10-1234 dockerd[2496204]: time="2026-06-24T23:29:05.084857336+02:00" level=info msg="Not continuing with pull after error" error="context canceled"
juin 24 23:29:05 gx10-1234 dockerd[2496204]: time="2026-06-24T23:29:05.085400502+02:00" level=info msg="Cleaned up layer sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92" chainID="sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92"
juin 24 23:29:05 gx10-1234 dockerd[2496204]: time="2026-06-24T23:29:05.085679205+02:00" level=info msg="Cleaned up layer sha256:1d70db596aca6afcb88096610ae3d3b2d9222d85447325707570fc8372bcf790" chainID="sha256:1d70db596aca6afcb88096610ae3d3b2d9222d85447325707570fc8372bcf790"
juin 24 23:33:49 gx10-1234 dockerd[2496204]: time="2026-06-24T23:33:49.760713421+02:00" level=info msg="Not continuing with pull after error" error="context canceled"
juin 24 23:33:49 gx10-1234 dockerd[2496204]: time="2026-06-24T23:33:49.761425090+02:00" level=info msg="Cleaned up layer sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92" chainID="sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92"
juin 24 23:33:49 gx10-1234 dockerd[2496204]: time="2026-06-24T23:33:49.761950565+02:00" level=info msg="Cleaned up layer sha256:1d70db596aca6afcb88096610ae3d3b2d9222d85447325707570fc8372bcf790" chainID="sha256:1d70db596aca6afcb88096610ae3d3b2d9222d85447325707570fc8372bcf790"
juin 25 05:22:47 gx10-1234 dockerd[2496204]: time="2026-06-25T05:22:47.285206643+02:00" level=info msg="Not continuing with pull after error" error="context canceled"
juin 25 05:22:47 gx10-1234 dockerd[2496204]: time="2026-06-25T05:22:47.285723568+02:00" level=info msg="Cleaned up layer sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92" chainID="sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92"
juin 25 05:22:47 gx10-1234 dockerd[2496204]: time="2026-06-25T05:22:47.287296311+02:00" level=info msg="Cleaned up layer sha256:1d70db596aca6afcb88096610ae3d3b2d9222d85447325707570fc8372bcf790" chainID="sha256:1d70db596aca6afcb88096610ae3d3b2d9222d85447325707570fc8372bcf790"
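To make the pattern in the extract above easier to see, here is a small Python sketch that tallies retries, exhausted downloads, and layer cleanups from dockerd log lines. The function name summarize and the regular expressions are assumptions written against the message formats shown in this extract only; they are not an official log schema.

```python
import re

# Patterns matched against the msg="..." field of the dockerd lines in this extract.
RETRY = re.compile(r'msg="Download failed, retrying \((\d+)/(\d+)\): ')
GAVE_UP = re.compile(r'msg="Download failed after (\d+) attempts: ')
CLEANED = re.compile(r'msg="Cleaned up layer (sha256:[0-9a-f]+)"')


def summarize(lines):
    """Count retry, give-up, and cleanup events in an iterable of log lines."""
    counts = {"retries": 0, "exhausted": 0, "cleanup_events": 0}
    layers = set()
    for line in lines:
        if RETRY.search(line):
            counts["retries"] += 1
        elif GAVE_UP.search(line):
            counts["exhausted"] += 1
        else:
            match = CLEANED.search(line)
            if match:
                counts["cleanup_events"] += 1
                layers.add(match.group(1))
    counts["distinct_layers_cleaned"] = len(layers)
    return counts


# Four lines copied from the extract above.
SAMPLE = [
    'juin 24 21:00:05 gx10-1234 dockerd[2496204]: time="2026-06-24T21:00:05.223700332+02:00" level=info msg="Download failed, retrying (1/5): unexpected EOF"',
    'juin 24 23:10:03 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:03.735025849+02:00" level=error msg="Download failed after 5 attempts: unexpected EOF"',
    'juin 24 23:10:07 gx10-1234 dockerd[2496204]: time="2026-06-24T23:10:07.296868143+02:00" level=info msg="Cleaned up layer sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92" chainID="sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92"',
    'juin 24 23:29:05 gx10-1234 dockerd[2496204]: time="2026-06-24T23:29:05.085400502+02:00" level=info msg="Cleaned up layer sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92" chainID="sha256:c3543f890e345e87a61bb44eb34082ea2320c68929ed3cd67d7469091db34d92"',
]
print(summarize(SAMPLE))
```

The same function could be fed the full output of journalctl -u docker.service. A layer digest that shows up in several cleanup events, as c3543f89... does in the sample, was downloaded and discarded more than once.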