Multi-arch (amd64 + arm64) Docker images for running vLLM with the CPU backend.
Built automatically from upstream tags and commits.
- nightly → current commit of main
- <version> (e.g. 0.10.2) → official vLLM release tags
- <commit-hash> → alternative tag for reproducibility
- amd64-*, arm64-* → per-arch builds
- latest → alias to latest release
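To make the tag scheme concrete, here is a minimal sketch of pulling a pinned tag for the current host. The helper names `docker_platform` and `image_ref` are hypothetical, introduced only for illustration, and the sketch assumes the `0.10.2` tag mentioned above has been published to this repository.

```shell
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the Docker platform string for the two architectures in this image.
docker_platform() {
  case "$1" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build the full image reference for a tag.
# Falls back to the `latest` alias when no tag is given.
image_ref() {
  echo "gabrielbico/vllm-cpu:${1:-latest}"
}

# Example (not executed here; assumes the 0.10.2 tag exists):
# docker pull --platform "$(docker_platform "$(uname -m)")" "$(image_ref 0.10.2)"
```

Pinning a release or commit-hash tag instead of `nightly` or `latest` keeps a deployment reproducible, since the two moving tags are rebuilt as upstream changes.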
Run an API server:
docker run --rm -p 8000:8000 \
  -e HUGGING_FACE_HUB_TOKEN="$HF_TOKEN" \
  gabrielbico/vllm-cpu:nightly \
  --model google/gemma-3-270m

Supported platforms:
- linux/amd64 (built on AVX2 hosts)
- linux/arm64 (Apple Silicon, ARM servers)
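Once the container started by the docker run command above is listening on port 8000, it can be queried over HTTP. The following is a sketch, assuming the image exposes vLLM's usual OpenAI-compatible server with its `/v1/models` and `/v1/completions` routes. `BASE_URL`, `completion_payload`, and `query_completion` are hypothetical names chosen for this example.

```shell
# Assumed address of the server published by `-p 8000:8000`.
BASE_URL="${BASE_URL:-http://localhost:8000}"

# Build a JSON body for /v1/completions.
# $1 = model name, $2 = prompt, $3 = max tokens (defaults to 32).
# Note: the prompt is inserted verbatim, so it must not contain
# characters that need JSON escaping (quotes, backslashes, newlines).
completion_payload() {
  printf '{"model": "%s", "prompt": "%s", "max_tokens": %d}' "$1" "$2" "${3:-32}"
}

# Send a completion request to the running server.
query_completion() {
  curl -sS "$BASE_URL/v1/completions" \
    -H "Content-Type: application/json" \
    -d "$(completion_payload "$@")"
}

# Examples (not executed here; they need the server to be running):
# curl -sS "$BASE_URL/v1/models"
# query_completion google/gemma-3-270m "Hello, my name is" 16
```

The model name in the request should match the value passed to `--model` when the container was started.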
These images are CPU-only. For GPU builds, see the official vLLM docs.