[CD] Deprecate CUDA 12.8 builds in favor of CUDA 13.0 #179072
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/179072
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 31 Pending, 1 Unrelated Failure
As of commit b130b51 with merge base a74f52b:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict.
Successfully rebased
8d54c4b to 6dd352f
Remove CUDA 12.8 from the binary build matrix and regenerate nightly workflows. CUDA 13.0 is already the stable version, making 12.8 redundant.
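As a rough illustration of what removing a version from the build matrix amounts to, the sketch below filters a version list and derives wheel suffixes such as `cu128`. The names `CUDA_ARCHES_BEFORE`, `CUDA_STABLE`, `drop_cuda_version`, and `wheel_suffix` are hypothetical, and the version list is illustrative only; the real matrix lives in the PyTorch CI scripts and may be structured differently.

```python
from typing import List

# Illustrative values only (assumption): not copied from the PyTorch repository.
CUDA_STABLE = "13.0"
CUDA_ARCHES_BEFORE = ["12.6", "12.8", "13.0"]


def drop_cuda_version(arches: List[str], deprecated: str, stable: str) -> List[str]:
    """Return the version list without `deprecated`, refusing to drop the stable one."""
    if deprecated == stable:
        raise ValueError("refusing to drop the stable CUDA version")
    if stable not in arches:
        raise ValueError("stable CUDA version must stay in the matrix")
    return [version for version in arches if version != deprecated]


def wheel_suffix(version: str) -> str:
    """Map a CUDA version such as '12.8' to a wheel suffix such as 'cu128'."""
    return "cu" + version.replace(".", "")


cuda_arches_after = drop_cuda_version(CUDA_ARCHES_BEFORE, "12.8", CUDA_STABLE)
print(cuda_arches_after)                             # ['12.6', '13.0']
print([wheel_suffix(v) for v in cuda_arches_after])  # ['cu126', 'cu130']
```

In this sketch the nightly workflows would then be regenerated from the shortened list, which is why no `cu128` entry remains afterwards.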
6dd352f to b130b51
@pytorchmergebot merge -f "lint and other workflows look good"
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
## Summary

The release-to-CC dict in `torch/cuda/__init__.py` drove the *"install a PyTorch release that supports one of these CUDA versions: ..."* recommendation, but had drifted from the actual binary build matrix:

- **"12.6"** was missing CC `7.5`: `.ci/manywheel/build_cuda.sh` puts 7.5 in the base list for every release.
- **"12.8"** was deprecated in #179072 (replaced by 13.0). Recommending it is misleading and was the proximate cause of #182250.
- **"13.2"** was missing entirely even though it's in `CUDA_ARCHES` today.

That stale data caused the V100 false-positive in #182250: `cu128` was recommended for a CC 7.0 device, but the actual `cu128` wheel had been built without `sm_70` (arch list `7.5;8.0;8.6;9.0;10.0;12.0`).

This PR updates the dict to match the union of x86_64 and aarch64 `TORCH_CUDA_ARCH_LIST` in `.ci/manywheel/build_cuda.sh` (the build-time source of truth), and adds a comment pointing readers there so the next `CUDA_ARCHES` change knows what else to bump.

Authored with Claude.

Pull Request resolved: #182358
Approved by: https://github.com/malfet
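The drift described in this summary can be sketched as a consistency check between a release-to-CC table and the arch lists the wheels were actually built with. This is a minimal sketch, not the code in `torch/cuda/__init__.py`: the helper names and the `STALE_TABLE` entry are hypothetical, and only the `cu128` arch list is taken from the summary.

```python
from typing import Dict, List, Set


def parse_arch_list(arch_list: str) -> Set[int]:
    """Turn '7.5;8.0' into {75, 80} (compute capability as major * 10 + minor)."""
    ccs = set()
    for item in arch_list.split(";"):
        major, minor = item.strip().split(".")
        ccs.add(int(major) * 10 + int(minor))
    return ccs


def recommend_releases(device_cc: int, release_to_cc: Dict[str, Set[int]]) -> List[str]:
    """List the CUDA releases whose table entry claims support for device_cc."""
    return sorted(rel for rel, ccs in release_to_cc.items() if device_cc in ccs)


def drifted_entries(
    release_to_cc: Dict[str, Set[int]], built: Dict[str, Set[int]]
) -> Dict[str, Set[int]]:
    """Return, per release, the CCs the table claims but the build never compiled."""
    drift = {}
    for rel, claimed in release_to_cc.items():
        extra = claimed - built.get(rel, set())
        if extra:
            drift[rel] = extra
    return drift


# Arch list for the cu128 wheel as quoted in the summary.
BUILT = {"12.8": parse_arch_list("7.5;8.0;8.6;9.0;10.0;12.0")}
# Hypothetical stale table entry that still claims sm_70 for 12.8.
STALE_TABLE = {"12.8": BUILT["12.8"] | {70}}

V100_CC = 70  # CC 7.0
print(recommend_releases(V100_CC, STALE_TABLE))  # ['12.8']
print(recommend_releases(V100_CC, BUILT))        # []
print(drifted_entries(STALE_TABLE, BUILT))       # {'12.8': {70}}
```

With the stale entry, a CC 7.0 device is pointed at a release whose wheel has no `sm_70` kernels; a check along the lines of `drifted_entries` would flag that mismatch whenever the table and the build arch lists diverge.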
Fix stale PYTORCH_RELEASES_CODE_CC dict (fixes #182250) (#182358)
(cherry picked from commit f45ab9e)
Co-authored-by: atalman
Remove CUDA 12.8 from the binary build matrix and regenerate nightly workflows. CUDA 13.0 is already the stable version, making 12.8 redundant.
#178665