-
Notifications
You must be signed in to change notification settings - Fork 772
{tools}[GCCcore/14.3.0] PyTorch v2.9.1, parameterized v0.9.0, pytest-subtests v0.15.0, ... w/ CUDA 12.9.1 #24926
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
{tools}[GCCcore/14.3.0] PyTorch v2.9.1, parameterized v0.9.0, pytest-subtests v0.15.0, ... w/ CUDA 12.9.1 #24926
Conversation
…tests-0.15.0-GCCcore-14.3.0.eb, PyTorch-2.9.1-foss-2025b-CUDA-12.9.1.eb, unittest-xml-reporting-3.2.0-GCCcore-14.3.0.eb and patches: PyTorch-1.12.1_add-hypothesis-suppression.patch, PyTorch-1.7.0_disable-dev-shm-test.patch, PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch, PyTorch-2.1.0_remove-test-requiring-online-access.patch, PyTorch-2.6.0_show-test-duration.patch, PyTorch-2.6.0_skip-test_segfault.patch, PyTorch-2.7.0_avoid_caffe2_test_cpp_jit.patch, PyTorch-2.7.1_avoid-caffe2-sandcastle-test-lib.patch, PyTorch-2.7.1_skip-test_data_parallel_rnn.patch, PyTorch-2.7.1_skip-test_gds_fails_in_ci.patch, PyTorch-2.7.1_skip-test_mixed_mm_exhaustive_dtypes.patch, PyTorch-2.7.1_skip-tests-requiring-SM90.patch, PyTorch-2.7.1_suport-64bit-BARs.patch, PyTorch-2.7.1_tolerance-test_partial_flat_weights.patch, PyTorch-2.9.0_disable-test_nan_assert.patch, PyTorch-2.9.0_enable-symbolizer-in-test_workspace_allocation_error.patch, PyTorch-2.9.0_fix-attention-squeeze.patch, PyTorch-2.9.0_fix-FP16-CPU-tests-in-test_torchinductor_opinfo.patch, PyTorch-2.9.0_fix-nccl-test-env.patch, PyTorch-2.9.0_fix-test_exclude_padding.patch, PyTorch-2.9.0_fix-test_version_error.patch, PyTorch-2.9.0_honor-XDG_CACHE_HOME.patch, PyTorch-2.9.0_increase-tolerance-in-test_transformers.patch, PyTorch-2.9.0_remove-faulty-close.patch, PyTorch-2.9.0_revert-pybind11-3-change.patch, PyTorch-2.9.0_skip-test_benchmark_on_non_zero_device.patch, PyTorch-2.9.0_skip-test_convolution1-on-H100.patch, PyTorch-2.9.0_skip-test_inductor_all_gather_into_tensor_coalesced.patch, PyTorch-2.9.0_skip-test_original_aten_preserved_pad_mm.patch, PyTorch-2.9.0_skip-test_override-without-CUDA.patch, PyTorch-2.9.0_skip-test_unbacked_reduction.patch, PyTorch-2.9.0_skip-tests-requiring-CUDA-12.8.patch, PyTorch-2.9.0_skip-unexpected-success-in-test_fake_export.patch, PyTorch-2.9.1_skip-RingFlexAttentionTest.patch
|
Diff of new easyconfig(s) against existing ones is too long for a GitHub comment. Use |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as resolved.
This comment was marked as resolved.
Are you using the latest easyblock? It is missing this commit from easybuilders/easybuild-easyblocks#3803 |
This comment was marked as outdated.
This comment was marked as outdated.
|
2025b is using GCC 14 that has new warnings. See pytorch/pytorch#166873 Patch added. Seems to only affect ARM |
This comment was marked as outdated.
This comment was marked as outdated.
|
Oh, it is a C file. Updated the patch to also add it to C-flags |
This comment was marked as outdated.
This comment was marked as outdated.
|
Looks like I need to set those values earlier. Can you try again? |
|
Actual failure was an internal GCC compiler error: Test report by @Thyre |
|
Failure may be caused by this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121027 There was a PR which should have worked around this, but seemingly the fix doesn't work?
Maybe we need to patch |
That is not included in this (or any) release yet. I'll add it to the patch list
Would be an option, not sure if it is worth it: This EC is included since EB 5.1.0, although we did that in the past |
|
Test report by @Flamefire |
(created using
eb --new-pr)Includes:
It makes sense to merge #24365 first as any changes there need to be reflected here. But this allows testing both in parallel