[TEST][ATen][CUDA] Skip row-wise scaled matrix multiplication tests on sm_120+ #152814
Conversation
Is this still needed after #148421? On a fresh source build, I'm seeing that
test_float8_rowwise_scaling_sanity_use_fast_accum_True_cuda
test_float8_rowwise_scaling_sanity_use_fast_accum_False_cuda
test_scaled_mm_vs_emulated_row_wise
all pass
My bad, it is needed for
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased (dec9eab to c02fe40).
Hmm, is this because CUTLASS is missing those specializations? Or because we are missing something minor on our side that would unblock support, like a missing template specialization, overly restrictive dispatch logic, or missing CMake arch flags to build those kernels for SM120?
Doesn't it support it here? Are we missing dispatch logic?
Does just adding SM120 here fix it, or is SM100 not compatible with SM120?
No, it is not as trivial as adding that.
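To illustrate why simply adding SM120 to the dispatch would not be enough: the kernel has to have a specialization actually compiled for that arch family. The following is a hedged toy model only; ATen's real dispatch is C++ template code, and these kernel names and the `pick_kernel` helper are hypothetical.

```python
# Toy dispatch table keyed by (major, minor) compute capability.
# Kernel names are hypothetical, for illustration only.
KERNELS = {
    (9, 0): "rowwise_scaled_mm_sm90",    # Hopper specialization
    (10, 0): "rowwise_scaled_mm_sm100",  # Blackwell datacenter specialization
    # No sm_120 entry: without a CUTLASS specialization compiled for
    # sm_120, routing those GPUs to the sm_100 kernel is not assumed valid.
}

def pick_kernel(capability):
    """Return a kernel only when a specialization for this exact arch
    family exists; otherwise None, so callers can raise or tests skip."""
    major, _minor = capability
    return KERNELS.get((major, 0))
```

Under this model, an sm_90 or sm_100 device resolves to its specialization, while an sm_120 device resolves to nothing until a matching kernel is built and registered.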
The float8 row-wise scaled matmuls are not supported on Blackwell yet. This PR adds skips to those tests to decrease the noise on sm_120+ machines.

cc @ptrblck @msaroufim @eqy @jerryzh168
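The skip condition described above can be sketched as follows. This is a hedged illustration, not the PR's code: the helper name is hypothetical, and in a real test the capability tuple would come from `torch.cuda.get_device_capability()`.

```python
def is_sm_120_or_newer(capability):
    """True for CUDA compute capability 12.0 (sm_120) and newer.

    `capability` is a (major, minor) tuple, e.g. (12, 0) for sm_120;
    in PyTorch tests it would come from torch.cuda.get_device_capability().
    """
    return tuple(capability) >= (12, 0)
```

A test could then be decorated with an `unittest.skipIf` on this predicate so sm_120+ machines report a skip instead of a noisy failure.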