[FlexAttention] Allow num_warps 8 when block size >= 128 #143299
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/143299
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit a5420d8 with merge base 7ab3177.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Fixes pytorch#143331
Pull Request resolved: pytorch#143344
Approved by: https://github.com/Chillee
ghstack dependencies: pytorch#143299

# Summary
Fixes pytorch#143290
We already strip bad configs here: https://github.com/pytorch/pytorch/blob/e0e763e33135d2ad25c613007aa5f2fee6d2cc24/torch/_inductor/kernel/flex_attention.py#L2299
So this shouldn't be needed. Confirming that the 64 x 128 case is valid; otherwise we can just change the default config.
Pull Request resolved: pytorch#143299
Approved by: https://github.com/yanboliang
ghstack-source-id: 6992e3a
Pull Request resolved: pytorch/pytorch#143299
Adding to 2.6.1 as requested by Runway.
Is that not worth making a patch release 2.6.1?
Stack from ghstack (oldest at bottom):
Summary
Fixes #143290
We already strip bad configs here: https://github.com/pytorch/pytorch/blob/e0e763e33135d2ad25c613007aa5f2fee6d2cc24/torch/_inductor/kernel/flex_attention.py#L2299
So this shouldn't be needed. Confirming that the 64 x 128 case is valid; otherwise we can just change the default config.
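For illustration, here is a minimal, hypothetical sketch of the kind of config pruning this relies on. The `FlexConfig` tuple, the `prune_configs` helper, and the 128 threshold are assumptions made for the example; this is not the actual code in torch/_inductor/kernel/flex_attention.py.

```python
# Hypothetical sketch only -- not the real torch/_inductor implementation.
# Idea: instead of forbidding num_warps=8 up front, rely on a pruning pass
# that drops 8-warp configs whose tiles are too small (< 128 in both dims).
from collections import namedtuple

FlexConfig = namedtuple("FlexConfig", ["block_m", "block_n", "num_stages", "num_warps"])

def prune_configs(configs):
    """Keep only configs expected to compile and perform reasonably."""
    kept = []
    for cfg in configs:
        # Assumption for this sketch: 8 warps only make sense once at least
        # one tile dimension reaches 128; smaller tiles with 8 warps are dropped.
        if cfg.num_warps == 8 and max(cfg.block_m, cfg.block_n) < 128:
            continue
        kept.append(cfg)
    return kept

candidates = [
    FlexConfig(128, 64, 3, 8),   # kept: block_m reaches 128
    FlexConfig(64, 128, 3, 8),   # kept: block_n reaches 128 (the 64 x 128 case above)
    FlexConfig(64, 64, 3, 8),    # dropped: 8 warps on a small tile
    FlexConfig(64, 64, 3, 4),    # kept: fewer warps are fine for small tiles
]
print(prune_configs(candidates))
```

Under that assumption, allowing num_warps 8 in the candidate list is safe, because any problematic small-tile combination is stripped before autotuning rather than being rejected by a separate guard.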
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov @Chillee @yanboliang @BoyuanFeng