forked from Dao-AILab/flash-attention
Pull requests: vllm-project/flash-attention
Fix illegal memory access in FA2 varlen SplitKV early-exit LSE write
#139 opened May 18, 2026 by wangyxbh
[Perf] SM103 tcgen05.ld.red for fused TMEM load + row-max in softmax
#131 opened Apr 9, 2026 by LopezCastroRoberto
Combine kernel: increase pipeline depth from 4 to 8 stages
#124 opened Mar 4, 2026 by jmkuebler
[Frontend] Add FP8 output quantization support to FlashAttention backend
#113 opened Jan 3, 2026 by sachinkumarsingh092
[Kernel] add attention sinks for flash attention2
#103 opened Oct 19, 2025 by dudugong-gitch
Removed the assertion imposed on cu_seqlens_k and seqused_k
#59 opened Mar 29, 2025 by chenyang78
Add back flash_attn_func api (and support FA3) [Don't Merge Yet]
#40 opened Jan 26, 2025 by LucasWilkinson (Collaborator)