Fix attention for large sizes#2903

Merged
awni merged 1 commit into main from fix_attn_large_size
Dec 13, 2025
Conversation

@awni (Member) commented Dec 12, 2025

Close #2894

@awni awni requested a review from angeloskath December 12, 2025 21:09
@awni (Member, Author) commented Dec 12, 2025

The change in the mma loader is just to speed it up so we don't lose performance from using an int64 stride for the mask.
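For context, the underlying bug (issue #2894) is an offset overflow: a contiguous boolean mask with more than 2^31 elements produces row offsets that no longer fit in a 32-bit signed integer, so indexing wraps around. A minimal sketch of the arithmetic (illustrative shapes only, not MLX internals):

```python
# Illustrative only: why 32-bit offsets break for masks with > 2^31 elements.
import numpy as np

def mask_offset(row, stride, dtype):
    """Element offset of a mask row in a flat buffer, in a given int width."""
    return dtype(row) * dtype(stride)

seq = 50_000   # a (seq, seq) bool mask has 2.5e9 elements, past 2^31 - 1
stride = seq   # row stride of the contiguous mask

with np.errstate(over="ignore"):
    off32 = mask_offset(49_000, stride, np.int32)  # wraps past INT32_MAX
off64 = mask_offset(49_000, stride, np.int64)      # exact

print(int(off32))  # -1844967296: a wrapped, negative (wrong) offset
print(int(off64))  # 2450000000
```

With a wrapped offset, the kernel reads the wrong mask element, which matches the incorrect results reported in the linked issue.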

@angeloskath (Member) left a comment

This looks great! I presume you ran some tests to check if there is any regression...

@awni (Member, Author) commented Dec 13, 2025

> I presume you ran some tests to check if there is any regression...

Yes, I ran a benchmark for SDPA alone and a model prefill benchmark; there is no change.

In fact, just changing to int64 without changing the loader caused a consistent 1-2% slowdown on an M2 Ultra (so not that bad). Changing the loader brought the performance back.

@awni awni merged commit 47d2505 into main Dec 13, 2025
12 checks passed
@awni awni deleted the fix_attn_large_size branch December 13, 2025 14:54

Development

Successfully merging this pull request may close these issues.

mx.fast.scaled_dot_product_attention produces incorrect results with boolean masks > 2^31 elements
