
Conversation

@MaheshRavishankar (Collaborator) commented Dec 23, 2025

The current pass unconditionally converts an accumulating GEMM to a non-accumulating GEMM. This transformation is only required when the `outs` operand comes from a read-only buffer. When it comes from a read-write buffer, the accumulating GEMM can be handled as is and does not need to be converted to a non-accumulating GEMM.

For a function input like

```
func.func @acc_gemm(%lhs : tensor<?x?xf32>, %rhs: tensor<?x?xf32>,
    %init : tensor<?x?xf32> {iree.abi.output = 0}) -> tensor<?x?xf32> {
  %0 = linalg.matmul ins(%lhs, %rhs : tensor<?x?xf32>, tensor<?x?xf32>)
      outs(%init : tensor<?x?xf32>) -> tensor<?x?xf32>
  return %0 : tensor<?x?xf32>
}
```

the dispatch sees a single `init` binding that is read-write. In those cases we don't need to convert the `linalg.matmul` into a non-accumulating GEMM; this case can be (and is currently) handled natively.
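
By contrast, when the `outs` operand does come from a read-only buffer, the pass rewrites the GEMM into a non-accumulating form. The following is a minimal hand-written sketch of that rewrite, not actual pass output; the use of `tensor.empty`/`linalg.fill` for the zero accumulator and `linalg.add` for adding the init back are assumptions about how the rewrite is expressed (the pass may emit a `linalg.generic` instead).

```
func.func @non_acc_gemm(%lhs : tensor<?x?xf32>, %rhs : tensor<?x?xf32>,
    %init : tensor<?x?xf32>) -> tensor<?x?xf32> {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %zero = arith.constant 0.0 : f32
  // Create and zero-fill a temporary accumulator of the same shape as %init.
  %d0 = tensor.dim %init, %c0 : tensor<?x?xf32>
  %d1 = tensor.dim %init, %c1 : tensor<?x?xf32>
  %empty = tensor.empty(%d0, %d1) : tensor<?x?xf32>
  %filled = linalg.fill ins(%zero : f32) outs(%empty : tensor<?x?xf32>) -> tensor<?x?xf32>
  // Non-accumulating matmul into the zero-filled temporary.
  %mm = linalg.matmul ins(%lhs, %rhs : tensor<?x?xf32>, tensor<?x?xf32>)
      outs(%filled : tensor<?x?xf32>) -> tensor<?x?xf32>
  // Add the original init back elementwise to recover the accumulating semantics.
  %add = linalg.add ins(%mm, %init : tensor<?x?xf32>, tensor<?x?xf32>)
      outs(%empty : tensor<?x?xf32>) -> tensor<?x?xf32>
  return %add : tensor<?x?xf32>
}
```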

ci-extra: test_torch

@jtuyls (Contributor) left a comment

LGTM

@nirvedhmeshram (Contributor) commented

I would like to verify this issue is not again caused by this PR #19546.
I will update once I do a check; please hold off on landing until then.

@nirvedhmeshram (Contributor) left a comment

I will note, though, that in practice with `{iree.abi.output = 0}` I am still seeing a read-only tensor and this pass is still creating the elementwise op; ideally I would have liked to be able to generate the read-write tensor and see how the underlying codegen is handling it.
Here is the IR dump.

@MaheshRavishankar (Collaborator, Author) commented

> I will note, though, that in practice with `{iree.abi.output = 0}` I am still seeing a read-only tensor and this pass is still creating the elementwise op; ideally I would have liked to be able to generate the read-write tensor and see how the underlying codegen is handling it.
> Here is the IR dump.

I'll take a look. I tried a matmul and it was working as expected.

@nirvedhmeshram (Contributor) left a comment

Seems like the

`// -----// IR Dump After GPUCombineLayoutTransformationPass (iree-codegen-gpu-combine-layout-transformation) //----- //`

pass is turning the read-write tensor into a read-only tensor, but I turned it off (with `--iree-llvmgpu-test-combine-layout-transformation=false`) and can confirm things work out for that shape. I then tried a shape that does need padding, and there is an issue with a large private allocation; see gist here.

However, turning it into an elementwise add also has the same issue as documented in #22919, so that is an unrelated problem.

@MaheshRavishankar (Collaborator, Author) commented

> Seems like the
>
> `// -----// IR Dump After GPUCombineLayoutTransformationPass (iree-codegen-gpu-combine-layout-transformation) //----- //`
>
> pass is turning the read-write tensor into a read-only tensor, but I turned it off (with `--iree-llvmgpu-test-combine-layout-transformation=false`) and can confirm things work out for that shape. I then tried a shape that does need padding, and there is an issue with a large private allocation; see gist here.
>
> However, turning it into an elementwise add also has the same issue as documented in #22919, so that is an unrelated problem.

Thanks @nirvedhmeshram. I looked further and adapted the pass to handle `iree_codegen.load_from_buffer` and `iree_codegen.store_to_buffer`. Hopefully that makes these cases easier. There is still a private allocation being created for the shape that needs padding; I think this is indeed a pre-existing problem, but hopefully this makes it easier to handle.
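
For reference, here is a hand-written sketch of the kind of dispatch the updated pass is meant to recognize: the accumulator is read from and written back to the same binding through the `iree_codegen` buffer-access ops, so the `linalg.matmul` can stay in accumulating form. This is not actual compiler output, and the assembly syntax of the `iree_codegen` ops is written from memory, so it may not be exact.

```
func.func @acc_gemm_dispatch(%lhs : tensor<128x64xf32>, %rhs : tensor<64x128xf32>,
    %result_binding : memref<128x128xf32>) {
  // The accumulator is loaded from the read-write result binding...
  %init = iree_codegen.load_from_buffer %result_binding
      : memref<128x128xf32> -> tensor<128x128xf32>
  // ...used as the accumulating outs of the matmul...
  %0 = linalg.matmul ins(%lhs, %rhs : tensor<128x64xf32>, tensor<64x128xf32>)
      outs(%init : tensor<128x128xf32>) -> tensor<128x128xf32>
  // ...and stored back to the same binding.
  iree_codegen.store_to_buffer %0, %result_binding
      : tensor<128x128xf32> into memref<128x128xf32>
  return
}
```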

@MaheshRavishankar merged commit 55430fd into iree-org:main Dec 30, 2025
98 of 102 checks passed
keshavvinayak01 pushed a commit that referenced this pull request Jan 27, 2026
…rectly. (#22975)
