[inductor][cutlass] Add neg() and constant() EVT ops for SiLU epilogue fusion #186197

Open
xuhancn wants to merge 1 commit into pytorch:main from xuhancn:xu_evt_neg_constant_ops
@xuhancn xuhancn commented Jun 4, 2026

Summary

Implement the neg() and constant() operations in CutlassEVTOpsMixIn that were previously raising NotImplementedError.

Changes

  • constant(value, dtype): Returns str(float(value)), consistent with CUTLASS PythonASTFrontend literal parsing (e.g., 1 → "1.0")
  • neg(x): Emits (0.0 - x) instead of unary minus because CUTLASS PythonASTFrontend has visit_BinOp but no visit_UnaryOp
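A minimal sketch of the two ops as described above, not the actual PR diff. `EVTOpsSketch` and `top_level_node` are hypothetical names used only for illustration; the real methods live in CutlassEVTOpsMixIn. The `ast` check uses Python's standard parser to show why the rewrite matters: unary minus parses to a `UnaryOp` node, while `(0.0 - x)` parses to a `BinOp` node.

```python
import ast


class EVTOpsSketch:
    """Hypothetical stand-in for CutlassEVTOpsMixIn, reduced to the two new ops."""

    @staticmethod
    def constant(value, dtype=None):
        # Emit the literal as a float string, e.g. 1 -> "1.0".
        # dtype is accepted to mirror the described signature but is unused here.
        return str(float(value))

    @staticmethod
    def neg(x):
        # Emit a binary subtraction instead of unary minus.
        return f"(0.0 - {x})"


def top_level_node(expr):
    """Return the AST node type name that Python's parser assigns to expr."""
    return type(ast.parse(expr, mode="eval").body).__name__


print(EVTOpsSketch.constant(1))       # 1.0
print(EVTOpsSketch.neg("x"))          # (0.0 - x)
print(top_level_node("-x"))           # UnaryOp
print(top_level_node("(0.0 - x)"))    # BinOp
```

A frontend that only implements `visit_BinOp` can therefore handle the second form but not the first.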

Motivation

These ops are required for decomposed SiLU to be representable in the CUTLASS EVT Python DSL:

  • SiLU decomposes to x / (1 + exp(-x))
  • This requires both neg() (for -x) and constant(1) (for the literal 1)

Without these ops, _can_fuse_epilogue_impl() rejects any epilogue containing SiLU, preventing GEMM + SiLU fusion.
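As a sanity check on the decomposition, the sketch below builds the SiLU expression string from the two ops and evaluates it against a direct reference. The helper names (`silu_expr`, `evaluate`, `silu_reference`) and the variable name `x` are assumptions for illustration; the real epilogue is compiled by CUTLASS, not evaluated with Python's `eval`.

```python
import math


def constant(value, dtype=None):
    # Same behavior as described in this PR: float literal as a string.
    return str(float(value))


def neg(x):
    # Binary subtraction in place of unary minus.
    return f"(0.0 - {x})"


def silu_expr(x_name="x"):
    """Build x / (1 + exp(-x)) using only the ops above plus exp, +, and /."""
    return f"{x_name} / ({constant(1)} + exp({neg(x_name)}))"


def evaluate(expr, x):
    # Evaluate a string we built ourselves, with only exp and x in scope.
    return eval(expr, {"__builtins__": {}}, {"exp": math.exp, "x": x})


def silu_reference(x):
    return x / (1.0 + math.exp(-x))


EXPR = silu_expr()
print(EXPR)  # x / (1.0 + exp((0.0 - x)))
for v in (-3.0, -0.5, 0.0, 0.5, 3.0):
    assert math.isclose(evaluate(EXPR, v), silu_reference(v),
                        rel_tol=1e-12, abs_tol=1e-12)
```

Replacing `-x` with `0.0 - x` leaves the computed value unchanged for these inputs, so the rewrite is a change of syntax rather than of semantics.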

Testing

  • test_py_codegen_neg_constant: Validates EVT code generation for constant(1) + exp(neg(x))
  • test_py_codegen_sigmoid_decomposed: Validates full sigmoid decomposition 1/(1+exp(-x)) with composed neg/constant/exp/add/truediv
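The kind of check these tests perform can be approximated as follows. This is a sketch under stated assumptions, not the test code from the PR: the string-emitting `exp`, `add`, and `truediv` helpers, their parenthesization, and the input name `accum` are all hypothetical, and the real tests may assert on a different output format.

```python
import ast


# Hypothetical string-emitting ops; only constant() and neg() follow the
# behavior described in this PR, the rest are illustrative guesses.
def constant(value, dtype=None):
    return str(float(value))


def neg(x):
    return f"(0.0 - {x})"


def exp(x):
    return f"exp({x})"


def add(a, b):
    return f"({a} + {b})"


def truediv(a, b):
    return f"({a} / {b})"


def sigmoid_decomposed(x):
    """Compose 1 / (1 + exp(-x)) from the ops above."""
    return truediv(constant(1), add(constant(1), exp(neg(x))))


code = sigmoid_decomposed("accum")
tree = ast.parse(code, mode="eval")

# The generated expression should contain no unary operators at all.
has_unary = any(isinstance(node, ast.UnaryOp) for node in ast.walk(tree))

print(code)       # (1.0 / (1.0 + exp((0.0 - accum))))
print(has_unary)  # False
```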

cc @eellison @etaf

pytorch-bot Bot commented Jun 4, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/186197

Note: Links to docs will display an error until the docs builds have been completed.

❌ 15 Pending, 1 Unrelated Failure, 6 Unclassified Failures

As of commit 6622b21 with merge base cc46af7 (image):

UNCLASSIFIED FAILURES - DrCI could not classify the following jobs because the workflow did not run on the merge base. The failures may be pre-existing on trunk or introduced by this PR:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

…e fusion

Add two missing EVT ops needed by decomposed SiLU (x / (1 + exp(-x))):

- neg(x): Emits (0.0 - x) since CUTLASS PythonASTFrontend lacks visit_UnaryOp
- constant(value, dtype): Returns str(float(value)) for CUTLASS literal parsing

Without these ops, any epilogue containing neg or constant raises
NotImplementedError, preventing SiLU (and sigmoid) from being fused into
CUTLASS GEMM epilogues.

Tests:
- test_py_codegen_neg_constant: validates neg + constant + exp composition
- test_py_codegen_sigmoid_decomposed: validates full sigmoid decomposition 1/(1+exp(-x))

Co-authored-by: Copilot <[email protected]>

Labels

  • ciflow/inductor
  • ciflow/torchtitan (Run TorchTitan integration tests)
  • ciflow/trunk (Trigger trunk jobs on your pull request)
  • ciflow/xpu (Run XPU CI tasks)
  • module: inductor
  • open source
  • topic: not user facing (topic category)
