[AArch64] Enable dead register definitions at O0 #145174

Open: AZero13 wants to merge 2 commits into main

AZero13 (Contributor) commented Jun 21, 2025

Without this pass, the assembler/disassembler output at -O0 shows these instructions as adds or subs rather than as the cmp or cmn aliases. GCC already emits the aliases at -O0. The pass only redirects dead register definitions to the zero register. Will this optimize -O0 code? At most it may slightly reduce register pressure, but that is not the point. The point is that instructions should resolve to their aliases so the output is easier to understand, which is the reason -O0 exists to begin with.
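
To make the alias argument concrete, here is a minimal sketch of the two steps involved. It is not LLVM code: the Inst struct and the functions rewriteDeadDef and printInst are hypothetical names used only for illustration. It assumes the architectural fact that cmp and cmn are the alias spellings of subs and adds when the destination is the zero register.

```cpp
#include <string>

// Hypothetical, simplified model of one AArch64 instruction; not an LLVM type.
struct Inst {
  std::string opcode; // "subs" or "adds"
  std::string dest;   // destination register, e.g. "w8" or "wzr"
  std::string lhs;    // first source register, e.g. "w0"
  std::string rhs;    // second operand, e.g. "#5"
  bool destIsDead;    // true if nothing reads dest afterwards
};

// Models the dead-register-definition rewrite: a result that is never read
// is redirected to the zero register of the matching width.
Inst rewriteDeadDef(Inst inst) {
  if (inst.destIsDead)
    inst.dest = (!inst.dest.empty() && inst.dest[0] == 'x') ? "xzr" : "wzr";
  return inst;
}

// Models alias printing: subs/adds that write the zero register are shown
// as cmp/cmn; everything else is printed in its full three-operand form.
std::string printInst(const Inst &inst) {
  const bool toZero = inst.dest == "wzr" || inst.dest == "xzr";
  if (toZero && inst.opcode == "subs")
    return "cmp " + inst.lhs + ", " + inst.rhs;
  if (toZero && inst.opcode == "adds")
    return "cmn " + inst.lhs + ", " + inst.rhs;
  return inst.opcode + " " + inst.dest + ", " + inst.lhs + ", " + inst.rhs;
}
```

In this model, a subs with a dead destination w8 prints as cmp w0, #5 after the rewrite, while the same instruction without the rewrite keeps its subs w8, w0, #5 spelling. That is the readability difference the description refers to.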

llvmbot (Member) commented Jun 21, 2025

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-aarch64

Author: AZero13 (AZero13)

Full diff: https://github.com/llvm/llvm-project/pull/145174.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64TargetMachine.cpp (+2-2)
  • (added) llvm/test/CodeGen/AArch64/fast-isel-O0-cmp (+38)
diff --git a/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp b/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
index 8150e91c8ba52..cec2c1b8374c6 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
@@ -803,8 +803,8 @@ bool AArch64PassConfig::addILPOpts() {
 
 void AArch64PassConfig::addPreRegAlloc() {
   // Change dead register definitions to refer to the zero register.
-  if (TM->getOptLevel() != CodeGenOptLevel::None &&
-      EnableDeadRegisterElimination)
+  // This is beneficial even at -O0 as we can show CMP/CMN in the assembler output.
+  if (EnableDeadRegisterElimination)
     addPass(createAArch64DeadRegisterDefinitions());
 
   // Use AdvSIMD scalar instructions whenever profitable.
diff --git a/llvm/test/CodeGen/AArch64/fast-isel-O0-cmp b/llvm/test/CodeGen/AArch64/fast-isel-O0-cmp
new file mode 100644
index 0000000000000..e5d97df40db25
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/fast-isel-O0-cmp
@@ -0,0 +1,38 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=aarch64-linux-gnu -O0 -fast-isel -fast-isel-abort=1 -verify-machineinstrs < %s | FileCheck %s --check-prefix=CHECK
+
+; even in -O0, cmp should be cmp
+define i1 @cmp(i32 %0) {
+; CHECK-LABEL: cmp:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    cmp w0, #5
+; CHECK-NEXT:    cset w0, gt
+; CHECK-NEXT:    // kill: def $w1 killed $w0
+; CHECK-NEXT:    ret
+  %2 = icmp sgt i32 %0, 5
+  ret i1 %2
+}
+
+define i1 @cmn(i32 %0) {
+; CHECK-LABEL: cmn:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    cmn w0, #5
+; CHECK-NEXT:    cset w0, gt
+; CHECK-NEXT:    // kill: def $w1 killed $w0
+; CHECK-NEXT:    ret
+  %2 = icmp sgt i32 %0, -5
+  ret i1 %2
+}
+
+; Test that 0 is cmp
+define i1 @cmp0(i32 %0) {
+; CHECK-LABEL: cmp0:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    cmp w0, #0
+; CHECK-NEXT:    cset w0, gt
+; CHECK-NEXT:    // kill: def $w1 killed $w0
+; CHECK-NEXT:    ret
+  %2 = icmp sgt i32 %0, 0
+  ret i1 %2
+}
+
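
The behavioral change in AArch64PassConfig::addPreRegAlloc() can be summarized with a small sketch. The OptLevel enum and both function names below are hypothetical stand-ins, not LLVM definitions; the two conditions mirror the removed and added lines of the diff above.

```cpp
// Hypothetical stand-in for llvm::CodeGenOptLevel; not the LLVM definition.
enum class OptLevel { None, Less, Default, Aggressive };

// Condition before the patch: the pass was skipped at -O0.
bool runsDeadRegDefsBefore(OptLevel level, bool enableFlag) {
  return level != OptLevel::None && enableFlag;
}

// Condition after the patch: only the enable flag decides.
bool runsDeadRegDefsAfter(OptLevel level, bool enableFlag) {
  (void)level; // the optimization level no longer matters
  return enableFlag;
}
```

The two predicates differ only for OptLevel::None with the flag enabled, so under this reading the patch leaves every other optimization level unchanged.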

github-actions bot commented Jun 21, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.
