[RISCV][MC] Add aliases for beq/bne with x0 as the first argument => beqz/bnez #139086

Open: asb wants to merge 2 commits into main

asb
Contributor

@asb asb commented May 8, 2025

I don't see a good reason not to use the alias for this form as well. We have CompressPat for both ways round, so the difference isn't meaningful from that perspective. Because we're mostly compiling with RVC enabled, you mostly see the beq/bne forms with x0 as the first operand in disassembly in the case of the second operand being a register that isn't compressible. I think this just makes assembly harder to read and sticks out for no good reason in such cases.
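
As a rough illustration of the behaviour this change is after, the following Python sketch models a printer that picks the `beqz`/`bnez` alias regardless of which operand is the zero register. The function `print_branch` and its structure are hypothetical; in LLVM the real logic is generated by TableGen from the `InstAlias` definitions.

```python
def print_branch(opcode, rs1, rs2, offset, use_aliases=True):
    """Render a beq/bne the way an alias-aware printer might (illustrative only)."""
    alias = {"beq": "beqz", "bne": "bnez"}.get(opcode)
    if use_aliases and alias is not None:
        if rs2 == "zero":
            # Existing alias: beq rs, zero, offset -> beqz rs, offset
            return f"{alias} {rs1}, {offset}"
        if rs1 == "zero":
            # Alias proposed in this PR: beq zero, rs, offset -> beqz rs, offset
            return f"{alias} {rs2}, {offset}"
    # No alias applies (or aliases are disabled): print the plain form.
    return f"{opcode} {rs1}, {rs2}, {offset}"
```

With aliases enabled, `print_branch("beq", "zero", "a0", 512)` gives `beqz a0, 512`; with `use_aliases=False` it gives `beq zero, a0, 512`, mirroring the CHECK-S and CHECK-S-NOALIAS lines in the test.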

asb added 2 commits May 8, 2025 14:59
I don't see a good reason not to use the alias for this form as well.
We have CompressPat for both ways round, so the difference isn't
meaningful from that perspective. Because we're mostly compiling with
RVC enabled, you only see the beq/bne forms with x0 as the first
operand in the case of the second operand being a register that isn't
compressible. I think this just makes assembly harder to read and sticks
out for no good reason in such cases.
@asb asb requested review from lenary, preames, apazos and topperc May 8, 2025 14:10
@llvmbot llvmbot added backend:RISC-V mc Machine (object) code labels May 8, 2025
@llvmbot
Member

llvmbot commented May 8, 2025

@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-risc-v

Author: Alex Bradbury (asb)

Changes

I don't see a good reason not to use the alias for this form as well.
We have CompressPat for both ways round, so the difference isn't
meaningful from that perspective. Because we're mostly compiling with
RVC enabled, you only see the beq/bne forms with x0 as the first
operand in the case of the second operand being a register that isn't
compressible. I think this just makes assembly harder to read and sticks
out for no good reason in such cases.


Full diff: https://github.com/llvm/llvm-project/pull/139086.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.td (+4)
  • (modified) llvm/test/MC/RISCV/rvi-aliases-valid.s (+16-4)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.td b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
index 4a4290483e94b..361233d20dc1c 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -965,8 +965,12 @@ def : InstAlias<"sgtu $rd, $rs, $rt", (SLTU GPR:$rd, GPR:$rt, GPR:$rs), 0>;
 
 def : InstAlias<"beqz $rs, $offset",
                 (BEQ GPR:$rs,      X0, bare_simm13_lsb0:$offset)>;
+def : InstAlias<"beqz $rs, $offset",
+                (BEQ      X0, GPR:$rs, bare_simm13_lsb0:$offset)>;
 def : InstAlias<"bnez $rs, $offset",
                 (BNE GPR:$rs,      X0, bare_simm13_lsb0:$offset)>;
+def : InstAlias<"bnez $rs, $offset",
+                (BNE      X0, GPR:$rs, bare_simm13_lsb0:$offset)>;
 def : InstAlias<"blez $rs, $offset",
                 (BGE      X0, GPR:$rs, bare_simm13_lsb0:$offset)>;
 def : InstAlias<"bgez $rs, $offset",
diff --git a/llvm/test/MC/RISCV/rvi-aliases-valid.s b/llvm/test/MC/RISCV/rvi-aliases-valid.s
index ef05d1295d44f..0082b12797da6 100644
--- a/llvm/test/MC/RISCV/rvi-aliases-valid.s
+++ b/llvm/test/MC/RISCV/rvi-aliases-valid.s
@@ -122,10 +122,22 @@ bgtu x17, x18, 28
 # CHECK-OBJ: bgeu s3, s2, 0x70
 bleu x18, x19, 32
 
+# Emit beqz/bnez alias even when operands are reversed to the canonical form.
+# CHECK-S-NOALIAS: beq zero, a0, 512
+# CHECK-S: beqz a0, 512
+# CHECK-OBJ-NOALIAS: beq zero, a0, 0x254
+# CHECK-OBJ: beqz a0, 0x254
+beq zero, x10, 512
+# CHECK-S-NOALIAS: bne zero, a1, 1024
+# CHECK-S: bnez a1, 1024
+# CHECK-OBJ-NOALIAS: bne zero, a1, 0x458
+# CHECK-OBJ: bnez a1, 0x458
+bne zero, x11, 1024
+
 # CHECK-S-NOALIAS: jal zero, 2044
 # CHECK-S: j 2044
-# CHECK-OBJ-NOALIAS: jal zero, 0x850
-# CHECK-OBJ: j 0x850
+# CHECK-OBJ-NOALIAS: jal zero, 0x858
+# CHECK-OBJ: j 0x858
 j 2044
 # CHECK-S-NOALIAS: jal zero, foo
 # CHECK-S: j foo
@@ -148,8 +160,8 @@ j a0
 j .
 # CHECK-S-NOALIAS: jal ra, 2040
 # CHECK-S: jal 2040
-# CHECK-OBJ-NOALIAS: jal ra, 0x85c
-# CHECK-OBJ: jal 0x85c
+# CHECK-OBJ-NOALIAS: jal ra, 0x864
+# CHECK-OBJ: jal 0x864
 jal 2040
 # CHECK-S-NOALIAS: jal ra, foo
 # CHECK-S: jal foo
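
The changed `CHECK-OBJ` targets follow from simple address arithmetic, assuming (as the numbers in this test suggest) that the disassembler prints each target as the instruction's own address plus its offset. The instruction addresses below are inferred from the CHECK lines rather than stated in the diff, and `target` is a hypothetical helper.

```python
INSN_SIZE = 4  # uncompressed RISC-V instructions are 4 bytes

def target(insn_addr, offset):
    """PC-relative target: address of the instruction plus its offset."""
    return insn_addr + offset

# Inferred from the new CHECK-OBJ lines: 0x254 - 512 == 0x54
beq_addr = 0x254 - 512
bne_addr = beq_addr + INSN_SIZE

# `j 2044` used to resolve to 0x850; two new branches now precede it.
j_addr_before = 0x850 - 2044
j_addr_after = j_addr_before + 2 * INSN_SIZE

print(hex(target(bne_addr, 1024)))      # 0x458
print(hex(target(j_addr_after, 2044)))  # 0x858
```

The same 8-byte shift explains `jal 2040` moving from 0x85c to 0x864.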

Member

@lenary lenary left a comment

LGTM, and does seem to be neater, for these commutable cases.

Member

@mikhailramalho mikhailramalho left a comment

LGTM

@topperc
Collaborator

topperc commented May 8, 2025

Can we get objdump to do the same?

@asb
Contributor Author

asb commented May 8, 2025

Can we get objdump to do the same?

As in GNU objdump? That would be nice for sure. I don't think it's necessary to block on them following suit though. CC @kito-cheng @HankChang736

@topperc
Collaborator

topperc commented May 8, 2025

Because we're mostly compiling with
RVC enabled, you only see the beq/bne forms with x0 as the first
operand in disassembly in the case of the second operand being a register that isn't
compressible.

c.beqz/c.bnez has a smaller displacement range too doesn't it?
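
For reference, a sketch of the two constraints under discussion: per the RISC-V ISA, `c.beqz`/`c.bnez` take a register from x8-x15 and a 9-bit signed, even offset, while `beq`/`bne` take a 13-bit signed, even offset. The helper names are hypothetical, and this is not how LLVM's CompressPat machinery is implemented.

```python
def fits_signed_even(offset, bits):
    """True if offset is even and representable as a signed `bits`-bit value."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return offset % 2 == 0 and lo <= offset <= hi

def can_compress_beqz(reg_num, offset):
    """Could `beq xN, x0, offset` be encoded as c.beqz? (illustrative check)"""
    is_compressible_reg = 8 <= reg_num <= 15  # x8-x15, i.e. s0-s1 and a0-a5
    return is_compressible_reg and fits_signed_even(offset, 9)
```

Under this check, a branch on a0 (x10) with offset 512 does not compress even though the register qualifies, because 512 is outside the 9-bit range of -256 to 254.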

Collaborator

@topperc topperc left a comment

LGTM

@topperc
Collaborator

topperc commented May 8, 2025

Are these cases also caused by MachineCopyPropagation and TailDuplication?

@jrtc27
Collaborator

jrtc27 commented May 8, 2025

Do we need to set explicit priorities to ensure we get the canonical expansion of beqz/bnez for assembly input?

@@ -965,8 +965,12 @@ def : InstAlias<"sgtu $rd, $rs, $rt", (SLTU GPR:$rd, GPR:$rt, GPR:$rs), 0>;

def : InstAlias<"beqz $rs, $offset",
(BEQ GPR:$rs, X0, bare_simm13_lsb0:$offset)>;
def : InstAlias<"beqz $rs, $offset",
Collaborator

What guarantees the canonical order will be first in the matching table?

Contributor Author

That's a good question. Firstly it's worth noting that we do have test coverage that will catch if the order of the aliases is switched for some reason. Tracing through how it's ordered today:

  • AsmMatcherEmitter adds MatchableInfo generated from the InstAlias to the Matchables vector in the order they are returned from getAllDerivedDefinitions. This isn't explicitly documented as being the source order in the .td, but appears to be so (and I imagine other things would break if that were changed).
  • The Matchables are then sorted using a stable sort.
  • Looking at beqz as an example, neither definition is less than the other according to the comparison function: they have the same number of operands, and the operands compare equal. It compares the two operands MCK_GPR == MCK_GPR and MCK_BareSImm13Lsb0 == MCK_BareSImm13Lsb0 (i.e. it is comparing the alias rather than the transformed instruction). Ultimately the less-than comparison function just returns false, so the aliases remain in source order in the matching table.
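
The ordering argument in the list above can be sketched in Python, whose `sorted` is also stable. The tuple layout and the comparison function are simplified stand-ins chosen for illustration, not the real `MatchableInfo` comparison in AsmMatcherEmitter.

```python
from functools import cmp_to_key

# (mnemonic, operand classes written in the alias, resulting instruction),
# listed in .td source order.
matchables = [
    ("beqz", ("MCK_GPR", "MCK_BareSImm13Lsb0"), "BEQ $rs, X0, $offset"),
    ("beqz", ("MCK_GPR", "MCK_BareSImm13Lsb0"), "BEQ X0, $rs, $offset"),
]

def compare(a, b):
    """Order by mnemonic, operand count, then operand classes.

    The resulting instruction is deliberately ignored, so the two beqz
    aliases compare as equal.
    """
    key_a = (a[0], len(a[1]), a[1])
    key_b = (b[0], len(b[1]), b[1])
    return (key_a > key_b) - (key_a < key_b)

# Stable sort: entries that compare equal keep their input order.
table = sorted(matchables, key=cmp_to_key(compare))

def match(mnemonic, table):
    """Return the first entry for the mnemonic, as a first-match-wins parser would."""
    return next(entry for entry in table if entry[0] == mnemonic)
```

Here `match("beqz", table)` returns the entry that expands to `BEQ $rs, X0, $offset`, because the tie leaves source order untouched. Reversing the input list would reverse the outcome, which is the case the existing test coverage is said to catch.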

Collaborator

I think the initial sort is by Record name. Anonymous records don't have a name, so they are sorted by source order.

@topperc
Collaborator

topperc commented May 8, 2025

Do we need to set explicit priorities to ensure we get the canonical expansion of beqz/bnez for assembly input?

There is no priority mechanism. EmitPriority is only for emission, not input.

@asb
Contributor Author

asb commented May 11, 2025

c.beqz/c.bnez has a smaller displacement range too doesn't it?

Yes, I've clarified this in the PR description.

Are these cases also caused by MachineCopyPropagation and TailDuplication?

The ones I've looked at yes. We could canonicalise such branches in SimplifyInstruction, but I think the main motivation would be for assembly printing (there's a potential secondary motivation for canonicalisation that makes other code easier to write, but I don't think it has an impact as things stand today). As it's such a trivial variant, supporting it directly in the MC layer makes sense to me.

@topperc
Collaborator

topperc commented May 12, 2025

c.beqz/c.bnez has a smaller displacement range too doesn't it?

Yes, I've clarified this in the PR description.

Are these cases also caused by MachineCopyPropagation and TailDuplication?

The ones I've looked at yes. We could canonicalise such branches in SimplifyInstruction, but I think the main motivation would be for assembly printing (there's a potential secondary motivation for canonicalisation that makes other code easier to write, but I don't think it has an impact as things stand today). As it's such a trivial variant, supporting it directly in the MC layer makes sense to me.

Canonicalizing in SimplifyInstruction would ensure that llvm-objdump and GNU objdump both print the same output without needing to change GNU objdump.
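
A minimal sketch of the canonicalisation idea, assuming it amounts to swapping the operands of the commutative `beq`/`bne` so that the zero register comes second. The function `canonicalise_branch` is hypothetical and does not reflect the actual SimplifyInstruction hook.

```python
COMMUTATIVE_BRANCHES = {"beq", "bne"}  # equality tests do not depend on operand order

def canonicalise_branch(opcode, rs1, rs2, offset):
    """Move x0 into the second operand slot of beq/bne; leave everything else alone."""
    if opcode in COMMUTATIVE_BRANCHES and rs1 == "x0" and rs2 != "x0":
        rs1, rs2 = rs2, rs1
    return (opcode, rs1, rs2, offset)
```

Because the instruction itself is rewritten, any disassembler that already recognises the `beq rs, x0` form would print `beqz` without further changes. Non-commutative branches such as `bge` are left untouched.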

Labels
backend:RISC-V mc Machine (object) code
6 participants