
[RISCV] Implement codegen for XAndesPerf lea instructions #137925

Open
wants to merge 6 commits into
base: main

tclin914
Contributor

This patch adds the patterns for generating XAndesPerf lea instructions.

The operation of LEA family instructions is:

rd = rs1 + rs2 * (the number of bytes)

The variants with the *.ze suffix are RV64-only, and their operation is:

rd = rs1 + ZE32(rs2[31:0]) * (the number of bytes)
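
The two operations above can be sketched as a small Python model. This is only an illustration under stated assumptions, not the Andes specification or LLVM's implementation: registers are treated as unsigned XLEN-bit integers, the function names `lea` and `lea_ze` are hypothetical, and `shamt` stands for log2 of the number of bytes (1 for h, 2 for w, 3 for d).

```python
def lea(rs1: int, rs2: int, shamt: int, xlen: int = 64) -> int:
    """Model rd = rs1 + rs2 * (1 << shamt), wrapping at xlen bits."""
    mask = (1 << xlen) - 1
    return (rs1 + (rs2 << shamt)) & mask


def lea_ze(rs1: int, rs2: int, shamt: int) -> int:
    """Model the RV64-only *.ze form: zero-extend rs2[31:0] before scaling."""
    mask64 = (1 << 64) - 1
    index = rs2 & 0xFFFFFFFF  # ZE32(rs2[31:0]): drop the upper 32 bits
    return (rs1 + (index << shamt)) & mask64
```

In this model, `lea(0x1000, 3, 2)` corresponds to a word-scaled address computation with base 0x1000 and index 3, and `lea_ze` ignores whatever is in the upper half of `rs2`.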

@llvmbot
Member

llvmbot commented Apr 30, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Jim Lin (tclin914)

Changes

This patch adds the patterns for generating XAndesPerf lea instructions.

The operation of LEA family instructions is:

rd = rs1 + rs2 * (the number of bytes)

The variants with the *.ze suffix are RV64-only, and their operation is:

rd = rs1 + ZE32(rs2[31:0]) * (the number of bytes)


Full diff: https://github.com/llvm/llvm-project/pull/137925.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoXAndes.td (+25)
  • (added) llvm/test/CodeGen/RISCV/rv32xandesperf.ll (+33)
  • (added) llvm/test/CodeGen/RISCV/rv64xandesperf.ll (+46)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoXAndes.td b/llvm/lib/Target/RISCV/RISCVInstrInfoXAndes.td
index 2ec768435259c..e59a7c9353b8a 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoXAndes.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoXAndes.td
@@ -356,3 +356,28 @@ def NDS_LDGP  : NDSRVInstLDGP<0b011, "nds.ldgp">;
 def NDS_SDGP  : NDSRVInstSDGP<0b111, "nds.sdgp">;
 } // Predicates = [HasVendorXAndesPerf, IsRV64]
 } // DecoderNamespace = "XAndes"
+
+// Patterns
+
+let Predicates = [HasVendorXAndesPerf] in {
+class NDS_LEAPat<int shamt, RVInstR Inst>
+    : Pat<(add (XLenVT GPR:$rs1), (shl GPR:$rs2, (XLenVT shamt))),
+          (Inst GPR:$rs1, GPR:$rs2)>;
+
+def : NDS_LEAPat<1, NDS_LEA_H>;
+def : NDS_LEAPat<2, NDS_LEA_W>;
+def : NDS_LEAPat<3, NDS_LEA_D>;
+} // Predicates = [HasVendorXAndesPerf]
+
+let Predicates = [HasVendorXAndesPerf, IsRV64] in {
+def : Pat<(add (XLenVT GPR:$rs1), (zexti32 (i64 GPR:$rs2))),
+          (NDS_LEA_B_ZE GPR:$rs1, GPR:$rs2)>;
+
+class NDS_LEA_ZEPat<int shamt, RVInstR Inst>
+    : Pat<(add GPR:$rs1, (shl (zexti32 (XLenVT GPR:$rs2)), (XLenVT shamt))),
+          (Inst GPR:$rs1, GPR:$rs2)>;
+
+def : NDS_LEA_ZEPat<1, NDS_LEA_H_ZE>;
+def : NDS_LEA_ZEPat<2, NDS_LEA_W_ZE>;
+def : NDS_LEA_ZEPat<3, NDS_LEA_D_ZE>;
+} // Predicates = [HasVendorXAndesPerf, IsRV64]
diff --git a/llvm/test/CodeGen/RISCV/rv32xandesperf.ll b/llvm/test/CodeGen/RISCV/rv32xandesperf.ll
new file mode 100644
index 0000000000000..e99c9ee5af587
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rv32xandesperf.ll
@@ -0,0 +1,33 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -O0 -mtriple=riscv32 -mattr=+xandesperf -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s
+
+define i32 @lea_h(i32 %a, i32 %b) {
+; CHECK-LABEL: lea_h:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    nds.lea.h a0, a0, a1
+; CHECK-NEXT:    ret
+  %shl = shl i32 %b, 1
+  %ret = add i32 %a, %shl
+  ret i32 %ret
+}
+
+define i32 @lea_w(i32 %a, i32 %b) {
+; CHECK-LABEL: lea_w:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    nds.lea.w a0, a0, a1
+; CHECK-NEXT:    ret
+  %shl = shl i32 %b, 2
+  %ret = add i32 %a, %shl
+  ret i32 %ret
+}
+
+define i32 @lea_d(i32 %a, i32 %b) {
+; CHECK-LABEL: lea_d:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    nds.lea.d a0, a0, a1
+; CHECK-NEXT:    ret
+  %shl = shl i32 %b, 3
+  %ret = add i32 %a, %shl
+  ret i32 %ret
+}
diff --git a/llvm/test/CodeGen/RISCV/rv64xandesperf.ll b/llvm/test/CodeGen/RISCV/rv64xandesperf.ll
new file mode 100644
index 0000000000000..6349272847a4a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rv64xandesperf.ll
@@ -0,0 +1,46 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=riscv64 -mattr=+xandesperf -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s
+
+define i64 @lea_b_ze(i32 %a, i64 %b) {
+; CHECK-LABEL: lea_b_ze:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    nds.lea.b.ze a0, a1, a0
+; CHECK-NEXT:    ret
+  %conv = zext i32 %a to i64
+  %add = add i64 %conv, %b
+  ret i64 %add
+}
+
+define i64 @lea_h_ze(i32 %a, i64 %b) {
+; CHECK-LABEL: lea_h_ze:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    nds.lea.h.ze a0, a1, a0
+; CHECK-NEXT:    ret
+  %conv = zext i32 %a to i64
+  %shl = shl nuw nsw i64 %conv, 1
+  %add = add i64 %shl, %b
+  ret i64 %add
+}
+
+define i64 @lea_w_ze(i32 %a, i64 %b) {
+; CHECK-LABEL: lea_w_ze:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    nds.lea.w.ze a0, a1, a0
+; CHECK-NEXT:    ret
+  %conv = zext i32 %a to i64
+  %shl = shl nuw nsw i64 %conv, 2
+  %add = add i64 %shl, %b
+  ret i64 %add
+}
+
+define i64 @lea_d_ze(i32 %a, i64 %b) {
+; CHECK-LABEL: lea_d_ze:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    nds.lea.d.ze a0, a1, a0
+; CHECK-NEXT:    ret
+  %conv = zext i32 %a to i64
+  %shl = shl nuw nsw i64 %conv, 3
+  %add = add i64 %shl, %b
+  ret i64 %add
+}


let Predicates = [HasVendorXAndesPerf] in {
class NDS_LEAPat<int shamt, RVInstR Inst>
: Pat<(add (XLenVT GPR:$rs1), (shl GPR:$rs2, (XLenVT shamt))),
Collaborator

You probably want to use add_like_non_imm12 instead of add. That's what we do for sh1add/sh2add/sh3add from Zba. That will handle or disjoint and prevent using the instruction if an ADDI could be used.
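
Two facts sit behind this suggestion, sketched here in Python rather than TableGen. First, an `or` whose operands share no set bits computes the same value as an `add`, which is why an add-like matcher can also cover `or disjoint`. Second, RISC-V `ADDI` takes a 12-bit signed immediate, so an addend in the range [-2048, 2047] can be left to `ADDI`. The helper names below are hypothetical, and this is not how LLVM implements `add_like_non_imm12`.

```python
def is_disjoint(a: int, b: int) -> bool:
    """True when a and b have no set bits in common."""
    return (a & b) == 0


def or_matches_add(a: int, b: int) -> bool:
    """For disjoint operands, bitwise OR and addition give the same value."""
    return is_disjoint(a, b) and (a | b) == (a + b)


def fits_simm12(value: int) -> bool:
    """True when value fits the 12-bit signed immediate of ADDI."""
    return -2048 <= value <= 2047
```

For instance, an aligned base such as 0x1000 and a scaled index of 12 are disjoint, so `0x1000 | 12` and `0x1000 + 12` agree.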

Contributor Author

Thanks for pointing that out. I've updated the patterns for Zba to be multiclass/class so they can be reused.

(NDS_LEA_B_ZE GPR:$rs1, GPR:$rs2)>;

class NDS_LEA_ZEPat<int shamt, RVInstR Inst>
: Pat<(add GPR:$rs1, (shl (zexti32 (XLenVT GPR:$rs2)), (XLenVT shamt))),
Collaborator

I don't think you want zexti32 here. That will use this form if an AND is present or if the upper 32 bits are known to be 0. You probably only want the AND case. Use (and GPR:$rs2, 0xFFFFFFFF) like we do for sh1add/sh2add/sh3add.
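
One way to read this comment, sketched in Python under the assumption that registers are unsigned 64-bit values: when the upper 32 bits of rs2 are already zero, the plain form and the .ze form compute the same result, so the .ze form only makes a difference when an explicit mask is actually needed. The function names are hypothetical models of the halfword-scaled instructions, not LLVM APIs.

```python
MASK64 = (1 << 64) - 1


def lea_h(rs1: int, rs2: int) -> int:
    """Plain form: rd = rs1 + rs2 * 2."""
    return (rs1 + (rs2 << 1)) & MASK64


def lea_h_ze(rs1: int, rs2: int) -> int:
    """.ze form: rd = rs1 + ZE32(rs2[31:0]) * 2."""
    return (rs1 + ((rs2 & 0xFFFFFFFF) << 1)) & MASK64


base = 0x8000
clean = 0x0000000012345678  # upper 32 bits known to be zero
dirty = 0xDEADBEEF12345678  # upper 32 bits hold unrelated data

same_when_clean = lea_h(base, clean) == lea_h_ze(base, clean)
differs_when_dirty = lea_h(base, dirty) != lea_h_ze(base, dirty)
```

Both flags evaluate to true in this model: the two forms agree on `clean` and diverge on `dirty`, where only the .ze form discards the upper half.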

Contributor

@wangpc-pp left a comment

IIUC, they are equivalent to Zba? Maybe we should reuse the base pattern?

@tclin914
Contributor Author

> IIUC, they are equivalent to Zba? Maybe we should reuse the base pattern?

Thanks for the reminder. I've updated the patterns for Zba so that we can reuse them.

GPR:$r)>;

let Predicates = [HasStdExtZba] in {

Collaborator

Remove this blank line and indent everything in this let by 2 spaces

Contributor Author

The bodies of the other let blocks in this .td file aren't indented by 2 spaces. If I make the change only here, the style won't be consistent throughout the file.

tclin914 added 2 commits May 1, 2025 14:26
In Zba, rs1 is the index register and rs2 is the base register. But for the lea
instructions, rs1 is the base register and rs2 is the index register.
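
The operand-role difference in this commit message can be sketched in Python. The Zba side follows the standard definition of `sh1add` (rd = rs2 + (rs1 << 1)); the XAndesPerf side follows the operation given in the PR description. The function name `nds_lea_h` is a hypothetical model, and registers are assumed to be unsigned 64-bit values.

```python
MASK64 = (1 << 64) - 1


def sh1add(rs1: int, rs2: int) -> int:
    """Zba: rs1 is the index (shifted), rs2 is the base."""
    return (rs2 + (rs1 << 1)) & MASK64


def nds_lea_h(rs1: int, rs2: int) -> int:
    """XAndesPerf model: rs1 is the base, rs2 is the index (shifted)."""
    return (rs1 + (rs2 << 1)) & MASK64


base, index = 0x2000, 5
# The same address computation needs the operands in opposite order.
zba_result = sh1add(index, base)
lea_result = nds_lea_h(base, index)
```

Passing the operands in the same order to both functions would scale the wrong register, which is why a shared pattern has to account for the swap.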
@tclin914
Contributor Author

tclin914 commented May 7, 2025

Kindly ping.

@topperc
Collaborator

topperc commented May 7, 2025

This is failing MC tests probably because you changed the order of $rs1 and $rs2 which broke the encoding/decoding.

@tclin914
Contributor Author

tclin914 commented May 7, 2025

> This is failing MC tests probably because you changed the order of $rs1 and $rs2 which broke the encoding/decoding.

Fixed it.
