[AArch64] Utilize `XAR` for certain vector rotates #137629

Rajveer100 · 2025-04-28T13:16:22Z

Resolves #137162

For cases when there isn't any XOR in the transformation, replace with a zero register.
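
As a rough illustration of why a zero operand is enough: XAR XORs its two inputs and then rotates the result right by an immediate, so with a zero second input it degenerates to a plain rotate. The scalar model below is only a sketch of one 64-bit lane; `rotr64`, `xar64` and `zeroOperandXarMatchesRotate` are hypothetical helper names, not LLVM or ACLE APIs, and the immediate is restricted to 1..63 to keep the C++ shifts well defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of a single 64-bit lane; not an LLVM or ACLE API.
// Rotate right by Amt. Amt is limited to [1, 63] so neither shift is by 64.
static uint64_t rotr64(uint64_t X, unsigned Amt) {
  assert(Amt >= 1 && Amt <= 63);
  return (X >> Amt) | (X << (64 - Amt));
}

// XAR for one lane: XOR the two inputs, then rotate right by Imm.
static uint64_t xar64(uint64_t A, uint64_t B, unsigned Imm) {
  return rotr64(A ^ B, Imm);
}

// Checks that XAR with a zero second operand equals the shl/lshr/or form
// of a rotate for every immediate in [1, 63].
static bool zeroOperandXarMatchesRotate(uint64_t X) {
  for (unsigned Imm = 1; Imm <= 63; ++Imm) {
    uint64_t ShiftPair = (X << (64 - Imm)) | (X >> Imm);
    if (xar64(X, 0, Imm) != ShiftPair)
      return false;
  }
  return true;
}
```

The same identity is what lets the selector reuse the existing XAR path when the shifted value is not itself an XOR.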

llvmbot · 2025-04-28T13:17:02Z

@llvm/pr-subscribers-backend-aarch64

Author: Rajveer Singh Bharadwaj (Rajveer100)

Changes

Resolves #137162

For cases when there isn't any XOR in the transformation, replace with a zero register.

Full diff: https://github.com/llvm/llvm-project/pull/137629.diff

1 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp (+24-9)

diff --git a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
index 40944e3d43d6b..b0559692331d8 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
@@ -13,6 +13,7 @@
 #include "AArch64MachineFunctionInfo.h"
 #include "AArch64TargetMachine.h"
 #include "MCTargetDesc/AArch64AddressingModes.h"
+#include "MCTargetDesc/AArch64MCTargetDesc.h"
 #include "llvm/ADT/APSInt.h"
 #include "llvm/CodeGen/ISDOpcodes.h"
 #include "llvm/CodeGen/SelectionDAGISel.h"
@@ -4558,9 +4559,15 @@ bool AArch64DAGToDAGISel::trySelectXAR(SDNode *N) {
         !TLI->isAllActivePredicate(*CurDAG, N1.getOperand(0)))
       return false;
 
-    SDValue XOR = N0.getOperand(1);
-    if (XOR.getOpcode() != ISD::XOR || XOR != N1.getOperand(1))
-      return false;
+    SDValue R1, R2;
+    if (N0.getOperand(1).getOpcode() != ISD::XOR) {
+      if (N0.getOperand(1) != N1.getOperand(1))
+        return false;
+      SDLoc DL(N1->getOperand(0));
+      SDValue Zero = CurDAG->getRegister(AArch64::XZR, N1->getOperand(0).getValueType());
+      R1 = N1->getOperand(0);
+      R2 = Zero;
+    }
 
     APInt ShlAmt, ShrAmt;
     if (!ISD::isConstantSplatVector(N0.getOperand(2).getNode(), ShlAmt) ||
@@ -4574,7 +4581,7 @@ bool AArch64DAGToDAGISel::trySelectXAR(SDNode *N) {
     SDValue Imm =
         CurDAG->getTargetConstant(ShrAmt.getZExtValue(), DL, MVT::i32);
 
-    SDValue Ops[] = {XOR.getOperand(0), XOR.getOperand(1), Imm};
+    SDValue Ops[] = {R1, R2, Imm};
     if (auto Opc = SelectOpcodeFromVT<SelectTypeKind::Int>(
             VT, {AArch64::XAR_ZZZI_B, AArch64::XAR_ZZZI_H, AArch64::XAR_ZZZI_S,
                  AArch64::XAR_ZZZI_D})) {
@@ -4591,13 +4598,21 @@ bool AArch64DAGToDAGISel::trySelectXAR(SDNode *N) {
       N1->getOpcode() != AArch64ISD::VLSHR)
     return false;
 
-  if (N0->getOperand(0) != N1->getOperand(0) ||
-      N1->getOperand(0)->getOpcode() != ISD::XOR)
+
+  if (N0->getOperand(0) != N1->getOperand(0))
     return false;
 
-  SDValue XOR = N0.getOperand(0);
-  SDValue R1 = XOR.getOperand(0);
-  SDValue R2 = XOR.getOperand(1);
+  SDValue R1, R2;
+  if (N1->getOperand(0)->getOpcode() != ISD::XOR) {
+    SDLoc DL(N1->getOperand(0));
+    SDValue Zero = CurDAG->getRegister(AArch64::XZR, N1->getOperand(0).getValueType());
+    R1 = N1->getOperand(0);
+    R2 = Zero;
+  } else {
+    SDValue XOR = N0.getOperand(0);
+    R1 = XOR.getOperand(0);
+    R2 = XOR.getOperand(1);
+  }
 
   unsigned HsAmt = N0.getConstantOperandVal(1);
   unsigned ShAmt = N1.getConstantOperandVal(1);

Rajveer100 · 2025-04-28T13:17:51Z

I currently face this assertion locally (Apple Silicon) when reproducing the original snippet after the change:

Assertion failed: (*(AsmStrsvreg+RegAsmOffsetvreg[RegNo-1]) && "Invalid alt name index for register!"), function getRegisterName, file AArch64GenAsmWriter.inc, line 24199.

github-actions · 2025-04-28T13:18:54Z

✅ With the latest revision this PR passed the C/C++ code formatter.

davemgreen · 2025-04-30T20:09:10Z

This could do with some extra tests for fixed and scalable vectors, and to make sure the existing tests work. You might need to be careful about how the zero gets generated; I think it might need to generate a MOVIv2d_ns 0, but there are several ways to specify a zero vector and many of them are equivalent.

Rajveer100 · 2025-05-01T11:54:42Z

For scalable vectors, which instruction do we want to use among LDxxx, MOVAZxxx, MOVPRFXxxx, DUPv2i64lane and many others since MOVIv2d_ns (and MOVIxxx) is for fixed size vectors?

davemgreen · 2025-05-01T13:19:18Z

MOVIv2d_ns will actually set all the upper bits too, so it can use MOVIv2d_ns and a SUBREG_TO_REG. Something like SUBREG_TO_REG 0, Vn, zsub will tell the compiler that the fp128 is extended to a zreg and that the top bits are zeroes.

Rajveer100 · 2025-05-03T12:59:23Z

> MOVIv2d_ns will actually set all the upper bits too, so it can use MOVIv2d_ns and a SUBREG_TO_REG. Something like SUBREG_TO_REG 0, Vn, zsub will tell the compiler that the fp128 is extended to a zreg and that the top bits are zeroes.

I am probably not doing it the right way, pushed changes for review.

davemgreen

I think this is looking OK. Can you add some tests for fixed-width and scalable types? Some of the fixed-width sizes do not have an instruction they can use, unless they start to use SVE instructions.

llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp

Rajveer100 · 2025-05-06T08:52:30Z

Updated the failing tests, will add new tests.

davemgreen

> Updated the failing tests, will add new tests.

update_llc_test_checks.py can update the check lines, so long as the output is correct.

llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp

davemgreen

Thanks - this looks good. Can you add tests like xar_instead_of_or but with types <4 x i32>, <8 x i16> and <16 x i8> to llvm/test/CodeGen/AArch64/xar.ll, and tests like xar_nxv2i64_l_neg2 but with types <vscale x 4 x i32>, <vscale x 8 x i16> and <vscale x 16 x i8> to llvm/test/CodeGen/AArch64/sve2-xar.ll. Not all of them are expected to transform, but we should make sure we test all the combos.
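
The lane widths listed above differ only in element size: a rotate written as a shift pair needs the left and right shift amounts to sum to the lane width. The sketch below models that condition on scalars for 8-, 16-, 32- and 64-bit lanes. `isRotatePair` and `rotateViaShifts` are hypothetical names introduced for illustration, and the code says nothing about which of these types actually select XAR.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: true when (X << ShlAmt) | (X >> ShrAmt) is a rotate
// for unsigned lane type T, i.e. both amounts are non-zero and sum to the
// lane width in bits.
template <typename T>
bool isRotatePair(unsigned ShlAmt, unsigned ShrAmt) {
  static_assert(std::is_unsigned<T>::value, "lane type must be unsigned");
  const unsigned Bits = std::numeric_limits<T>::digits;
  return ShlAmt > 0 && ShrAmt > 0 && ShlAmt + ShrAmt == Bits;
}

// Rotate one lane right by ShrAmt using the shift-pair form.
// Requires 0 < ShrAmt < lane width. The value is widened to 64 bits first so
// that narrow lanes are not affected by integer promotion, then truncated.
template <typename T>
T rotateViaShifts(T X, unsigned ShrAmt) {
  static_assert(std::is_unsigned<T>::value, "lane type must be unsigned");
  const unsigned Bits = std::numeric_limits<T>::digits;
  const uint64_t Wide = static_cast<uint64_t>(X);
  return static_cast<T>((Wide >> ShrAmt) | (Wide << (Bits - ShrAmt)));
}
```

For example, `isRotatePair<uint32_t>(22, 10)` holds because 22 + 10 equals the 32-bit lane width, while the same amounts do not form a rotate for a 16-bit lane.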

Rajveer100 · 2025-05-08T11:10:20Z

I have added the additional tests.

Rajveer100 · 2025-05-09T08:53:21Z

Test failure (lldb) seems unrelated to the changes.

davemgreen

Thanks. LGTM

Rajveer100 · 2025-05-09T08:55:55Z

Could you land this for me? I don't have commit access.

davemgreen · 2025-05-09T08:55:55Z

Are you happy for this to be submitted, with the icloud.com email address?

Rajveer100 · 2025-05-09T08:57:13Z

All my previous PRs have been submitted with this, so cool :)

llvmbot added the backend:AArch64 label Apr 28, 2025

Rajveer100 force-pushed the xar-vector-rotate branch 4 times, most recently from ae842ad to c8e32e0 Compare April 30, 2025 13:14

Rajveer100 force-pushed the xar-vector-rotate branch from c8e32e0 to 57329e9 Compare May 1, 2025 11:59

Rajveer100 force-pushed the xar-vector-rotate branch from 57329e9 to c14ca26 Compare May 3, 2025 12:58

davemgreen reviewed May 5, 2025

Rajveer100 force-pushed the xar-vector-rotate branch from c14ca26 to 9c2648d Compare May 6, 2025 08:51

davemgreen reviewed May 7, 2025

davemgreen requested review from nasherm, igogo-x86 and sdesmalen-arm May 7, 2025 07:47

Rajveer100 force-pushed the xar-vector-rotate branch 5 times, most recently from 27b1cd0 to 97bd765 Compare May 8, 2025 10:28

davemgreen reviewed May 8, 2025

[AArch64] Utilize XAR for certain vector rotates

a37b2b7

Resolves llvm#137162 For cases when there isn't any `XOR` in the transformation, replace with a zero register.

Rajveer100 force-pushed the xar-vector-rotate branch from 97bd765 to a37b2b7 Compare May 8, 2025 11:09

davemgreen approved these changes May 9, 2025

davemgreen merged commit 36bb17a into llvm:main May 9, 2025
9 of 11 checks passed
