[AArch64] Utilize XAR for certain vector rotates #137629
@llvm/pr-subscribers-backend-aarch64
Author: Rajveer Singh Bharadwaj (Rajveer100)
Changes
Resolves #137162
For cases when there isn't any `XOR` in the transformation, replace with a zero register.
Full diff: https://github.com/llvm/llvm-project/pull/137629.diff
1 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
index 40944e3d43d6b..b0559692331d8 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
@@ -13,6 +13,7 @@
#include "AArch64MachineFunctionInfo.h"
#include "AArch64TargetMachine.h"
#include "MCTargetDesc/AArch64AddressingModes.h"
+#include "MCTargetDesc/AArch64MCTargetDesc.h"
#include "llvm/ADT/APSInt.h"
#include "llvm/CodeGen/ISDOpcodes.h"
#include "llvm/CodeGen/SelectionDAGISel.h"
@@ -4558,9 +4559,15 @@ bool AArch64DAGToDAGISel::trySelectXAR(SDNode *N) {
!TLI->isAllActivePredicate(*CurDAG, N1.getOperand(0)))
return false;
- SDValue XOR = N0.getOperand(1);
- if (XOR.getOpcode() != ISD::XOR || XOR != N1.getOperand(1))
- return false;
+ SDValue R1, R2;
+ if (N0.getOperand(1).getOpcode() != ISD::XOR) {
+ if (N0.getOperand(1) != N1.getOperand(1))
+ return false;
+ SDLoc DL(N1->getOperand(0));
+ SDValue Zero = CurDAG->getRegister(AArch64::XZR, N1->getOperand(0).getValueType());
+ R1 = N1->getOperand(0);
+ R2 = Zero;
+ }
APInt ShlAmt, ShrAmt;
if (!ISD::isConstantSplatVector(N0.getOperand(2).getNode(), ShlAmt) ||
@@ -4574,7 +4581,7 @@ bool AArch64DAGToDAGISel::trySelectXAR(SDNode *N) {
SDValue Imm =
CurDAG->getTargetConstant(ShrAmt.getZExtValue(), DL, MVT::i32);
- SDValue Ops[] = {XOR.getOperand(0), XOR.getOperand(1), Imm};
+ SDValue Ops[] = {R1, R2, Imm};
if (auto Opc = SelectOpcodeFromVT<SelectTypeKind::Int>(
VT, {AArch64::XAR_ZZZI_B, AArch64::XAR_ZZZI_H, AArch64::XAR_ZZZI_S,
AArch64::XAR_ZZZI_D})) {
@@ -4591,13 +4598,21 @@ bool AArch64DAGToDAGISel::trySelectXAR(SDNode *N) {
N1->getOpcode() != AArch64ISD::VLSHR)
return false;
- if (N0->getOperand(0) != N1->getOperand(0) ||
- N1->getOperand(0)->getOpcode() != ISD::XOR)
+
+ if (N0->getOperand(0) != N1->getOperand(0))
return false;
- SDValue XOR = N0.getOperand(0);
- SDValue R1 = XOR.getOperand(0);
- SDValue R2 = XOR.getOperand(1);
+ SDValue R1, R2;
+ if (N1->getOperand(0)->getOpcode() != ISD::XOR) {
+ SDLoc DL(N1->getOperand(0));
+ SDValue Zero = CurDAG->getRegister(AArch64::XZR, N1->getOperand(0).getValueType());
+ R1 = N1->getOperand(0);
+ R2 = Zero;
+ } else {
+ SDValue XOR = N0.getOperand(0);
+ R1 = XOR.getOperand(0);
+ R2 = XOR.getOperand(1);
+ }
unsigned HsAmt = N0.getConstantOperandVal(1);
unsigned ShAmt = N1.getConstantOperandVal(1);
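The operand choice in the second hunk can be sketched outside of LLVM as a toy model. This is only an illustration under stated assumptions: `Node`, `XarOperands`, and `selectXarOperands` are hypothetical names, not LLVM's `SDNode`/`SDValue` API, and the check that the two shift amounts sum to the lane width is assumed from the definition of a rotate rather than taken from the lines shown in the diff.

```cpp
#include <cassert>

// Hypothetical toy expression node; this is NOT LLVM's SDNode/SDValue.
struct Node {
  enum Kind { Leaf, Xor } kind;
  int id;           // identifier for leaves (unused for Xor nodes)
  const Node *lhs;  // operands, only meaningful for Xor nodes
  const Node *rhs;
};

// Operands a XAR-like instruction would receive.
struct XarOperands {
  const Node *r1;
  const Node *r2;        // nullptr models the all-zero second operand
  unsigned rotateRight;  // rotate-right immediate
};

// Models matching or(shl(X, shlAmt), srl(X, shrAmt)) on laneBits-wide lanes.
// Returns false when the pattern is not a rotate of a single source.
bool selectXarOperands(const Node *shlSrc, const Node *shrSrc,
                       unsigned shlAmt, unsigned shrAmt, unsigned laneBits,
                       XarOperands &out) {
  if (shlSrc != shrSrc)
    return false;  // both shifts must read the same value
  if (shlAmt + shrAmt != laneBits)
    return false;  // assumption: amounts must add up to a full rotate
  if (shrSrc->kind == Node::Xor) {
    // Existing behaviour: fold the XOR into the instruction.
    out = XarOperands{shrSrc->lhs, shrSrc->rhs, shrAmt};
  } else {
    // New behaviour in this PR: no XOR, so pair the source with zero.
    out = XarOperands{shrSrc, nullptr, shrAmt};
  }
  return true;
}
```

In this model a plain rotate of a leaf `a` yields `{a, zero, shrAmt}`, while a rotate of `xor(a, b)` yields `{a, b, shrAmt}`, mirroring the `if`/`else` added above.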
I currently face this assertion locally (Apple Silicon) when reproducing the original snippet after the change: Assertion failed: (*(AsmStrsvreg+RegAsmOffsetvreg[RegNo-1]) && "Invalid alt name index for register!"), function getRegisterName, file AArch64GenAsmWriter.inc, line 24199.
✅ With the latest revision this PR passed the C/C++ code formatter.
Force-pushed from ae842ad to c8e32e0.
This could do with some extra tests for fixed and scalable vectors, and to make sure the existing tests work. You might need to be careful about how the zero gets generated, I think it might need to generate a
For scalable vectors, which instruction do we want to use among
Force-pushed from c8e32e0 to 57329e9.
MOVIv2d_ns will actually set all the upper bits too, so it can use MOVIv2d_ns and a SUBREG_TO_REG. Something like
Force-pushed from 57329e9 to c14ca26.
I am probably not doing it the right way, pushed changes for review.
I think this is looking OK. Can you add some tests for fixed-width and scalable types? Some of the fixed-width sizes do not have an instruction they can use, unless they start to use SVE instructions.
Force-pushed from c14ca26 to 9c2648d.
Updated the failing tests, will add new tests.
> Updated the failing tests, will add new tests.
update_llc_test_checks.py can update the check lines, so long as the output is correct.
Force-pushed from eb4a062 to 6861dce.
Resolves llvm#137162 For cases when there isn't any `XOR` in the transformation, replace with a zero register.
Force-pushed from 6861dce to 082ab05.
Resolves #137162
For cases when there isn't any `XOR` in the transformation, replace with a zero register.
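Why a zero operand is enough can be checked with a scalar model of a single 64-bit lane. The sketch below is only illustrative: `rotr64` and `xarLane` are hypothetical helper names, and the model assumes XAR computes "XOR the two inputs, then rotate right by the immediate" per lane.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 64-bit lane right by n bits (n is taken modulo 64).
constexpr uint64_t rotr64(uint64_t x, unsigned n) {
  return (n & 63) == 0 ? x : (x >> (n & 63)) | (x << (64 - (n & 63)));
}

// Scalar model of one XAR lane: XOR the inputs, then rotate right.
constexpr uint64_t xarLane(uint64_t a, uint64_t b, unsigned n) {
  return rotr64(a ^ b, n);
}

// With a zero second operand the XOR is a no-op, so XAR is a plain rotate.
static_assert(xarLane(0x0123456789abcdefULL, 0, 8) ==
                  rotr64(0x0123456789abcdefULL, 8),
              "XAR with zero must equal a plain rotate");

// The shl/srl pair that the selector matches is the same rotate.
static_assert(((0x0123456789abcdefULL << 56) |
               (0x0123456789abcdefULL >> 8)) ==
                  xarLane(0x0123456789abcdefULL, 0, 8),
              "or(shl, srl) with amounts summing to 64 is a rotate");
```

Because `a ^ 0 == a`, any `or(shl(x, 64 - n), srl(x, n))` can be expressed as a XAR of `x` with zero and immediate `n`, which is the substitution the description refers to.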