[NVPTX] Implement isTruncateFree(EVT FromVT, EVT ToVT)
#138605
Pull Request Overview
This pull request improves the target lowering for NVPTX by updating the truncation-free checks to more accurately model the hardware behavior for both LLVM IR types and EVT types.
- Modifies the integer type checks in isTruncateFree for both Type* and EVT overloads.
- Updates the bit-size conditions and removes the outdated comment regarding 64-to-32-bit truncation in SASS.
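The effect of the new conditions can be sketched with a stand-alone model that uses plain bit widths instead of LLVM's `Type *` or `EVT`. The function names `oldIsTruncateFree` and `newIsTruncateFree` are hypothetical and exist only for this illustration. Both functions assume the integer-type check has already passed.

```cpp
#include <cassert>

// Hypothetical model of the predicate before this PR:
// only the 64-bit to 32-bit truncation was reported as free.
constexpr bool oldIsTruncateFree(unsigned FromBits, unsigned ToBits) {
  return FromBits == 64 && ToBits == 32;
}

// Hypothetical model of the predicate after this PR:
// a strictly narrowing truncation is free when the destination
// width is a multiple of 32 bits.
constexpr bool newIsTruncateFree(unsigned FromBits, unsigned ToBits) {
  if (FromBits <= ToBits)
    return false; // Not a narrowing truncation.
  return ToBits % 32 == 0;
}

// Spot checks comparing the two models.
inline void checkTruncateFreeModel() {
  assert(oldIsTruncateFree(64, 32) && newIsTruncateFree(64, 32));
  assert(!oldIsTruncateFree(128, 64) && newIsTruncateFree(128, 64));
  assert(!newIsTruncateFree(32, 16)); // 16 is not a multiple of 32.
  assert(!newIsTruncateFree(32, 32)); // Same width.
  assert(!newIsTruncateFree(32, 64)); // Widening.
}
```

Under this model, pairs such as 128-to-64 and 128-to-32 are newly reported as free, while any truncation to an 8-bit or 16-bit destination is still reported as not free.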
Files not reviewed (1)
- llvm/test/CodeGen/NVPTX/i128-array.ll: Language not supported
@llvm/pr-subscribers-backend-nvptx
Author: Justin Fargnoli (justinfargnoli)
Changes
This PR also makes NFC changes to isTruncateFree(Type *SrcTy, Type *DstTy) so that it models the HW more accurately.
Full diff: https://github.com/llvm/llvm-project/pull/138605.diff
2 Files Affected:
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
index 7a8bf3bf33a94..680ff13d8f936 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
@@ -155,11 +155,19 @@ class NVPTXTargetLowering : public TargetLowering {
Instruction *I = nullptr) const override;
bool isTruncateFree(Type *SrcTy, Type *DstTy) const override {
- // Truncating 64-bit to 32-bit is free in SASS.
- if (!SrcTy->isIntegerTy() || !DstTy->isIntegerTy())
+ if (!(SrcTy->isIntegerTy() && DstTy->isIntegerTy()))
return false;
- return SrcTy->getPrimitiveSizeInBits() == 64 &&
- DstTy->getPrimitiveSizeInBits() == 32;
+ if (SrcTy->getPrimitiveSizeInBits() <= DstTy->getPrimitiveSizeInBits())
+ return false;
+ return DstTy->getPrimitiveSizeInBits() % 32 == 0;
+ }
+
+ bool isTruncateFree(EVT FromVT, EVT ToVT) const override {
+ if (!(FromVT.isScalarInteger() && ToVT.isScalarInteger()))
+ return false;
+ if (FromVT.getSizeInBits() <= ToVT.getSizeInBits())
+ return false;
+ return ToVT.getSizeInBits() % 32 == 0;
}
EVT getSetCCResultType(const DataLayout &DL, LLVMContext &Ctx,
diff --git a/llvm/test/CodeGen/NVPTX/i128-array.ll b/llvm/test/CodeGen/NVPTX/i128-array.ll
index dd6d48bd5862c..f25d451590bed 100644
--- a/llvm/test/CodeGen/NVPTX/i128-array.ll
+++ b/llvm/test/CodeGen/NVPTX/i128-array.ll
@@ -8,13 +8,13 @@ define [2 x i128] @foo(i64 %a, i32 %b) {
; CHECK-NEXT: .reg .b64 %rd<5>;
; CHECK-EMPTY:
; CHECK-NEXT: // %bb.0:
-; CHECK-NEXT: ld.param.u32 %r1, [foo_param_1];
; CHECK-NEXT: ld.param.u64 %rd1, [foo_param_0];
-; CHECK-NEXT: shr.s64 %rd2, %rd1, 63;
-; CHECK-NEXT: cvt.s64.s32 %rd3, %r1;
-; CHECK-NEXT: shr.s64 %rd4, %rd3, 63;
-; CHECK-NEXT: st.param.v2.b64 [func_retval0], {%rd1, %rd2};
-; CHECK-NEXT: st.param.v2.b64 [func_retval0+16], {%rd3, %rd4};
+; CHECK-NEXT: ld.param.s32 %rd2, [foo_param_1];
+; CHECK-NEXT: cvt.u32.u64 %r1, %rd2;
+; CHECK-NEXT: shr.s64 %rd3, %rd1, 63;
+; CHECK-NEXT: shr.s64 %rd4, %rd2, 63;
+; CHECK-NEXT: st.param.v2.b64 [func_retval0], {%rd1, %rd3};
+; CHECK-NEXT: st.param.v2.b64 [func_retval0+16], {%rd2, %rd4};
; CHECK-NEXT: ret;
%1 = sext i64 %a to i128
%2 = sext i32 %b to i128
Would it be possible to include some tests that demonstrate positive effects on code-gen as a result of this change? The only impact currently seems to be some re-ordering.
- if (!SrcTy->isIntegerTy() || !DstTy->isIntegerTy())
+ if (!(SrcTy->isIntegerTy() && DstTy->isIntegerTy()))
return false;
- return SrcTy->getPrimitiveSizeInBits() == 64 &&
- DstTy->getPrimitiveSizeInBits() == 32;
+ if (SrcTy->getPrimitiveSizeInBits() <= DstTy->getPrimitiveSizeInBits())
+ return false;
Would it be valid to call isTruncateFree if either of these conditions were not already met?
I believe so. The second condition is explicitly mentioned in
/// Targets must return false when FromTy <= ToTy.
Interesting. What about the check for isScalarInteger? If the vector element sizes meet the criteria for being free, won't the eventual expansion be free? Do we ever expect to see non-integer types?
Yeah, that's a good point. When expressed in PTX, vectors become individual registers, so contiguity is not guaranteed.
+ if (!(FromVT.isScalarInteger() && ToVT.isScalarInteger()))
+ return false;
+ if (FromVT.getSizeInBits() <= ToVT.getSizeInBits())
+ return false;
Same question as above.
@@ -155,11 +155,19 @@ class NVPTXTargetLowering : public TargetLowering {
Instruction *I = nullptr) const override;

bool isTruncateFree(Type *SrcTy, Type *DstTy) const override {
Seems like Hexagon is the only target that does it this way, but it seems simpler:
return isTruncateFree(EVT::getEVT(Ty1), EVT::getEVT(Ty2));
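The delegation suggested above can be illustrated with a stand-alone sketch. `ModelType`, `ModelVT`, `ModelVT::fromType`, and `ModelLowering` are hypothetical stand-ins for `llvm::Type`, `llvm::EVT`, `EVT::getEVT`, and the target lowering class. They carry only the fields the predicate reads, and the sketch does not check whether the two real overloads agree on every LLVM type.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::Type.
struct ModelType {
  bool IsInteger;
  unsigned Bits;
};

// Hypothetical stand-in for llvm::EVT.
struct ModelVT {
  bool IsScalarInteger;
  unsigned Bits;

  // Plays the role of EVT::getEVT(Type *) in this sketch.
  static ModelVT fromType(const ModelType &Ty) {
    return {Ty.IsInteger, Ty.Bits};
  }
};

struct ModelLowering {
  // The predicate is written once, on the value-type overload.
  bool isTruncateFree(ModelVT FromVT, ModelVT ToVT) const {
    if (!(FromVT.IsScalarInteger && ToVT.IsScalarInteger))
      return false;
    if (FromVT.Bits <= ToVT.Bits)
      return false;
    return ToVT.Bits % 32 == 0;
  }

  // The IR-type overload converts its arguments and forwards,
  // so the two overloads cannot drift apart.
  bool isTruncateFree(const ModelType &SrcTy, const ModelType &DstTy) const {
    return isTruncateFree(ModelVT::fromType(SrcTy), ModelVT::fromType(DstTy));
  }
};
```

The design choice here is a single source of truth: any later change to the width conditions only needs to be made in one overload.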
This PR also makes NFC changes to isTruncateFree(Type *SrcTy, Type *DstTy) so that it models the HW more accurately.
Fixes #114339