[clang][SPARC] Pass 16-aligned structs with the correct alignment in CC #155829
This coerces 16-byte C structs that are 16-byte aligned to i128 in LLVM IR. Additionally, since i128 values are lowered as split i64 pairs in the backend, it sets the alignment of such arguments to 16 bytes. This should make clang compliant with the ABI specification and fix llvm#144709.
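To make the size and alignment condition concrete, here is a minimal C sketch built around the `align16` struct that this PR adds to sparcv9-abi.c. The helper `is_overaligned_9_to_16` is hypothetical and only restates the condition in bytes; it is not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the struct added to sparcv9-abi.c in this PR: a 4-byte
   int forced to 16-byte alignment, which pads the struct to 16 bytes. */
struct align16 { _Alignas(16) int x; };

_Static_assert(sizeof(struct align16) == 16, "padded to 16 bytes");
_Static_assert(_Alignof(struct align16) == 16, "16-byte aligned");

/* Hypothetical helper, not part of the patch: the 9 to 16 byte,
   16-byte aligned condition, expressed in bytes. */
static int is_overaligned_9_to_16(size_t size, size_t align) {
    /* Alignments are always powers of two. */
    assert(align != 0 && (align & (align - 1)) == 0);
    return size > 8 && size <= 16 && align == 16;
}
```

Because a 16-byte aligned struct always has a size that is a multiple of 16, the only size in the 9 to 16 byte range that can satisfy this check is exactly 16.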
@beetrees does this help?
@llvm/pr-subscribers-backend-sparc @llvm/pr-subscribers-clang-codegen
Author: Koakuma (koachan)
Changes
This coerces 9 to 16-byte C structs that are 16-byte aligned as an i128 in LLVM IR. Additionally, since i128 values will be lowered as split i64 pairs in the backend, correctly set the alignment of such arguments as 16 bytes. This should make clang compliant with the ABI specification and fix #144709.
Full diff: https://github.com/llvm/llvm-project/pull/155829.diff
4 Files Affected:
diff --git a/clang/lib/CodeGen/Targets/Sparc.cpp b/clang/lib/CodeGen/Targets/Sparc.cpp
index 5f3c15d106eb6..b6c8fdcfe29b6 100644
--- a/clang/lib/CodeGen/Targets/Sparc.cpp
+++ b/clang/lib/CodeGen/Targets/Sparc.cpp
@@ -228,6 +228,7 @@ SparcV9ABIInfo::classifyType(QualType Ty, unsigned SizeLimit) const {
return ABIArgInfo::getIgnore();
uint64_t Size = getContext().getTypeSize(Ty);
+ unsigned Alignment = getContext().getTypeAlign(Ty);
// Anything too big to fit in registers is passed with an explicit indirect
// pointer / sret pointer.
@@ -275,10 +276,14 @@ SparcV9ABIInfo::classifyType(QualType Ty, unsigned SizeLimit) const {
// Try to use the original type for coercion.
llvm::Type *CoerceTy = CB.isUsableType(StrTy) ? StrTy : CB.getType();
+ // We use a pair of i64 for 9-16 byte aggregate with 8 byte alignment.
+ // For 9-16 byte aggregates with 16 byte alignment, we use i128.
+ llvm::Type *WideTy = llvm::Type::getIntNTy(getVMContext(), 128);
+ bool UseI128 = (Size > 64) && (Size <= 128) && (Alignment == 128);
+
if (CB.InReg)
- return ABIArgInfo::getDirectInReg(CoerceTy);
- else
- return ABIArgInfo::getDirect(CoerceTy);
+ return ABIArgInfo::getDirectInReg(UseI128 ? WideTy : CoerceTy);
+ return ABIArgInfo::getDirect(UseI128 ? WideTy : CoerceTy);
}
RValue SparcV9ABIInfo::EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,
diff --git a/clang/test/CodeGen/sparcv9-abi.c b/clang/test/CodeGen/sparcv9-abi.c
index 5a3d64fd37889..9819d07425274 100644
--- a/clang/test/CodeGen/sparcv9-abi.c
+++ b/clang/test/CodeGen/sparcv9-abi.c
@@ -25,12 +25,21 @@ long double f_ld(long double x) { return x; }
struct empty {};
struct emptyarr { struct empty a[10]; };
+// 16-byte structs with 16-byte alignment gets passed as if i128.
+struct align16 { _Alignas(16) int x; };
+
// CHECK-LABEL: define{{.*}} i64 @f_empty(i64 %x.coerce)
struct empty f_empty(struct empty x) { return x; }
// CHECK-LABEL: define{{.*}} i64 @f_emptyarr(i64 %x.coerce)
struct empty f_emptyarr(struct emptyarr x) { return x.a[0]; }
+// CHECK-LABEL: define{{.*}} void @f_aligncaller(i128 %a.coerce)
+void f_aligncallee(int pad, struct align16 a);
+void f_aligncaller(struct align16 a) {
+ f_aligncallee(0, a);
+}
+
// CHECK-LABEL: define{{.*}} i64 @f_emptyvar(i32 noundef zeroext %count, ...)
long f_emptyvar(unsigned count, ...) {
long ret;
diff --git a/llvm/lib/Target/Sparc/SparcISelLowering.cpp b/llvm/lib/Target/Sparc/SparcISelLowering.cpp
index d01218f573dc2..e5ed9d267afed 100644
--- a/llvm/lib/Target/Sparc/SparcISelLowering.cpp
+++ b/llvm/lib/Target/Sparc/SparcISelLowering.cpp
@@ -115,7 +115,8 @@ static bool Analyze_CC_Sparc64_Full(bool IsReturn, unsigned &ValNo, MVT &ValVT,
// Stack space is allocated for all arguments starting from [%fp+BIAS+128].
unsigned size = (LocVT == MVT::f128) ? 16 : 8;
- Align alignment = (LocVT == MVT::f128) ? Align(16) : Align(8);
+ Align alignment =
+ (LocVT == MVT::f128 || ArgFlags.isSplit()) ? Align(16) : Align(8);
unsigned Offset = State.AllocateStack(size, alignment);
unsigned Reg = 0;
diff --git a/llvm/test/CodeGen/SPARC/64abi.ll b/llvm/test/CodeGen/SPARC/64abi.ll
index 6485a7f13e8d5..dc8c9af4a5185 100644
--- a/llvm/test/CodeGen/SPARC/64abi.ll
+++ b/llvm/test/CodeGen/SPARC/64abi.ll
@@ -473,8 +473,8 @@ declare i64 @receive_fp128(i64 %a, ...)
; HARD-DAG: ldx [%sp+[[Offset0]]], %o2
; HARD-DAG: ldx [%sp+[[Offset1]]], %o3
; SOFT-DAG: mov %i0, %o0
-; SOFT-DAG: mov %i1, %o1
; SOFT-DAG: mov %i2, %o2
+; SOFT-DAG: mov %i3, %o3
; CHECK: call receive_fp128
define i64 @test_fp128_variable_args(i64 %a, fp128 %b) {
entry:
@@ -482,6 +482,19 @@ entry:
ret i64 %0
}
+declare i64 @receive_i128(i64 %a, i128 %b)
+
+; CHECK-LABEL: test_i128_args:
+; CHECK: mov %i3, %o3
+; CHECK: mov %i2, %o2
+; CHECK: mov %i0, %o0
+; CHECK: call receive_i128
+define i64 @test_i128_args(i64 %a, i128 %b) {
+entry:
+ %0 = call i64 @receive_i128(i64 %a, i128 %b)
+ ret i64 %0
+}
+
; CHECK-LABEL: test_call_libfunc:
; HARD: st %f1, [%fp+[[Offset0:[0-9]+]]]
; HARD: fmovs %f3, %f1
clang/lib/CodeGen/Targets/Sparc.cpp
// We use a pair of i64 for 9-16 byte aggregate with 8 byte alignment.
// For 9-16 byte aggregates with 16 byte alignment, we use i128.
llvm::Type *WideTy = llvm::Type::getIntNTy(getVMContext(), 128);
bool UseI128 = (Size > 64) && (Size <= 128) && (Alignment == 128);
What happens if the size is bigger than 16? Not sure how much it matters, since SizeLimit is 128 for arguments, but we should have some test coverage for return types.
Structs 17-32 bytes are passed indirectly, but returned in registers; structs over 32 bytes are also returned indirectly.
Structures or unions larger than sixteen bytes are copied by the caller and passed indirectly; the caller will pass the address of a correctly aligned structure value.
Structure and union return types up to thirty-two bytes in size are returned in registers. [...] For types with a larger size the caller allocates an area large enough and aligned properly to hold the return value, and passes a pointer to that area as an implicit first argument (of type pointer-to-data) to the callee.
I think that case is already tested in sparcv9-abi.c as the struct medium and struct large cases?
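The size thresholds quoted above can be summarized in a short C sketch. The enum and function names are hypothetical, chosen for illustration; they are not clang's `ABIArgInfo` kinds, and the sketch ignores everything except struct size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not clang's ABIArgInfo kinds. */
enum pass_kind { IN_REGISTERS, INDIRECT };

/* Arguments: up to 16 bytes in registers, larger ones are copied by
   the caller and passed by address. */
static enum pass_kind classify_struct_arg(size_t size_bytes) {
    return size_bytes <= 16 ? IN_REGISTERS : INDIRECT;
}

/* Return values: up to 32 bytes in registers, larger ones through a
   caller-allocated area passed as an implicit first argument. */
static enum pass_kind classify_struct_ret(size_t size_bytes) {
    return size_bytes <= 32 ? IN_REGISTERS : INDIRECT;
}

/* Spot checks at the boundaries quoted above. */
static void check_boundaries(void) {
    assert(classify_struct_arg(16) == IN_REGISTERS);
    assert(classify_struct_arg(17) == INDIRECT);
    assert(classify_struct_ret(32) == IN_REGISTERS);
    assert(classify_struct_ret(33) == INDIRECT);
}
```

The 17 to 32 byte range is the interesting one: such a struct is indirect as an argument but still in registers as a return value, which is why the reviewer asks for return-type coverage.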
Please also add coverage for a "medium" struct with 128-bit alignment.
Done, as medium_aligned~
LGTM
Thanks for working on this. This PR correctly fixes the example given in #144709. However, this PR currently fails to correctly pass overaligned structs containing floating point fields. Consider the following example:
struct Struct {
    _Alignas(16) long x;
    double y;
};
double f(long a, struct Struct b) {
    return b.y;
}
GCC will correctly pass
This PR also breaks the ABI of
struct Struct {
    long double x;
};
long double f(long a, struct Struct b) {
    return b.x;
}
GCC and current Clang will pass
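For the first struct in the review comment above, a small layout check shows where the floating point field sits. This is a sketch under the assumption of a host where `long` and `double` are both 8 bytes, as on 64-bit SPARC; `check_layout` is a hypothetical helper, and the sketch says nothing about which registers a given compiler picks.

```c
#include <assert.h>
#include <stddef.h>

/* First example from the review comment. */
struct Struct {
    _Alignas(16) long x;
    double y;
};

_Static_assert(_Alignof(struct Struct) == 16, "overaligned to 16 bytes");
_Static_assert(sizeof(struct Struct) == 16, "fits in two 8-byte words");

/* Hypothetical helper: runtime check of the field positions, assuming
   8-byte long and double. */
static void check_layout(void) {
    assert(offsetof(struct Struct, x) == 0); /* integer word */
    assert(offsetof(struct Struct, y) == 8); /* floating point word */
}
```

Coercing the whole struct to a single i128 erases the distinction between the integer word and the floating point word, which is the problem the comment describes.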
Hmmm yeah, looks like the PR mistakenly passes FPs as integers.
Okay, revised the implementation. Now it should properly do things @beetrees @efriedma-quic
void SparcV9ABIInfo::computeInfo(CGFunctionInfo &FI) const {
FI.getReturnInfo() = classifyType(FI.getReturnType(), 32 * 8);
unsigned RetOffset = 0;
Do ArgOffset and RetOffset have to be connected somehow? Returns can be lowered to an argument, which takes a register. Not sure if that register is an argument register in the SPARC calling convention; if it isn't, this is fine, I guess.
Are the padding rules the same on the stack as they are in registers, for functions that take many arguments?
Oof yeah, if a return is converted to a pointer argument it'll take an argument register...
So in that specific case the argument and return offsets should be connected, yes.
As for the latter, stack arguments have the same padding (or rather, alignment) requirements as they do in registers; there has to be a 1:1 correspondence between registers and stack memory locations.
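The 1:1 correspondence between registers and stack slots can be sketched as a running byte offset into the parameter array. The helper `assign_arg_offset` below is hypothetical and handles only 8-byte and 16-byte alignment; it is an illustration of the padding rule, not the code in this PR.

```c
#include <assert.h>

/* Hypothetical helper for illustration: returns the byte offset of the
   next argument within the parameter array and advances *offset.
   Each 8-byte slot corresponds to one argument register (slot 0 maps
   to %o0, slot 1 to %o1, ...) or, past the registers, to one stack
   slot. */
static unsigned assign_arg_offset(unsigned *offset, unsigned size_bytes,
                                  unsigned align_bytes) {
    assert(align_bytes == 8 || align_bytes == 16);
    /* Skip a padding slot if a 16-byte aligned argument would
       otherwise start on an odd slot. */
    unsigned start = (*offset + align_bytes - 1) & ~(align_bytes - 1);
    /* Arguments always occupy whole 8-byte slots. */
    unsigned slots = (size_bytes + 7) / 8;
    *offset = start + slots * 8;
    return start;
}
```

For a signature shaped like `test_i128_args(i64 %a, i128 %b)` from the diff, `%a` lands at offset 0 and `%b` at offset 16, leaving slot 1 unused. That matches the test's expectation that `%o1` is skipped and the i128 travels in `%o2` and `%o3`.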
clang/lib/CodeGen/Targets/Sparc.cpp
QualType Ty, AggValueSlot Slot) const {
- ABIArgInfo AI = classifyType(Ty, 16 * 8);
+ unsigned ArgOffset = 0;
+ ABIArgInfo AI = classifyType(Ty, 16 * 8, ArgOffset);
Do we need to align the address of the va_arg pointer?
Yes, though pointers are 64 bits wide and we're already allocating argument registers in 64-bit chunks anyway, so it should be aligned properly already.
If what you mean is the contents of the va_list itself, then yes it needs to be aligned too, but I think that's the responsibility of the stack allocator instead of here?
The part I'm questioning is, do we need to skip over padding for arguments with 128-bit alignment?
The list as a whole should have 128-bit alignment, sure.
Ahhh, yeah, we do need to insert alignment padding too.
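Skipping padding in `va_arg` amounts to rounding the current argument pointer up to the argument's alignment before reading from it. A minimal C sketch of that rounding follows; `align_up` is a hypothetical helper, and the masking step is the kind of operation an `llvm.ptrmask` call expresses in LLVM IR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round a va_list cursor address up to `align`
   bytes using the usual add-then-mask idiom. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align) {
    /* The mask trick only works for power-of-two alignments. */
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

With 8-byte slots, a cursor at an address such as `0x1008` is already suitably aligned for ordinary arguments but must advance to `0x1010` before a 16-byte aligned struct can be read.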
clang/test/CodeGen/sparcv9-abi.c
// CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
// CHECK-DAG: %[[NXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
// CHECK-DAG: store ptr %[[NXT]], ptr %ap
// CHECK-DAG: %[[ADR:[^ ]+]] = load ptr, ptr %[[CUR]]
Please explicitly check for the llvm.ptrmask call (which I assume is getting generated?)
Oof, thanks for noticing that, I think I made a mistake in writing the test case.
In any case, should be fixed now~
LGTM
@koachan Can you get this in?
…CC (llvm#155829) Pad argument registers to preserve overaligned structs in LLVM IR. Additionally, since i128 values will be lowered as split i64 pairs in the backend, correctly set the alignment of such arguments as 16 bytes. This should make clang compliant with the ABI specification and fix llvm#144709.