koachan
Contributor

@koachan koachan commented Aug 28, 2025

Pad argument registers to preserve overaligned structs in LLVM IR.
Additionally, since i128 values will be lowered as split i64 pairs in the backend, correctly set the alignment of such arguments as 16 bytes.

This should make clang compliant with the ABI specification and fix #144709.
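
A minimal sketch of what the backend half of this change does to slot assignment, not the actual LLVM code. `ArgArea`, `I128Slots`, and `placeI128After` are hypothetical names, and the sketch assumes that only the first half of a split i128 asks for 16-byte alignment while the second half keeps 8-byte alignment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the calling-convention state: it only tracks how
// many bytes of the argument area have been handed out so far.
struct ArgArea {
  uint64_t NextOffset = 0;

  // Round the running offset up to Alignment (a power of two), reserve Size
  // bytes there, and return the offset at which the value starts.
  uint64_t allocate(uint64_t Size, uint64_t Alignment) {
    assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0);
    uint64_t Start = (NextOffset + Alignment - 1) & ~(Alignment - 1);
    NextOffset = Start + Size;
    return Start;
  }
};

// 8-byte slot indices taken by the two halves of an i128 argument.
struct I128Slots {
  uint64_t First;
  uint64_t Second;
};

// Place an i128 after LeadingI64s plain i64 arguments. FirstHalfAlign is the
// alignment requested for the first half: 8 models the old behaviour, 16 the
// new one. The second half is assumed to keep 8-byte alignment.
I128Slots placeI128After(unsigned LeadingI64s, uint64_t FirstHalfAlign) {
  ArgArea Area;
  for (unsigned I = 0; I < LeadingI64s; ++I)
    Area.allocate(8, 8);
  uint64_t First = Area.allocate(8, FirstHalfAlign);
  uint64_t Second = Area.allocate(8, 8);
  return {First / 8, Second / 8};
}
```

With one leading i64 and 16-byte alignment, the halves land in slots 2 and 3 and slot 1 is left as padding, which is consistent with the `%o2`/`%o3` moves checked by `test_i128_args` in the diff.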

This coerces 16-byte C structs that are 16-byte aligned as an i128 in LLVM IR.
Additionally, since i128 values will be lowered as split i64 pairs in the backend,
correctly set the alignment of such arguments as 16 bytes.

This should make it compliant with the ABI specification and fix llvm#144709.
@llvmbot llvmbot added the clang, backend:Sparc, and clang:codegen labels Aug 28, 2025
@koachan
Contributor Author

koachan commented Aug 28, 2025

@beetrees does this help?

@llvmbot
Member

llvmbot commented Aug 28, 2025

@llvm/pr-subscribers-backend-sparc

@llvm/pr-subscribers-clang-codegen

Author: Koakuma (koachan)

Changes

This coerces 9 to 16-byte C structs that are 16-byte aligned as an i128 in LLVM IR. Additionally, since i128 values will be lowered as split i64 pairs in the backend, correctly set the alignment of such arguments as 16 bytes.

This should make clang compliant with the ABI specification and fix #144709.


Full diff: https://github.com/llvm/llvm-project/pull/155829.diff

4 Files Affected:

  • (modified) clang/lib/CodeGen/Targets/Sparc.cpp (+8-3)
  • (modified) clang/test/CodeGen/sparcv9-abi.c (+9)
  • (modified) llvm/lib/Target/Sparc/SparcISelLowering.cpp (+2-1)
  • (modified) llvm/test/CodeGen/SPARC/64abi.ll (+14-1)
diff --git a/clang/lib/CodeGen/Targets/Sparc.cpp b/clang/lib/CodeGen/Targets/Sparc.cpp
index 5f3c15d106eb6..b6c8fdcfe29b6 100644
--- a/clang/lib/CodeGen/Targets/Sparc.cpp
+++ b/clang/lib/CodeGen/Targets/Sparc.cpp
@@ -228,6 +228,7 @@ SparcV9ABIInfo::classifyType(QualType Ty, unsigned SizeLimit) const {
     return ABIArgInfo::getIgnore();
 
   uint64_t Size = getContext().getTypeSize(Ty);
+  unsigned Alignment = getContext().getTypeAlign(Ty);
 
   // Anything too big to fit in registers is passed with an explicit indirect
   // pointer / sret pointer.
@@ -275,10 +276,14 @@ SparcV9ABIInfo::classifyType(QualType Ty, unsigned SizeLimit) const {
   // Try to use the original type for coercion.
   llvm::Type *CoerceTy = CB.isUsableType(StrTy) ? StrTy : CB.getType();
 
+  // We use a pair of i64 for 9-16 byte aggregate with 8 byte alignment.
+  // For 9-16 byte aggregates with 16 byte alignment, we use i128.
+  llvm::Type *WideTy = llvm::Type::getIntNTy(getVMContext(), 128);
+  bool UseI128 = (Size > 64) && (Size <= 128) && (Alignment == 128);
+
   if (CB.InReg)
-    return ABIArgInfo::getDirectInReg(CoerceTy);
-  else
-    return ABIArgInfo::getDirect(CoerceTy);
+    return ABIArgInfo::getDirectInReg(UseI128 ? WideTy : CoerceTy);
+  return ABIArgInfo::getDirect(UseI128 ? WideTy : CoerceTy);
 }
 
 RValue SparcV9ABIInfo::EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,
diff --git a/clang/test/CodeGen/sparcv9-abi.c b/clang/test/CodeGen/sparcv9-abi.c
index 5a3d64fd37889..9819d07425274 100644
--- a/clang/test/CodeGen/sparcv9-abi.c
+++ b/clang/test/CodeGen/sparcv9-abi.c
@@ -25,12 +25,21 @@ long double f_ld(long double x) { return x; }
 struct empty {};
 struct emptyarr { struct empty a[10]; };
 
+// 16-byte structs with 16-byte alignment gets passed as if i128.
+struct align16 { _Alignas(16) int x; };
+
 // CHECK-LABEL: define{{.*}} i64 @f_empty(i64 %x.coerce)
 struct empty f_empty(struct empty x) { return x; }
 
 // CHECK-LABEL: define{{.*}} i64 @f_emptyarr(i64 %x.coerce)
 struct empty f_emptyarr(struct emptyarr x) { return x.a[0]; }
 
+// CHECK-LABEL: define{{.*}} void @f_aligncaller(i128 %a.coerce)
+void f_aligncallee(int pad, struct align16 a);
+void f_aligncaller(struct align16 a) {
+    f_aligncallee(0, a);
+}
+
 // CHECK-LABEL: define{{.*}} i64 @f_emptyvar(i32 noundef zeroext %count, ...)
 long f_emptyvar(unsigned count, ...) {
     long ret;
diff --git a/llvm/lib/Target/Sparc/SparcISelLowering.cpp b/llvm/lib/Target/Sparc/SparcISelLowering.cpp
index d01218f573dc2..e5ed9d267afed 100644
--- a/llvm/lib/Target/Sparc/SparcISelLowering.cpp
+++ b/llvm/lib/Target/Sparc/SparcISelLowering.cpp
@@ -115,7 +115,8 @@ static bool Analyze_CC_Sparc64_Full(bool IsReturn, unsigned &ValNo, MVT &ValVT,
 
   // Stack space is allocated for all arguments starting from [%fp+BIAS+128].
   unsigned size      = (LocVT == MVT::f128) ? 16 : 8;
-  Align alignment = (LocVT == MVT::f128) ? Align(16) : Align(8);
+  Align alignment =
+      (LocVT == MVT::f128 || ArgFlags.isSplit()) ? Align(16) : Align(8);
   unsigned Offset = State.AllocateStack(size, alignment);
   unsigned Reg = 0;
 
diff --git a/llvm/test/CodeGen/SPARC/64abi.ll b/llvm/test/CodeGen/SPARC/64abi.ll
index 6485a7f13e8d5..dc8c9af4a5185 100644
--- a/llvm/test/CodeGen/SPARC/64abi.ll
+++ b/llvm/test/CodeGen/SPARC/64abi.ll
@@ -473,8 +473,8 @@ declare i64 @receive_fp128(i64 %a, ...)
 ; HARD-DAG:   ldx [%sp+[[Offset0]]], %o2
 ; HARD-DAG:   ldx [%sp+[[Offset1]]], %o3
 ; SOFT-DAG:   mov  %i0, %o0
-; SOFT-DAG:   mov  %i1, %o1
 ; SOFT-DAG:   mov  %i2, %o2
+; SOFT-DAG:   mov  %i3, %o3
 ; CHECK:      call receive_fp128
 define i64 @test_fp128_variable_args(i64 %a, fp128 %b) {
 entry:
@@ -482,6 +482,19 @@ entry:
   ret i64 %0
 }
 
+declare i64 @receive_i128(i64 %a, i128 %b)
+
+; CHECK-LABEL: test_i128_args:
+; CHECK:   mov  %i3, %o3
+; CHECK:   mov  %i2, %o2
+; CHECK:   mov  %i0, %o0
+; CHECK:   call receive_i128
+define i64 @test_i128_args(i64 %a, i128 %b) {
+entry:
+  %0 = call i64 @receive_i128(i64 %a, i128 %b)
+  ret i64 %0
+}
+
 ; CHECK-LABEL: test_call_libfunc:
 ; HARD:   st %f1, [%fp+[[Offset0:[0-9]+]]]
 ; HARD:   fmovs %f3, %f1

@llvmbot
Member

llvmbot commented Aug 28, 2025

@llvm/pr-subscribers-clang

// We use a pair of i64 for 9-16 byte aggregate with 8 byte alignment.
// For 9-16 byte aggregates with 16 byte alignment, we use i128.
llvm::Type *WideTy = llvm::Type::getIntNTy(getVMContext(), 128);
bool UseI128 = (Size > 64) && (Size <= 128) && (Alignment == 128);
Collaborator

What happens if the size is bigger than 16? Not sure how much it matters, since SizeLimit is 128 for arguments, but we should have some test coverage for return types.

Contributor Author

@koachan koachan Aug 29, 2025

Structs 17-32 bytes are passed indirectly, but returned in registers; structs over 32 bytes are also returned indirectly.

"Structures or unions larger than sixteen bytes are copied by the caller and passed indirectly; the caller will pass the address of a correctly aligned structure value."

"Structure and union return types up to thirty-two bytes in size are returned in registers. [...] For types with a larger size the caller allocates an area large enough and aligned properly to hold the return value, and passes a pointer to that area as an implicit first argument (of type pointer-to-data) to the callee."

I think for that case it's already tested in sparcv9-abi.c as the struct medium and struct large cases?
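
The size rules quoted above can be sketched as a pair of threshold checks. This is an illustration only: `Passing`, `classifyBySize`, `classifyArgument`, and `classifyReturn` are hypothetical names, and the limits are the `16 * 8` and `32 * 8` bit values passed to `classifyType` elsewhere in this review.

```cpp
#include <cassert>
#include <cstdint>

enum class Passing { InRegisters, Indirect };

// Limits in bits: 16 bytes for arguments, 32 bytes for return values.
constexpr uint64_t ArgSizeLimitBits = 16 * 8;
constexpr uint64_t RetSizeLimitBits = 32 * 8;

// Anything larger than the limit goes through memory (hypothetical helper).
Passing classifyBySize(uint64_t SizeInBits, uint64_t SizeLimitInBits) {
  assert(SizeLimitInBits % 8 == 0 && "limits are whole bytes");
  return SizeInBits <= SizeLimitInBits ? Passing::InRegisters
                                       : Passing::Indirect;
}

Passing classifyArgument(uint64_t SizeInBytes) {
  return classifyBySize(SizeInBytes * 8, ArgSizeLimitBits);
}

Passing classifyReturn(uint64_t SizeInBytes) {
  return classifyBySize(SizeInBytes * 8, RetSizeLimitBits);
}
```

Under these rules a 24-byte struct is passed indirectly but returned in registers, which is the asymmetric middle case discussed here.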

Collaborator

Please also add coverage for a "medium" struct with 128-bit alignment.

Contributor Author

@koachan koachan Sep 4, 2025

Done, as medium_aligned~

Collaborator

@efriedma-quic efriedma-quic left a comment

LGTM

@beetrees
Contributor

Thanks for working on this. This PR correctly fixes the example given in #144709. However, this PR currently fails to correctly pass overaligned structs containing floating point fields. Consider the following example:

struct Struct {
	_Alignas(16) long x;
	double y;
};

double f(long a, struct Struct b) {
	return b.y;
}

GCC will correctly pass b.y in d6, whereas this PR currently incorrectly passes b.y in o3.

This PR also breaks the ABI of long double when in a struct:

struct Struct {
	long double x;
};

long double f(long a, struct Struct b) {
	return b.x;
}

GCC and current Clang will pass b.x in q4, whereas this PR currently incorrectly passes b.x in o2 and o3.
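
The slot arithmetic behind the two examples above can be sketched roughly as follows. This is a simplified model, not the clang implementation: `Field`, `Arg`, and `assignRegisters` are hypothetical names, registers are named from the caller's side, and the sketch assumes everything fits in the first six 8-byte slots, so stack overflow and padding inside a struct are not modelled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Kinds of struct members the sketch understands (hypothetical).
enum class Field { Int64, Double, LongDouble };

// One argument: its alignment in bytes and its members in declaration order.
struct Arg {
  unsigned AlignBytes;
  std::vector<Field> Fields;
};

// Walk the 8-byte argument slots and name the register each member lands in.
// Slot N maps to %oN for integers, %d(2N) for doubles, and %q(2N) for a long
// double that starts in slot N.
std::vector<std::string> assignRegisters(const std::vector<Arg> &Args) {
  std::vector<std::string> Regs;
  unsigned Slot = 0;
  for (const Arg &A : Args) {
    assert(A.AlignBytes == 8 || A.AlignBytes == 16);
    // A 16-byte-aligned argument must start in an even slot; skip one slot
    // as padding when the next free slot is odd.
    if (A.AlignBytes == 16 && Slot % 2 != 0)
      ++Slot;
    for (Field F : A.Fields) {
      switch (F) {
      case Field::Int64:
        Regs.push_back("%o" + std::to_string(Slot));
        Slot += 1;
        break;
      case Field::Double:
        Regs.push_back("%d" + std::to_string(2 * Slot));
        Slot += 1;
        break;
      case Field::LongDouble:
        Regs.push_back("%q" + std::to_string(2 * Slot));
        Slot += 2;
        break;
      }
      assert(Slot <= 6 && "sketch only covers the register slots");
    }
  }
  return Regs;
}
```

For `f(long a, struct Struct b)` with the overaligned `long`/`double` struct, this model skips slot 1 and places `b.y` in slot 3, that is `%d6`; for the `long double` struct it places `b.x` in `%q4`. Both agree with the GCC behaviour described above.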

@koachan
Contributor Author

koachan commented Sep 11, 2025

Hmmm yeah, looks like the PR mistakenly passes FPs as integers.
The ABI tests seem to be rather lacking too in testing struct with FP members...

@koachan
Contributor Author

koachan commented Sep 14, 2025

Okay, revised the implementation. Now it should properly do things @beetrees @efriedma-quic


void SparcV9ABIInfo::computeInfo(CGFunctionInfo &FI) const {
FI.getReturnInfo() = classifyType(FI.getReturnType(), 32 * 8);
unsigned RetOffset = 0;
Collaborator

Do ArgOffset and RetOffset have to be connected somehow? Returns can be lowered to an argument, which takes a register. Not sure if that register is an argument register in the SPARC calling convention; if it isn't, this is fine, I guess.

Are the padding rules the same on the stack as they are in registers, for functions that take many arguments?

Contributor Author

Oof yeah, if a return is converted to a pointer argument it'll take an argument register...
So in that specific case the argument and return offsets should be connected, yes.

As for the latter, stack arguments have the same padding (or rather, alignment) requirements as they do in registers; there has to be a 1:1 correspondence between registers and stack memory locations.
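
A small sketch of that 1:1 correspondence, under stated assumptions: the argument area starts at `%fp+BIAS+128` as the backend comment in the diff says, the SPARC V9 stack bias is 2047, and `%fp+BIAS` itself is 16-byte aligned. `slotHomeOffset` and `firstSlotFor` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

constexpr int64_t StackBias = 2047;   // SPARC V9 stack bias
constexpr int64_t ArgAreaStart = 128; // from "[%fp+BIAS+128]" in the diff

// Offset from %fp of the memory location that mirrors argument slot Slot.
int64_t slotHomeOffset(unsigned Slot) {
  return StackBias + ArgAreaStart + 8 * static_cast<int64_t>(Slot);
}

// First slot at or after NextFreeSlot whose memory location satisfies
// AlignBytes. Because slots and stack locations correspond 1:1, the same
// padding applies whether the value ends up in a register or on the stack.
unsigned firstSlotFor(unsigned NextFreeSlot, unsigned AlignBytes) {
  assert(AlignBytes == 8 || AlignBytes == 16);
  unsigned Slot = NextFreeSlot;
  while ((slotHomeOffset(Slot) - StackBias) % AlignBytes != 0)
    ++Slot;
  return Slot;
}
```

In this model every even slot is 16-byte aligned, so a 16-byte-aligned argument that would otherwise start in slot 1 starts in slot 2 instead.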

                                   QualType Ty, AggValueSlot Slot) const {
-  ABIArgInfo AI = classifyType(Ty, 16 * 8);
+  unsigned ArgOffset = 0;
+  ABIArgInfo AI = classifyType(Ty, 16 * 8, ArgOffset);
Collaborator

Do we need to align the address of the va_arg pointer?

Contributor Author

Yes, though pointers are 64 bits wide and we're already allocating argument registers in 64-bit chunks anyway, so it should be aligned properly already.

If what you mean is the contents of the va_list itself, then yes it needs to be aligned too, but I think that's the responsibility of the stack allocator instead of here?

Collaborator

The part I'm questioning is, do we need to skip over padding for arguments with 128-bit alignment?

The list as a whole should have 128-bit alignment, sure.

Contributor Author

Ahhh, yeah, we do need to put alignment paddings too.
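
The padding being asked for amounts to rounding the `va_list` cursor up before reading an overaligned argument. A minimal sketch on plain integers, assuming values are stored directly in the list (indirect passing and the placement of sub-slot values are not modelled); `roundUp`, `VaArgStep`, and `stepVaArg` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Round Addr up to the next multiple of Align (a power of two). This is the
// add-then-mask pattern that a byte offset plus a pointer mask expresses.
uint64_t roundUp(uint64_t Addr, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0);
  return (Addr + Align - 1) & ~(Align - 1);
}

// Result of consuming one argument from the list.
struct VaArgStep {
  uint64_t ValueAddr;  // where the argument is read from
  uint64_t NextCursor; // cursor to store back into the va_list
};

// Skip alignment padding, then advance past the argument in whole slots.
VaArgStep stepVaArg(uint64_t Cursor, uint64_t SizeBytes, uint64_t AlignBytes) {
  uint64_t SlotAlign = AlignBytes < 8 ? 8 : AlignBytes;
  uint64_t ValueAddr = roundUp(Cursor, SlotAlign);
  uint64_t NextCursor = ValueAddr + roundUp(SizeBytes, 8);
  return {ValueAddr, NextCursor};
}
```

With the cursor at an address ending in 8, a 16-byte-aligned argument is read from the next 16-byte boundary, while an ordinary 8-byte argument is read in place.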

@koachan koachan changed the title from "[clang][SPARC] Pass 16-aligned 16-byte structs as i128 in CC" to "[clang][SPARC] Pass 16-aligned structs with the correct alignment in CC" Sep 17, 2025
// CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
// CHECK-DAG: %[[NXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
// CHECK-DAG: store ptr %[[NXT]], ptr %ap
// CHECK-DAG: %[[ADR:[^ ]+]] = load ptr, ptr %[[CUR]]
Collaborator

Please explicitly check for the llvm.ptrmask call (which I assume is getting generated?)

Contributor Author

@koachan koachan Sep 23, 2025

Oof, thanks for noticing that, I think I made a mistake in writing the test case.
In any case, should be fixed now~

Collaborator

@efriedma-quic efriedma-quic left a comment

LGTM

@brad0
Contributor

brad0 commented Sep 26, 2025

@koachan Can you get this in?

@koachan koachan merged commit 6679e43 into llvm:main Sep 26, 2025
9 checks passed
YixingZhang007 pushed a commit to YixingZhang007/llvm-project that referenced this pull request Sep 27, 2025
…CC (llvm#155829)
Successfully merging this pull request may close these issues.

Clang: 64-bit SPARC doesn't align struct arguments as required by ABI