[AArch64] Change the coercion type of structs with pointer members. #135064


Open
davemgreen wants to merge 3 commits into main from gh-a64-coerceptr

Conversation

davemgreen
Collaborator

The aim here is to avoid a ptrtoint->inttoptr round-trip through the function
argument whilst keeping the calling convention the same. Given a struct that is
<= 128 bits in size and contains only pointers (so either 1 or 2 of them), we
coerce it to ptr or [2 x ptr] instead of the old i64 or [2 x i64]. This helps
alias analysis produce more accurate results.
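
As a rough illustration of what changes at the IR level (the struct, function, and IR spellings below are made up for this description, not taken from the patch's tests), consider a 16-byte struct whose fields are all pointers:

struct Spp { int *x, *y; };
void take(Spp s);

// Old coercion: the argument is passed as [2 x i64], so the caller emits a
// ptrtoint for each field and the callee emits a matching inttoptr:
//   call void @_Z4take3Spp([2 x i64] %coerce)
// New coercion: the argument is passed as [2 x ptr], so the pointer values flow
// through the call unchanged and alias analysis can keep tracking them:
//   call void @_Z4take3Spp([2 x ptr] %coerce)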

@llvmbot added the clang (Clang issues not falling into any other category), backend:AArch64, and clang:codegen (IR generation bugs: mangling, exceptions, etc.) labels on Apr 9, 2025
@llvmbot
Member

llvmbot commented Apr 9, 2025

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes

The aim here is to avoid a ptrtoint->inttoptr round-trip through the function
argument whilst keeping the calling convention the same. Given a struct that is
<= 128 bits in size and contains only pointers (so either 1 or 2 of them), we
coerce it to ptr or [2 x ptr] instead of the old i64 or [2 x i64]. This helps
alias analysis produce more accurate results.


Full diff: https://github.com/llvm/llvm-project/pull/135064.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/Targets/AArch64.cpp (+18)
  • (added) clang/test/CodeGen/AArch64/struct-coerce-using-ptr.cpp (+93)
  • (modified) clang/test/CodeGenCXX/trivial_abi.cpp (+5-8)
diff --git a/clang/lib/CodeGen/Targets/AArch64.cpp b/clang/lib/CodeGen/Targets/AArch64.cpp
index 073ca3cc82690..9dc5f824254eb 100644
--- a/clang/lib/CodeGen/Targets/AArch64.cpp
+++ b/clang/lib/CodeGen/Targets/AArch64.cpp
@@ -485,6 +485,24 @@ ABIArgInfo AArch64ABIInfo::classifyArgumentType(QualType Ty, bool IsVariadicFn,
     }
     Size = llvm::alignTo(Size, Alignment);
 
+    // If the Aggregate is made up of pointers, use an array of pointers for the
+    // coerced type. This prevents having to convert ptr2int->int2ptr through
+    // the call, allowing alias analysis to produce better code.
+    if (const RecordType *RT = Ty->getAs<RecordType>()) {
+      if (const RecordDecl *RD = RT->getDecl()) {
+        if (all_of(RD->fields(), [](FieldDecl *FD) {
+              return FD->getType()->isPointerOrReferenceType();
+            })) {
+          assert((Size == 64 || Size == 128) &&
+                 "Expected a 64 or 128bit struct containing pointers");
+          llvm::Type *PtrTy = llvm::PointerType::getUnqual(getVMContext());
+          if (Size == 128)
+            PtrTy = llvm::ArrayType::get(PtrTy, 2);
+          return ABIArgInfo::getDirect(PtrTy);
+        }
+      }
+    }
+
     // We use a pair of i64 for 16-byte aggregate with 8-byte alignment.
     // For aggregates with 16-byte alignment, we use i128.
     llvm::Type *BaseTy = llvm::Type::getIntNTy(getVMContext(), Alignment);
diff --git a/clang/test/CodeGen/AArch64/struct-coerce-using-ptr.cpp b/clang/test/CodeGen/AArch64/struct-coerce-using-ptr.cpp
new file mode 100644
index 0000000000000..c2d68ae7ef6cd
--- /dev/null
+++ b/clang/test/CodeGen/AArch64/struct-coerce-using-ptr.cpp
@@ -0,0 +1,93 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple aarch64-none-elf -fcxx-exceptions -fexceptions -emit-llvm -o - %s | FileCheck %s
+
+struct Sp {
+    int *x;
+};
+// CHECK-LABEL: define dso_local void @_Z2Tp2Sp(
+// CHECK-SAME: ptr [[S_COERCE:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S:%.*]] = alloca [[STRUCT_SP:%.*]], align 8
+// CHECK-NEXT:    [[COERCE_DIVE:%.*]] = getelementptr inbounds nuw [[STRUCT_SP]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    store ptr [[S_COERCE]], ptr [[COERCE_DIVE]], align 8
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SP]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 8
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Tp(Sp s) { *s.x = 1; }
+
+struct Spp {
+    int *x, *y;
+};
+// CHECK-LABEL: define dso_local void @_Z3Tpp3Spp(
+// CHECK-SAME: [2 x ptr] [[S_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S:%.*]] = alloca [[STRUCT_SPP:%.*]], align 8
+// CHECK-NEXT:    store [2 x ptr] [[S_COERCE]], ptr [[S]], align 8
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SPP]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 8
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Tpp(Spp s) { *s.x = 1; }
+
+struct Sppp {
+    int *x, *y, *z;
+};
+// CHECK-LABEL: define dso_local void @_Z4Tppp4Sppp(
+// CHECK-SAME: ptr noundef [[S:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S_INDIRECT_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[S]], ptr [[S_INDIRECT_ADDR]], align 8
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SPPP:%.*]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 8
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Tppp(Sppp s) { *s.x = 1; }
+
+struct Spi {
+    int *x, y;
+};
+// CHECK-LABEL: define dso_local void @_Z3Tpi3Spi(
+// CHECK-SAME: [2 x i64] [[S_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S:%.*]] = alloca [[STRUCT_SPI:%.*]], align 8
+// CHECK-NEXT:    store [2 x i64] [[S_COERCE]], ptr [[S]], align 8
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SPI]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 8
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Tpi(Spi s) { *s.x = 1; }
+
+struct Srp {
+    int &x, *y;
+};
+// CHECK-LABEL: define dso_local void @_Z3Trp3Srp(
+// CHECK-SAME: [2 x ptr] [[S_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S:%.*]] = alloca [[STRUCT_SRP:%.*]], align 8
+// CHECK-NEXT:    store [2 x ptr] [[S_COERCE]], ptr [[S]], align 8
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SRP]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 8
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Trp(Srp s) { s.x = 1; }
+
+struct __attribute__((__packed__)) Spp_packed {
+    int *x, *y;
+};
+// CHECK-LABEL: define dso_local void @_Z10Tpp_packed10Spp_packed(
+// CHECK-SAME: [2 x ptr] [[S_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S:%.*]] = alloca [[STRUCT_SPP_PACKED:%.*]], align 1
+// CHECK-NEXT:    store [2 x ptr] [[S_COERCE]], ptr [[S]], align 1
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SPP_PACKED]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 1
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Tpp_packed(Spp_packed s) { *s.x = 1; }
diff --git a/clang/test/CodeGenCXX/trivial_abi.cpp b/clang/test/CodeGenCXX/trivial_abi.cpp
index 90054dbf37ae3..b8cc0d1cc6528 100644
--- a/clang/test/CodeGenCXX/trivial_abi.cpp
+++ b/clang/test/CodeGenCXX/trivial_abi.cpp
@@ -68,11 +68,10 @@ struct D0 : B0, B1 {
 
 Small D0::m0() { return {}; }
 
-// CHECK: define{{.*}} void @_Z14testParamSmall5Small(i64 %[[A_COERCE:.*]])
+// CHECK: define{{.*}} void @_Z14testParamSmall5Small(ptr %[[A_COERCE:.*]])
 // CHECK: %[[A:.*]] = alloca %[[STRUCT_SMALL]], align 8
 // CHECK: %[[COERCE_DIVE:.*]] = getelementptr inbounds nuw %[[STRUCT_SMALL]], ptr %[[A]], i32 0, i32 0
-// CHECK: %[[COERCE_VAL_IP:.*]] = inttoptr i64 %[[A_COERCE]] to ptr
-// CHECK: store ptr %[[COERCE_VAL_IP]], ptr %[[COERCE_DIVE]], align 8
+// CHECK: store ptr %[[A_COERCE]], ptr %[[COERCE_DIVE]], align 8
 // CHECK: %[[CALL:.*]] = call noundef ptr @_ZN5SmallD1Ev(ptr {{[^,]*}} %[[A]])
 // CHECK: ret void
 // CHECK: }
@@ -101,8 +100,7 @@ Small testReturnSmall() {
 // CHECK: %[[CALL1:.*]] = call noundef ptr @_ZN5SmallC1ERKS_(ptr {{[^,]*}} %[[AGG_TMP]], ptr noundef nonnull align 8 dereferenceable(8) %[[T]])
 // CHECK: %[[COERCE_DIVE:.*]] = getelementptr inbounds nuw %[[STRUCT_SMALL]], ptr %[[AGG_TMP]], i32 0, i32 0
 // CHECK: %[[V0:.*]] = load ptr, ptr %[[COERCE_DIVE]], align 8
-// CHECK: %[[COERCE_VAL_PI:.*]] = ptrtoint ptr %[[V0]] to i64
-// CHECK: call void @_Z14testParamSmall5Small(i64 %[[COERCE_VAL_PI]])
+// CHECK: call void @_Z14testParamSmall5Small(ptr %[[V0]])
 // CHECK: %[[CALL2:.*]] = call noundef ptr @_ZN5SmallD1Ev(ptr {{[^,]*}} %[[T]])
 // CHECK: ret void
 // CHECK: }
@@ -120,8 +118,7 @@ void testCallSmall0() {
 // CHECK: store ptr %[[COERCE_VAL_IP]], ptr %[[COERCE_DIVE]], align 8
 // CHECK: %[[COERCE_DIVE1:.*]] = getelementptr inbounds nuw %[[STRUCT_SMALL]], ptr %[[AGG_TMP]], i32 0, i32 0
 // CHECK: %[[V0:.*]] = load ptr, ptr %[[COERCE_DIVE1]], align 8
-// CHECK: %[[COERCE_VAL_PI:.*]] = ptrtoint ptr %[[V0]] to i64
-// CHECK: call void @_Z14testParamSmall5Small(i64 %[[COERCE_VAL_PI]])
+// CHECK: call void @_Z14testParamSmall5Small(ptr %[[V0]])
 // CHECK: ret void
 // CHECK: }
 
@@ -226,7 +223,7 @@ NonTrivial testReturnHasNonTrivial() {
 // CHECK: call noundef ptr @_ZN5SmallC1Ev(ptr {{[^,]*}} %[[AGG_TMP]])
 // CHECK: invoke noundef ptr @_ZN5SmallC1Ev(ptr {{[^,]*}} %[[AGG_TMP1]])
 
-// CHECK: call void @_Z20calleeExceptionSmall5SmallS_(i64 %{{.*}}, i64 %{{.*}})
+// CHECK: call void @_Z20calleeExceptionSmall5SmallS_(ptr %{{.*}}, ptr %{{.*}})
 // CHECK-NEXT: ret void
 
 // CHECK: landingpad { ptr, i32 }

@llvmbot
Member

llvmbot commented Apr 9, 2025

@llvm/pr-subscribers-clang

Collaborator

@efriedma-quic left a comment


The general idea here makes sense.

@davemgreen force-pushed the gh-a64-coerceptr branch 3 times, most recently from 84e87d5 to 9f23881 on April 29, 2025 08:39
@davemgreen
Collaborator Author

Rebase and ping - thanks.

return false;
const RecordDecl *RD = RT->getDecl();
if (const CXXRecordDecl *CXXRD = dyn_cast<CXXRecordDecl>(RD)) {
for (const auto &I : CXXRD->bases())
Collaborator


I'm a little uncomfortable with making a clone of isHomogeneousAggregate. I don't see any issues with this right now, but the invariants here are pretty subtle, and even if I'm not missing anything right now, something could change.

Not sure what a better solution looks like, though; the current patch might be good enough.


Refactoring isHomogeneousAggregate itself to be reusable might work, although it's sort of complicated, and it has limitations.

The x86 backend has similar code to do type inference. But it's currently known to be slightly broken; see #76017.

Maybe we want some sort of generic "lowering" function that takes a record and marks each field as either padding or a suitable primitive type; then targets could process that? Seems a little complicated to design, though.
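
To make the shape of that idea concrete, here is a hypothetical sketch; none of these names exist in Clang today, they only illustrate what such an interface might look like:

#include <cstdint>
#include <vector>

// Describe a record as a flat sequence of slots, each marked as padding or as a
// primitive kind; each target would then turn this summary into a coerced type.
enum class SlotKind { Padding, Int, Ptr, Float };

struct LoweredSlot {
  SlotKind Kind;
  uint64_t SizeInBits;
};

// For example {Ptr, Ptr} could become [2 x ptr], {Int, Ptr} [2 x i64], and four
// Float slots a homogeneous floating-point aggregate.
using LoweredRecord = std::vector<LoweredSlot>;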

Collaborator Author


I'm not really an expert in clang, but I was hoping this would be simpler. I've tried to rewrite the check to make it clearer that it just replaces i64 with ptr where we know the structure contains only pointers. AFAIU, even if we get it 'wrong' the ABI remains the same (weirdness around ptrtoint/inttoptr aside). But the idea is that this is limited to pointers.
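
As a minimal illustration of that ABI point (assumed behaviour written out for this discussion, not taken from the patch):

struct Pair { int *x, *y; };

int use(Pair p) { return *p.x + *p.y; }
// Under AAPCS64 this 16-byte by-value struct is passed in x0/x1 whether the IR
// argument type is [2 x i64] (old coercion) or [2 x ptr] (new coercion), so the
// calling convention is unchanged; only the IR-level type, and with it what
// alias analysis can see, differs.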

davemgreen added 3 commits May 7, 2025 09:10
The aim here is to avoid a ptrtoint->inttoptr round-trip through the function
argument whilst keeping the calling convention the same. Given a struct that is
<= 128 bits in size and contains only pointers (so either 1 or 2 of them), we
convert to a ptr or [2 x ptr] as opposed to the old coercion that uses i64 or
[2 x i64].
Labels
backend:AArch64, clang:codegen (IR generation bugs: mangling, exceptions, etc.), clang (Clang issues not falling into any other category)
Projects
None yet

3 participants