[Clang] Add `noalias` to `this` pointer in C++ constructors #136792
✅ With the latest revision this PR passed the C/C++ code formatter. |
ba3d52c to 9c7600c
This isn't really the right place to add this code; we should be doing it alongside all the other attributes on function definitions... That said:
|
Yeah, I realize it's misplaced; I am not familiar with that part of the project (see the first paragraph in the PR description). I don't really agree with your second point about breaking people's existing assumptions on UB :) I am willing to run correctness suites to further validate this change, if you can recommend any. |
llvm-test-suite has some C++ code... you can also just try bootstrapping LLVM itself, which has a lot of unusual C++ constructs. But how do we detect if code is actually breaking the rules? Most violations wouldn't actually end up miscompiling in a visible way... even if the compiler proves noalias, it might not choose to actually rearrange the code, or it might only do it under rare conditions. We'd need some kind of runtime instrumentation to tell if code is actually following the rule. |
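As a rough illustration of the kind of runtime instrumentation mentioned above, here is a minimal sketch of an overlap check that a constructor could run on its pointer arguments. `ranges_overlap` and `Checked4f` are hypothetical names invented for this example; nothing like this exists in Clang or in the sanitizers as far as this thread shows, and the check only covers pointers passed directly as arguments, not pointers reachable through globals or other objects.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Clang or any sanitizer): returns true when
// the byte ranges [a, a + a_size) and [b, b + b_size) share at least one byte.
inline bool ranges_overlap(const void *a, std::size_t a_size,
                           const void *b, std::size_t b_size) {
  // Compare addresses as integers. The pointer-to-integer mapping is
  // implementation-defined; this assumes a flat address space.
  const auto a_begin = reinterpret_cast<std::uintptr_t>(a);
  const auto b_begin = reinterpret_cast<std::uintptr_t>(b);
  return a_begin < b_begin + b_size && b_begin < a_begin + a_size;
}

// Hypothetical instrumented type: the constructor records whether `src`
// overlapped the object under construction before copying from it.
struct Checked4f {
  float elements[4];
  bool src_overlapped_this;

  explicit Checked4f(const float *src)
      : src_overlapped_this(
            ranges_overlap(this, sizeof(*this), src, 4 * sizeof(float))) {
    for (int i = 0; i < 4; ++i)
      elements[i] = src[i];
  }
};
```

A real tool would presumably report the violation instead of storing a flag; the flag just keeps the sketch easy to inspect.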
The relevant rule is [class.cdtor]/2:
It would be nice if this explicitly said that the behavior is undefined, rather than the value being unspecified, but I think the effect is mostly the same. (I suggested fixing this a few years ago, but WG21 wanted a paper examining the design space that I didn't have the cycles to work on at the time.) There are two cases here:
struct A {
A(int *p) {
// Cannot reorder this store past the store to `n`.
*p = 1;
n = 2;
}
int n;
};
// Must leave a.n == 2.
A a = A(&a);
But... I don't know what exactly |
We don't really specify what, exactly, the consequences are for violating noalias in LangRef. But... we do say elsewhere "If memory accesses alias even though they are noalias according to !tbaa metadata, the behavior is undefined." Probably the same has to apply to noalias, or else we get weird consequences for optimization. I mean, in practice a lot of the optimizations you'd actually want to do would be fine with "load produces an indeterminate value, store stores an indeterminate value". But we don't distinguish those optimizations from the ones that actually do loudly blow up (like rematerializing a load). |
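To make the "rearranging" concern concrete, here is a small sketch in plain C++ (no `noalias` or `restrict` involved, so everything below is well defined). `store_then_load` is the code as written; `load_hoisted` is a hand-written stand-in for what an optimizer might emit if it assumed the two pointers never alias. Both function names are made up for this example.

```cpp
// As written in the source: the load happens after the store.
inline int store_then_load(int *dst, const int *src) {
  *dst = 1;
  return *src;
}

// Hand-written model of a transformation that is only valid when dst and src
// do not alias: the load is hoisted above the store.
inline int load_hoisted(int *dst, const int *src) {
  const int v = *src;
  *dst = 1;
  return v;
}
```

With disjoint pointers the two functions return the same value. When `dst == src`, `store_then_load` returns 1 while `load_hoisted` returns the old value, which is the kind of visible divergence that only shows up if the optimizer actually performs the reordering.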
That said, in terms of whether we actually care about what the committee says... both the C and C++ committees have a terrible track record on the fine details of stuff related to memory. I'd be okay with being slightly more aggressive than the standard technically allows if we can show it's actually helpful in practice. I'd like to see performance numbers before we go too deep into the weeds, though; we should make sure this actually helps in practice before we go too deep into figuring out the exact semantics/acceptance criteria. |
Comparison between latest Clang and GCC's output for a snippet out of a benchmark that could use this optimization: https://godbolt.org/z/35EEvcsPr. I ran llvm-test-suite ten times each for the before and after builds; it executed correctly and, as expected, showed no run-time performance gains:
But we get a slight reduction in code size (keep in mind that most tests are irrelevant because this patch applies to C++):
Most notable in:
Also built a bootstrap build and confirmed that |
9c7600c to d8ad192
@llvm/pr-subscribers-clang @llvm/pr-subscribers-hlsl
Author: Guy David (guy-david)
Changes
Note: the patch is probably amending the wrong piece of code, I've tried to add it to
Clang does not transform the following example into a 128-bit load and store:
class vector4f
{
private:
float _elements[4];
public:
explicit __attribute__((noinline)) vector4f(float const *src)
{
_elements[0] = src[0];
_elements[1] = src[1];
_elements[2] = src[2];
_elements[3] = src[3];
}
};
And instead generates 8 memory operations. That's because `src` might overlap with `_elements`. However, GCC is able to optimize it for constructors only.
According to the standard in 11.10.4.2 under [class.cdtor]:
which sounds like `restrict`.
Relevant GCC chain-mail: https://gcc.gnu.org/pipermail/gcc-patches/2018-May/498812.html.
Patch is 4.14 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/136792.diff
132 Files Affected:
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 82a24f7c295a2..c2f5fc261955d 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -2731,8 +2731,8 @@ void CodeGenModule::ConstructAttributeList(StringRef Name,
llvm::AttributeSet::get(getLLVMContext(), Attrs);
}
- // Apply `nonnull`, `dereferenceable(N)` and `align N` to the `this` argument,
- // unless this is a thunk function.
+ // Apply `nonnull`, `dereferenceable(N)`, `align N` (and `noalias` for
+ // constructors) to the `this` argument, unless this is a thunk function.
// FIXME: fix this properly, https://reviews.llvm.org/D100388
if (FI.isInstanceMethod() && !IRFunctionArgs.hasInallocaArg() &&
!FI.arg_begin()->type->isVoidPointerType() && !IsThunk) {
@@ -2744,6 +2744,11 @@ void CodeGenModule::ConstructAttributeList(StringRef Name,
QualType ThisTy = FI.arg_begin()->type.getTypePtr()->getPointeeType();
+ // According to [class.cdtor]/2, the value of the object is unspecified if
+ // its elements are accessed not through `this`.
+ if (isa_and_nonnull<CXXConstructorDecl>(TargetDecl))
+ Attrs.addAttribute(llvm::Attribute::NoAlias);
+
if (!CodeGenOpts.NullPointerIsValid &&
getTypes().getTargetAddressSpace(FI.arg_begin()->type) == 0) {
Attrs.addAttribute(llvm::Attribute::NonNull);
diff --git a/clang/test/CodeGen/attr-counted-by-pr88931.cpp b/clang/test/CodeGen/attr-counted-by-pr88931.cpp
index 6d0c46bbbe8f9..8297cdf0f120c 100644
--- a/clang/test/CodeGen/attr-counted-by-pr88931.cpp
+++ b/clang/test/CodeGen/attr-counted-by-pr88931.cpp
@@ -11,7 +11,7 @@ struct foo {
void init(void * __attribute__((pass_dynamic_object_size(0))));
// CHECK-LABEL: define dso_local void @_ZN3foo3barC1Ev(
-// CHECK-SAME: ptr noundef nonnull align 4 dereferenceable(1) [[THIS:%.*]]) unnamed_addr #[[ATTR0:[0-9]+]] align 2 {
+// CHECK-SAME: ptr noalias noundef nonnull align 4 dereferenceable(1) [[THIS:%.*]]) unnamed_addr #[[ATTR0:[0-9]+]] align 2 {
// CHECK-NEXT: entry:
// CHECK-NEXT: tail call void @_Z4initPvU25pass_dynamic_object_size0(ptr noundef nonnull align 4 dereferenceable(1) [[THIS]], i64 noundef -1) #[[ATTR2:[0-9]+]]
// CHECK-NEXT: ret void
diff --git a/clang/test/CodeGen/attr-noundef.cpp b/clang/test/CodeGen/attr-noundef.cpp
index abdf9496bd396..30c4282759144 100644
--- a/clang/test/CodeGen/attr-noundef.cpp
+++ b/clang/test/CodeGen/attr-noundef.cpp
@@ -10,157 +10,158 @@
// TODO: No structs may currently be marked noundef
namespace check_structs {
-struct Trivial {
- int a;
-};
-Trivial ret_trivial() { return {}; }
-void pass_trivial(Trivial e) {}
-// CHECK-INTEL: [[DEF:define( dso_local)?]] i32 @{{.*}}ret_trivial
-// CHECK-AARCH: [[DEF:define( dso_local)?]] i32 @{{.*}}ret_trivial
-// CHECK-INTEL: [[DEF]] void @{{.*}}pass_trivial{{.*}}(i32 %
-// CHECK-AARCH: [[DEF]] void @{{.*}}pass_trivial{{.*}}(i64 %
-
-struct NoCopy {
- int a;
- NoCopy(NoCopy &) = delete;
-};
-NoCopy ret_nocopy() { return {}; }
-void pass_nocopy(NoCopy e) {}
-// CHECK: [[DEF]] void @{{.*}}ret_nocopy{{.*}}(ptr dead_on_unwind noalias writable sret({{[^)]+}}) align 4 %
-// CHECK: [[DEF]] void @{{.*}}pass_nocopy{{.*}}(ptr noundef %
-
-struct Huge {
- int a[1024];
-};
-Huge ret_huge() { return {}; }
-void pass_huge(Huge h) {}
-// CHECK: [[DEF]] void @{{.*}}ret_huge{{.*}}(ptr dead_on_unwind noalias writable sret({{[^)]+}}) align 4 %
-// CHECK: [[DEF]] void @{{.*}}pass_huge{{.*}}(ptr noundef
-} // namespace check_structs
-
-//************ Passing unions by value
-// No unions may be marked noundef
-
-namespace check_unions {
-union Trivial {
- int a;
-};
-Trivial ret_trivial() { return {}; }
-void pass_trivial(Trivial e) {}
-// CHECK-INTEL: [[DEF]] i32 @{{.*}}ret_trivial
-// CHECK-AARCH: [[DEF]] i32 @{{.*}}ret_trivial
-// CHECK-INTEL: [[DEF]] void @{{.*}}pass_trivial{{.*}}(i32 %
-// CHECK-AARCH: [[DEF]] void @{{.*}}pass_trivial{{.*}}(i64 %
-
-union NoCopy {
- int a;
- NoCopy(NoCopy &) = delete;
-};
-NoCopy ret_nocopy() { return {}; }
-void pass_nocopy(NoCopy e) {}
-// CHECK: [[DEF]] void @{{.*}}ret_nocopy{{.*}}(ptr dead_on_unwind noalias writable sret({{[^)]+}}) align 4 %
-// CHECK: [[DEF]] void @{{.*}}pass_nocopy{{.*}}(ptr noundef %
-} // namespace check_unions
-
-//************ Passing `this` pointers
-// `this` pointer must always be defined
-
-namespace check_this {
-struct Object {
- int data[];
-
- Object() {
- this->data[0] = 0;
+ struct Trivial {
+ int a;
+ };
+ Trivial ret_trivial() { return {}; }
+ void pass_trivial(Trivial e) {}
+ // CHECK-INTEL: [[DEF:define( dso_local)?]] i32 @{{.*}}ret_trivial
+ // CHECK-AARCH: [[DEF:define( dso_local)?]] i32 @{{.*}}ret_trivial
+ // CHECK-INTEL: [[DEF]] void @{{.*}}pass_trivial{{.*}}(i32 %
+ // CHECK-AARCH: [[DEF]] void @{{.*}}pass_trivial{{.*}}(i64 %
+
+ struct NoCopy {
+ int a;
+ NoCopy(NoCopy &) = delete;
+ };
+ NoCopy ret_nocopy() { return {}; }
+ void pass_nocopy(NoCopy e) {}
+ // CHECK: [[DEF]] void @{{.*}}ret_nocopy{{.*}}(ptr dead_on_unwind noalias writable sret({{[^)]+}}) align 4 %
+ // CHECK: [[DEF]] void @{{.*}}pass_nocopy{{.*}}(ptr noundef %
+
+ struct Huge {
+ int a[1024];
+ };
+ Huge ret_huge() { return {}; }
+ void pass_huge(Huge h) {}
+ // CHECK: [[DEF]] void @{{.*}}ret_huge{{.*}}(ptr dead_on_unwind noalias writable sret({{[^)]+}}) align 4 %
+ // CHECK: [[DEF]] void @{{.*}}pass_huge{{.*}}(ptr noundef
+ } // namespace check_structs
+
+ //************ Passing unions by value
+ // No unions may be marked noundef
+
+ namespace check_unions {
+ union Trivial {
+ int a;
+ };
+ Trivial ret_trivial() { return {}; }
+ void pass_trivial(Trivial e) {}
+ // CHECK-INTEL: [[DEF]] i32 @{{.*}}ret_trivial
+ // CHECK-AARCH: [[DEF]] i32 @{{.*}}ret_trivial
+ // CHECK-INTEL: [[DEF]] void @{{.*}}pass_trivial{{.*}}(i32 %
+ // CHECK-AARCH: [[DEF]] void @{{.*}}pass_trivial{{.*}}(i64 %
+
+ union NoCopy {
+ int a;
+ NoCopy(NoCopy &) = delete;
+ };
+ NoCopy ret_nocopy() { return {}; }
+ void pass_nocopy(NoCopy e) {}
+ // CHECK: [[DEF]] void @{{.*}}ret_nocopy{{.*}}(ptr dead_on_unwind noalias writable sret({{[^)]+}}) align 4 %
+ // CHECK: [[DEF]] void @{{.*}}pass_nocopy{{.*}}(ptr noundef %
+ } // namespace check_unions
+
+ //************ Passing `this` pointers
+ // `this` pointer must always be defined
+
+ namespace check_this {
+ struct Object {
+ int data[];
+
+ Object() {
+ this->data[0] = 0;
+ }
+ int getData() {
+ return this->data[0];
+ }
+ Object *getThis() {
+ return this;
+ }
+ };
+
+ void use_object() {
+ Object obj;
+ obj.getData();
+ obj.getThis();
}
- int getData() {
- return this->data[0];
+ // CHECK: define linkonce_odr void @{{.*}}Object{{.*}}(ptr noalias noundef nonnull align 4 dereferenceable(1) %
+ // CHECK: define linkonce_odr noundef i32 @{{.*}}Object{{.*}}getData{{.*}}(ptr noundef nonnull align 4 dereferenceable(1) %
+ // CHECK: define linkonce_odr noundef ptr @{{.*}}Object{{.*}}getThis{{.*}}(ptr noundef nonnull align 4 dereferenceable(1) %
+ } // namespace check_this
+
+ //************ Passing vector types
+
+ namespace check_vecs {
+ typedef int __attribute__((vector_size(12))) i32x3;
+ i32x3 ret_vec() {
+ return {};
}
- Object *getThis() {
- return this;
+ void pass_vec(i32x3 v) {
}
-};
-
-void use_object() {
- Object obj;
- obj.getData();
- obj.getThis();
-}
-// CHECK: define linkonce_odr void @{{.*}}Object{{.*}}(ptr noundef nonnull align 4 dereferenceable(1) %
-// CHECK: define linkonce_odr noundef i32 @{{.*}}Object{{.*}}getData{{.*}}(ptr noundef nonnull align 4 dereferenceable(1) %
-// CHECK: define linkonce_odr noundef ptr @{{.*}}Object{{.*}}getThis{{.*}}(ptr noundef nonnull align 4 dereferenceable(1) %
-} // namespace check_this
-
-//************ Passing vector types
-
-namespace check_vecs {
-typedef int __attribute__((vector_size(12))) i32x3;
-i32x3 ret_vec() {
- return {};
-}
-void pass_vec(i32x3 v) {
-}
-
-// CHECK: [[DEF]] noundef <3 x i32> @{{.*}}ret_vec{{.*}}()
-// CHECK-INTEL: [[DEF]] void @{{.*}}pass_vec{{.*}}(<3 x i32> noundef %
-// CHECK-AARCH: [[DEF]] void @{{.*}}pass_vec{{.*}}(<4 x i32> %
-} // namespace check_vecs
-
-//************ Passing exotic types
-// Function/Array pointers, Function member / Data member pointers, nullptr_t, ExtInt types
-
-namespace check_exotic {
-struct Object {
- int mfunc();
- int mdata;
-};
-typedef int Object::*mdptr;
-typedef int (Object::*mfptr)();
-typedef decltype(nullptr) nullptr_t;
-typedef int (*arrptr)[32];
-typedef int (*fnptr)(int);
-
-arrptr ret_arrptr() {
- return nullptr;
-}
-fnptr ret_fnptr() {
- return nullptr;
-}
-mdptr ret_mdptr() {
- return nullptr;
-}
-mfptr ret_mfptr() {
- return nullptr;
-}
-nullptr_t ret_npt() {
- return nullptr;
-}
-void pass_npt(nullptr_t t) {
-}
-_BitInt(3) ret_BitInt() {
- return 0;
-}
-void pass_BitInt(_BitInt(3) e) {
-}
-void pass_large_BitInt(_BitInt(127) e) {
-}
-
-// Pointers to arrays/functions are always noundef
-// CHECK: [[DEF]] noundef ptr @{{.*}}ret_arrptr{{.*}}()
-// CHECK: [[DEF]] noundef ptr @{{.*}}ret_fnptr{{.*}}()
-
-// Pointers to members are never noundef
-// CHECK: [[DEF]] i64 @{{.*}}ret_mdptr{{.*}}()
-// CHECK-INTEL: [[DEF]] { i64, i64 } @{{.*}}ret_mfptr{{.*}}()
-// CHECK-AARCH: [[DEF]] [2 x i64] @{{.*}}ret_mfptr{{.*}}()
-
-// nullptr_t is never noundef
-// CHECK: [[DEF]] ptr @{{.*}}ret_npt{{.*}}()
-// CHECK: [[DEF]] void @{{.*}}pass_npt{{.*}}(ptr %
-
-// CHECK-INTEL: [[DEF]] noundef signext i3 @{{.*}}ret_BitInt{{.*}}()
-// CHECK-AARCH: [[DEF]] noundef i3 @{{.*}}ret_BitInt{{.*}}()
-// CHECK-INTEL: [[DEF]] void @{{.*}}pass_BitInt{{.*}}(i3 noundef signext %
-// CHECK-AARCH: [[DEF]] void @{{.*}}pass_BitInt{{.*}}(i3 noundef %
-// CHECK-INTEL: [[DEF]] void @{{.*}}pass_large_BitInt{{.*}}(i64 noundef %{{.*}}, i64 noundef %
-// CHECK-AARCH: [[DEF]] void @{{.*}}pass_large_BitInt{{.*}}(i127 noundef %
-} // namespace check_exotic
+
+ // CHECK: [[DEF]] noundef <3 x i32> @{{.*}}ret_vec{{.*}}()
+ // CHECK-INTEL: [[DEF]] void @{{.*}}pass_vec{{.*}}(<3 x i32> noundef %
+ // CHECK-AARCH: [[DEF]] void @{{.*}}pass_vec{{.*}}(<4 x i32> %
+ } // namespace check_vecs
+
+ //************ Passing exotic types
+ // Function/Array pointers, Function member / Data member pointers, nullptr_t, ExtInt types
+
+ namespace check_exotic {
+ struct Object {
+ int mfunc();
+ int mdata;
+ };
+ typedef int Object::*mdptr;
+ typedef int (Object::*mfptr)();
+ typedef decltype(nullptr) nullptr_t;
+ typedef int (*arrptr)[32];
+ typedef int (*fnptr)(int);
+
+ arrptr ret_arrptr() {
+ return nullptr;
+ }
+ fnptr ret_fnptr() {
+ return nullptr;
+ }
+ mdptr ret_mdptr() {
+ return nullptr;
+ }
+ mfptr ret_mfptr() {
+ return nullptr;
+ }
+ nullptr_t ret_npt() {
+ return nullptr;
+ }
+ void pass_npt(nullptr_t t) {
+ }
+ _BitInt(3) ret_BitInt() {
+ return 0;
+ }
+ void pass_BitInt(_BitInt(3) e) {
+ }
+ void pass_large_BitInt(_BitInt(127) e) {
+ }
+
+ // Pointers to arrays/functions are always noundef
+ // CHECK: [[DEF]] noundef ptr @{{.*}}ret_arrptr{{.*}}()
+ // CHECK: [[DEF]] noundef ptr @{{.*}}ret_fnptr{{.*}}()
+
+ // Pointers to members are never noundef
+ // CHECK: [[DEF]] i64 @{{.*}}ret_mdptr{{.*}}()
+ // CHECK-INTEL: [[DEF]] { i64, i64 } @{{.*}}ret_mfptr{{.*}}()
+ // CHECK-AARCH: [[DEF]] [2 x i64] @{{.*}}ret_mfptr{{.*}}()
+
+ // nullptr_t is never noundef
+ // CHECK: [[DEF]] ptr @{{.*}}ret_npt{{.*}}()
+ // CHECK: [[DEF]] void @{{.*}}pass_npt{{.*}}(ptr %
+
+ // CHECK-INTEL: [[DEF]] noundef signext i3 @{{.*}}ret_BitInt{{.*}}()
+ // CHECK-AARCH: [[DEF]] noundef i3 @{{.*}}ret_BitInt{{.*}}()
+ // CHECK-INTEL: [[DEF]] void @{{.*}}pass_BitInt{{.*}}(i3 noundef signext %
+ // CHECK-AARCH: [[DEF]] void @{{.*}}pass_BitInt{{.*}}(i3 noundef %
+ // CHECK-INTEL: [[DEF]] void @{{.*}}pass_large_BitInt{{.*}}(i64 noundef %{{.*}}, i64 noundef %
+ // CHECK-AARCH: [[DEF]] void @{{.*}}pass_large_BitInt{{.*}}(i127 noundef %
+ } // namespace check_exotic
+
\ No newline at end of file
diff --git a/clang/test/CodeGen/paren-list-agg-init.cpp b/clang/test/CodeGen/paren-list-agg-init.cpp
index 235352382332a..e674a3492612e 100644
--- a/clang/test/CodeGen/paren-list-agg-init.cpp
+++ b/clang/test/CodeGen/paren-list-agg-init.cpp
@@ -390,9 +390,9 @@ namespace gh61145 {
// CHECK-NEXT: [[V:%.*v.*]] = alloca [[STRUCT_VEC]], align 1
// CHECK-NEXT: [[AGG_TMP_ENSURED:%.*agg.tmp.ensured.*]] = alloca [[STRUCT_S1]], align 1
// a.k.a. Vec::Vec()
- // CHECK-NEXT: call void @_ZN7gh611453VecC1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[V]])
+ // CHECK-NEXT: call void @_ZN7gh611453VecC1Ev(ptr noalias noundef nonnull align 1 dereferenceable(1) [[V]])
// a.k.a. Vec::Vec(Vec&&)
- // CHECK-NEXT: call void @_ZN7gh611453VecC1EOS0_(ptr noundef nonnull align 1 dereferenceable(1) [[AGG_TMP_ENSURED]], ptr noundef nonnull align 1 dereferenceable(1) [[V]])
+ // CHECK-NEXT: call void @_ZN7gh611453VecC1EOS0_(ptr noalias noundef nonnull align 1 dereferenceable(1) [[AGG_TMP_ENSURED]], ptr noundef nonnull align 1 dereferenceable(1) [[V]])
// a.k.a. S1::~S1()
// CHECK-NEXT: call void @_ZN7gh611452S1D1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[AGG_TMP_ENSURED]])
// a.k.a.Vec::~Vec()
@@ -410,9 +410,9 @@ namespace gh61145 {
// CHECK-NEXT: [[V:%.*v.*]] = alloca [[STRUCT_VEC]], align 1
// CHECK-NEXT: [[AGG_TMP_ENSURED:%.*agg.tmp.ensured.*]] = alloca [[STRUCT_S2]], align 1
// a.k.a. Vec::Vec()
- // CHECK-NEXT: call void @_ZN7gh611453VecC1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[V]])
+ // CHECK-NEXT: call void @_ZN7gh611453VecC1Ev(ptr noalias noundef nonnull align 1 dereferenceable(1) [[V]])
// a.k.a. Vec::Vec(Vec&&)
- // CHECK-NEXT: call void @_ZN7gh611453VecC1EOS0_(ptr noundef nonnull align 1 dereferenceable(1) [[AGG_TMP_ENSURED]], ptr noundef nonnull align 1 dereferenceable(1) [[V]])
+ // CHECK-NEXT: call void @_ZN7gh611453VecC1EOS0_(ptr noalias noundef nonnull align 1 dereferenceable(1) [[AGG_TMP_ENSURED]], ptr noundef nonnull align 1 dereferenceable(1) [[V]])
// CHECK-NEXT: [[C:%.*c.*]] = getelementptr inbounds nuw [[STRUCT_S2]], ptr [[AGG_TMP_ENSURED]], i32 0, i32
// CHECK-NEXT: store i8 0, ptr [[C]], align 1
// a.k.a. S2::~S2()
diff --git a/clang/test/CodeGen/temporary-lifetime.cpp b/clang/test/CodeGen/temporary-lifetime.cpp
index 9f085d41d1464..3c2715c5a3dfe 100644
--- a/clang/test/CodeGen/temporary-lifetime.cpp
+++ b/clang/test/CodeGen/temporary-lifetime.cpp
@@ -22,12 +22,12 @@ T Baz();
void Test1() {
// CHECK-DTOR-LABEL: Test1
// CHECK-DTOR: call void @llvm.lifetime.start.p0(i64 1024, ptr nonnull %[[ADDR:.+]])
- // CHECK-DTOR: call void @_ZN1AC1Ev(ptr nonnull {{[^,]*}} %[[VAR:[^ ]+]])
+ // CHECK-DTOR: call void @_ZN1AC1Ev(ptr noalias nonnull {{[^,]*}} %[[VAR:[^ ]+]])
// CHECK-DTOR: call void @_Z3FooIRK1AEvOT_
// CHECK-DTOR: call void @_ZN1AD1Ev(ptr nonnull {{[^,]*}} %[[VAR]])
// CHECK-DTOR: call void @llvm.lifetime.end.p0(i64 1024, ptr nonnull %[[ADDR]])
// CHECK-DTOR: call void @llvm.lifetime.start.p0(i64 1024, ptr nonnull %[[ADDR:.+]])
- // CHECK-DTOR: call void @_ZN1AC1Ev(ptr nonnull {{[^,]*}} %[[VAR:[^ ]+]])
+ // CHECK-DTOR: call void @_ZN1AC1Ev(ptr noalias nonnull {{[^,]*}} %[[VAR:[^ ]+]])
// CHECK-DTOR: call void @_Z3FooIRK1AEvOT_
// CHECK-DTOR: call void @_ZN1AD1Ev(ptr nonnull {{[^,]*}} %[[VAR]])
// CHECK-DTOR: call void @llvm.lifetime.end.p0(i64 1024, ptr nonnull %[[ADDR]])
@@ -35,11 +35,11 @@ void Test1() {
// CHECK-NO-DTOR-LABEL: Test1
// CHECK-NO-DTOR: call void @llvm.lifetime.start.p0(i64 1024, ptr nonnull %[[ADDR:.+]])
- // CHECK-NO-DTOR: call void @_ZN1AC1Ev(ptr nonnull {{[^,]*}} %[[VAR:[^ ]+]])
+ // CHECK-NO-DTOR: call void @_ZN1AC1Ev(ptr noalias nonnull {{[^,]*}} %[[VAR:[^ ]+]])
// CHECK-NO-DTOR: call void @_Z3FooIRK1AEvOT_
// CHECK-NO-DTOR: call void @llvm.lifetime.end.p0(i64 1024, ptr nonnull %[[ADDR]])
// CHECK-NO-DTOR: call void @llvm.lifetime.start.p0(i64 1024, ptr nonnull %[[ADDR:.+]])
- // CHECK-NO-DTOR: call void @_ZN1AC1Ev(ptr nonnull {{[^,]*}} %[[VAR:[^ ]+]])
+ // CHECK-NO-DTOR: call void @_ZN1AC1Ev(ptr noalias nonnull {{[^,]*}} %[[VAR:[^ ]+]])
// CHECK-NO-DTOR: call void @_Z3FooIRK1AEvOT_
// CHECK-NO-DTOR: call void @llvm.lifetime.end.p0(i64 1024, ptr nonnull %[[ADDR]])
// CHECK-NO-DTOR: }
@@ -56,10 +56,10 @@ void Test1() {
void Test2() {
// CHECK-DTOR-LABEL: Test2
// CHECK-DTOR: call void @llvm.lifetime.start.p0(i64 1024, ptr nonnull %[[ADDR1:.+]])
- // CHECK-DTOR: call void @_ZN1AC1Ev(ptr nonnull {{[^,]*}} %[[VAR1:[^ ]+]])
+ // CHECK-DTOR: call void @_ZN1AC1Ev(ptr noalias nonnull {{[^,]*}} %[[VAR1:[^ ]+]])
// CHECK-DTOR: call void @_Z3FooIRK1AEvOT_
// CHECK-DTOR: call void @llvm.lifetime.start.p0(i64 1024, ptr nonnull %[[ADDR2:.+]])
- // CHECK-DTOR: call void @_ZN1AC1Ev(ptr nonnull {{[^,]*}} %[[VAR2:[^ ]+]])
+ // CHECK-DTOR: call void @_ZN1AC1Ev(ptr noalias nonnull {{[^,]*}} %[[VAR2:[^ ]+]])
// CHECK-DTOR: call void @_Z3FooIRK1AEvOT_
// CHECK-DTOR: call void @_ZN1AD1Ev(ptr nonnull {{[^,]*}} %[[VAR2]])
// CHECK-DTOR: call void @llvm.lifetime.end.p0(i64 1024, ptr nonnull %[[ADDR2]])
@@ -69,10 +69,10 @@ void Test2() {
// CHECK-NO-DTOR-LABEL: Test2
// CHECK-NO-DTOR: call void @llvm.lifetime.start.p0(i64 1024, ptr nonnull %[[ADDR1:.+]])
- // CHECK-NO-DTOR: call void @_ZN1AC1Ev(ptr nonnull {{[^,]*}} %[[VAR1:[^ ]+]])
+ // CHECK-NO-DTOR: call void @_ZN1AC1Ev(ptr noalias nonnull {{[^,]*}} %[[VAR1:[^ ]+]])
// CHECK-NO-DTOR: call void @_Z3FooIRK1AEvOT_
// CHECK-NO-DTOR: call void @llvm.lifetime.start.p0(i64 1024, ptr nonnull %[[ADDR2:.+]])
- // CHECK-NO-DTOR: call void @_ZN1AC1Ev(ptr nonnull {{[^,]*}} %[[VAR2:[^ ]+]])
+ // CHECK-NO-DTOR: call void @_ZN1AC1Ev(ptr noalias nonnull {{[^,]*}} %[[VAR2:[^ ]+]])
// CHECK-NO-DTOR: call void @_Z3FooIRK1AEvOT_
// CHECK-NO-DTOR: call void @llvm.lifetime.end.p0(i64 1024, ptr nonnull %[[ADDR2]])
// CHECK-NO-DTOR: call void @llvm.lifetime.end.p0(i64 1024, ptr nonnull %[[ADDR1]])
diff --git a/clang/test/CodeGenCUDA/offload_via_llvm.cu b/clang/test/CodeGenCUDA/offload_via_llvm.cu
index 62942d8dc0755..860b036ec1b9c 100644
--- a/clang/test/CodeGenCUDA/offload_via_llvm.cu
+++ b/clang/test/CodeGenCUDA/offload_via_llvm.cu
@@ -45,7 +45,7 @@
// HST-NEXT: [[TMP15:%.*]] = call i32 @__llvmPopCallConfiguration(ptr [[GRID_DIM]], ptr [[BLOCK_DIM]], ptr [[SHMEM_SIZE]], ptr [[STREAM]])
// HST-NEXT: [[TMP16:%.*]] = load i32, ptr [[SHMEM_SIZE]], align 4
// HST-NEXT: [[TMP17:%.*]] = load ptr, ptr [[STREAM]], align 4
-// HST-NEXT: [[CALL:%.*]] = call noundef i32 @llvmLaunchKernel(ptr noundef @_Z18__device_stub__fooisPvS_, ptr noundef byval([[STRUCT_DIM3]]) align 4 [[GRID_DIM]], ptr noundef byval([[STRUCT_DIM3]]) align 4 [[BLOCK_DIM]], ptr noundef [[KERNEL_LAUNCH_PARAMS]], i32 noundef [[TMP16]], ptr noundef [[TMP17]])
+// HST-NEXT: [[CALL:%.*]] = call noundef i32 @llvmLaunchKernel(ptr noundef @_Z18__device_stub__fooisPvS_, ptr noundef byval([[STRUCT_DIM3]]) align 4 [[GRID_DIM]], ptr noundef byval([[STRUCT_DIM3]]) align 4 [[BLOCK_DIM]], ptr noundef [[KERNEL_LAUNCH_PARAMS]], i32 noundef [[TMP16]], ptr noundef [[TMP17]]) #[[ATTR3:[0-9]+]]
// HST-NEXT: br label %[[SETUP_END:.*]]
// HST: [[SETUP_END]]:
// HST-NEXT: ret void
@@ -72,15 +72,15 @@ __global__ void foo(int, short, void *, void *) {}
// HST-NEXT: [[AGG_TMP:%.*]] = alloca [[STRUCT_DIM3:%.*]], align 4
// HST-NEXT: [[AGG_TMP1:%.*]] = alloca [[STRUCT_DIM3]], align 4
// HST-NEXT: store ptr [[PTR]], ptr [[PTR_ADDR]], align 4
-// HST-NEXT: call void @_ZN4dim3C1Ejjj(ptr noundef nonnull align 4 dereferenceable(12) [[AGG_TMP]], i32 noundef 3, i32 noundef 1, i32 noundef 1)
-// HST-NEXT: call void @_ZN4dim3C1Ejjj(ptr noundef nonnull align 4 dereferenceable(12) [[AGG_TMP1]], i32 noundef 7, i32 noundef 1, i32 noundef 1)
-// HST-NEXT: [[CALL:%.*]] = call i32 @__llvmPushCallConfiguration(ptr noundef byval([[STRUCT_DIM3]]) align 4 [[AGG_TMP]], ptr noundef byval([[STRUCT_DIM3]]) align 4 [[AGG_TMP1]], i32 noundef 0, ptr noundef null)
+// HST-NEXT: call void @_ZN4dim3C1Ejjj(ptr noalias noundef nonnull align 4 dereferenceable(12) [[AGG_TMP]], i32 noundef 3, i32 noundef 1, i32 noundef 1) #[[ATTR3]]
+// HST-NEXT: call void @_ZN4dim3C1Ejjj(ptr noalias noundef nonnull align 4 dereferenceable(12) [[AGG_TMP...
[truncated]
|
Clang does not transform the following example into a 128-bit load and store:
And instead generates 8 memory operations. That's because `src` might overlap with `_elements`. However, GCC is able to optimize it for constructors only.
According to the standard in 11.10.4.2 under [class.cdtor]:
which sounds like `restrict`.
Relevant GCC chain-mail: https://gcc.gnu.org/pipermail/gcc-patches/2018-May/498812.html.
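For comparison, the same four-element copy can be written as a free function, where the no-overlap promise is spelled out by hand. This is only a sketch: `copy4` and `copy4_restrict` are names invented here, and `__restrict` is a compiler extension (accepted by GCC, Clang and MSVC), not standard C++. The patch aims to give the constructor's `this` pointer a similar guarantee without any source annotation.

```cpp
// Same shape as the constructor body. Without extra information the compiler
// must assume dst and src may overlap, so each load has to stay after the
// preceding store.
inline void copy4(float *dst, const float *src) {
  dst[0] = src[0];
  dst[1] = src[1];
  dst[2] = src[2];
  dst[3] = src[3];
}

// With `__restrict` (a compiler extension), the caller promises that the two
// ranges do not overlap, so the compiler is free to merge the accesses into
// wider loads and stores.
inline void copy4_restrict(float *__restrict dst,
                           const float *__restrict src) {
  dst[0] = src[0];
  dst[1] = src[1];
  dst[2] = src[2];
  dst[3] = src[3];
}
```

Both functions produce the same result for disjoint buffers; whether a given compiler version actually vectorizes the second one is best checked on the generated assembly.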