[CUDA][HIP] Fix host/device attribute of builtin #138162
Pull Request Overview
This PR fixes an error in CUDA/HIP compilation by ensuring that overloaded builtin functions are marked as implicit, allowing them to be callable on both host and device sides.
- Added a call to set the implicit attribute on the overload declaration
- Ensures that device functions can properly call the overloaded builtins
Files not reviewed (1)
- clang/test/SemaCUDA/overloaded-builtin.cu: Language not supported
Comments suppressed due to low confidence (1)
clang/lib/Sema/SemaExpr.cpp:6361
- Marking the overloaded builtin function as implicit fixes device function call errors. Please ensure that this change does not interfere with any subsequent attribute processing and document the rationale if necessary.
OverloadDecl->setImplicit(true);
@llvm/pr-subscribers-clang
Author: Yaxun (Sam) Liu (yxsamliu)
Changes
When a builtin function with a generic pointer parameter is passed a pointer with an address space, clang creates an overloaded builtin function but does not make it implicit. This causes an error when the builtin is called by device functions, since CUDA/HIP relies on the implicit attribute to treat a builtin function as callable on both host and device sides. Fixed by making the created overloaded builtin functions implicit.
Full diff: https://github.com/llvm/llvm-project/pull/138162.diff
2 Files Affected:
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 0cd86dc54b0ab..d9eccb31e6d1e 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6358,6 +6358,7 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext &Context,
}
OverloadDecl->setParams(Params);
Sema->mergeDeclAttributes(OverloadDecl, FDecl);
+ OverloadDecl->setImplicit(true);
return OverloadDecl;
}
diff --git a/clang/test/SemaCUDA/overloaded-builtin.cu b/clang/test/SemaCUDA/overloaded-builtin.cu
new file mode 100644
index 0000000000000..719bfea4aef2f
--- /dev/null
+++ b/clang/test/SemaCUDA/overloaded-builtin.cu
@@ -0,0 +1,23 @@
+// expected-no-diagnostics
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -aux-triple amdgcn-amd-amdhsa -fsyntax-only -verify -xhip %s
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fsyntax-only -fcuda-is-device -verify -xhip %s
+
+#include "Inputs/cuda.h"
+
+__global__ void kernel() {
+ __attribute__((address_space(0))) void *mem_ptr;
+ (void)__builtin_amdgcn_is_shared(mem_ptr);
+}
+
+template<typename T>
+__global__ void template_kernel(T *p) {
+ __attribute__((address_space(0))) void *mem_ptr;
+ (void)__builtin_amdgcn_is_shared(mem_ptr);
+}
+
+int main() {
+ int *p;
+ kernel<<<1,1>>>();
+ template_kernel<<<1,1>>>(p);
+}
When a builtin function with a generic pointer parameter is passed a pointer with an address space, clang creates an overloaded builtin function but does not copy the host/device attribute. This causes an error when the builtin is called by device functions, since CUDA/HIP relies on the host/device attribute to treat a builtin function as callable on both host and device sides. Fixed by copying the host/device attribute of the original builtin function to the created overloaded builtin function.
When a builtin function is passed a pointer with a different
address space, clang creates an overloaded
builtin function but does not copy the host/device attribute. This causes
an error when the builtin is called by device functions
since CUDA/HIP relies on the host/device attribute to treat
a builtin function as callable on both host and device
sides.
Fixed by copying the host/device attribute of the original
builtin function to the created overloaded builtin function.
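
To make the failure mode in the description concrete, here is a toy model in plain C++. It is only a sketch: `ToyDecl`, `classify`, `callableFromDevice`, `makeOverload`, and `checkToyModel` are hypothetical names invented for illustration and are not Clang's classes or functions. It assumes, following the description, that the original builtin carries both host and device attributes, and that a declaration with neither attribute is treated as host-only.

```cpp
// Toy model only: every name here is hypothetical, not part of Clang.
#include <cassert>

enum class Target { Host, Device, HostDevice };

struct ToyDecl {
  bool hasHostAttr = false;
  bool hasDeviceAttr = false;
};

// Classify a declaration by its attributes. A declaration with no
// attribute is assumed to be host-only.
inline Target classify(const ToyDecl &d) {
  if (d.hasHostAttr && d.hasDeviceAttr) return Target::HostDevice;
  if (d.hasDeviceAttr) return Target::Device;
  return Target::Host;
}

// In this model a device caller may call anything that is not host-only.
inline bool callableFromDevice(const ToyDecl &d) {
  return classify(d) != Target::Host;
}

// Create the address-space-specific overload of a builtin.
// copyTargetAttrs == false models the bug (attributes dropped);
// copyTargetAttrs == true models the fix (attributes propagated).
inline ToyDecl makeOverload(const ToyDecl &original, bool copyTargetAttrs) {
  ToyDecl overload;
  if (copyTargetAttrs) {
    overload.hasHostAttr = original.hasHostAttr;
    overload.hasDeviceAttr = original.hasDeviceAttr;
  }
  return overload;
}

// Exercise both paths of the model.
inline void checkToyModel() {
  ToyDecl builtin;
  builtin.hasHostAttr = true;
  builtin.hasDeviceAttr = true;

  assert(callableFromDevice(builtin));
  assert(!callableFromDevice(makeOverload(builtin, false))); // bug
  assert(callableFromDevice(makeOverload(builtin, true)));   // fix
}
```

In this model the overload created without copying attributes falls back to host-only, so a kernel calling it would be rejected. Copying the attributes restores the original builtin's host/device status.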