[CUDA][HIP] Add a __device__ version of std::__glibcxx_assert_fail() #136133

Open: 1 commit to be merged into base branch main

jmmartinez
Contributor

@jmmartinez jmmartinez commented Apr 17, 2025

libstdc++ 15 uses the non-constexpr function std::__glibcxx_assert_fail() to trigger compilation errors when an assertion made with the __glibcxx_assert(cond) macro fails in a constant-evaluated context.

Compilation fails when libstdc++ code (such as std::array) is used in device code, since these assertions invoke a non-constexpr host function from device code.

This patch proposes a CUDA wrapper header "bits/c++config.h" which adds a __device__ version of std::__glibcxx_assert_fail().

Solves SWDEV-518041

@jmmartinez requested review from Artem-B and yxsamliu, April 17, 2025 12:13
@jmmartinez self-assigned this, Apr 17, 2025
@llvmbot added the clang, backend:X86 and clang:headers labels, Apr 17, 2025
@llvmbot
Member

llvmbot commented Apr 17, 2025

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-clang

Author: Juan Manuel Martinez Caamaño (jmmartinez)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/136133.diff

2 Files Affected:

  • (modified) clang/lib/Headers/CMakeLists.txt (+1)
  • (added) clang/lib/Headers/cuda_wrappers/bits/c++config.h (+39)
diff --git a/clang/lib/Headers/CMakeLists.txt b/clang/lib/Headers/CMakeLists.txt
index acf49e40c447e..54395e053dbc4 100644
--- a/clang/lib/Headers/CMakeLists.txt
+++ b/clang/lib/Headers/CMakeLists.txt
@@ -333,6 +333,7 @@ set(cuda_wrapper_files
 )
 
 set(cuda_wrapper_bits_files
+  cuda_wrappers/bits/c++config.h
   cuda_wrappers/bits/shared_ptr_base.h
   cuda_wrappers/bits/basic_string.h
   cuda_wrappers/bits/basic_string.tcc
diff --git a/clang/lib/Headers/cuda_wrappers/bits/c++config.h b/clang/lib/Headers/cuda_wrappers/bits/c++config.h
new file mode 100644
index 0000000000000..583e595f7f529
--- /dev/null
+++ b/clang/lib/Headers/cuda_wrappers/bits/c++config.h
@@ -0,0 +1,39 @@
+// libstdc++ uses the non-constexpr function std::__glibcxx_assert_fail()
+// to trigger compilation errors when the __glibcxx_assert(cond) macro
+// is used in a constexpr context.
+// Compilation fails when using code from the libstdc++ (such as std::array) on
+// device code, since these assertions invoke a non-constexpr host function from
+// device code.
+//
+// To work around this issue, we declare our own device version of the function
+
+#ifndef __CLANG_CUDA_WRAPPERS_BITS_CPP_CONFIG
+#define __CLANG_CUDA_WRAPPERS_BITS_CPP_CONFIG
+
+#include_next <bits/c++config.h>
+
+#if _GLIBCXX_HAVE_IS_CONSTANT_EVALUATED
+
+#ifdef _LIBCPP_BEGIN_NAMESPACE_STD
+_LIBCPP_BEGIN_NAMESPACE_STD
+#else
+namespace std {
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+#endif
+#endif
+__device__
+    __attribute__((__always_inline__, __visibility__("default"))) inline void
+    __glibcxx_assert_fail() {}
+#ifdef _LIBCPP_END_NAMESPACE_STD
+_LIBCPP_END_NAMESPACE_STD
+#else
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_END_NAMESPACE_VERSION
+#endif
+} // namespace std
+#endif
+
+#endif
+
+#endif

@jmmartinez
Contributor Author

This patch is cruelly missing some tests. Is there a place for tests of this kind? I haven't found an obvious one for other headers.

@yxsamliu
Collaborator

> This patch is cruelly missing some tests. Is there a place for tests of this kind? I haven't found an obvious one for other headers.

You may consider adding a test here https://github.com/llvm/llvm-test-suite/tree/main/External/HIP.

If possible, I would be happy to see std::array etc be used in device code.

@yxsamliu
Collaborator

LGTM. The addition of the device version of std::__glibcxx_assert_fail() seems reasonable and straightforward.

WDYT @Artem-B

#endif
__device__
__attribute__((__always_inline__, __visibility__("default"))) inline void
__glibcxx_assert_fail() {}
Member

@Artem-B Artem-B Apr 17, 2025

Presumably we do want to fail here. We should at least call __builtin_abort() or something.

Also, __glibcxx_assert_fail() appears to have variants with multiple parameters, too:

https://github.com/gcc-mirror/gcc/blob/4cff0434e8bf6683988482a7e47f8459c06f2c05/libstdc%2B%2B-v3/src/c%2B%2B11/assert_fail.cc#L33

It looks like it's version-dependent, and we may need to provide both variants, otherwise the issue will still be there for other libstdc++ versions.

Contributor Author

> Presumably we do want to fail here. We should at least call __builtin_abort() or something.

This does fail, in fact. It's a "trick" to trigger a compilation failure in a constant-evaluated context: when the assertion fails, the code ends up calling a non-constexpr function during constant evaluation.

What's new in this libstdc++ version is that, even when assertions are disabled, the assert checks are still performed in a constexpr context.

> Also, __glibcxx_assert_fail() appears to have variants with multiple parameters, too:

> https://github.com/gcc-mirror/gcc/blob/4cff0434e8bf6683988482a7e47f8459c06f2c05/libstdc%2B%2B-v3/src/c%2B%2B11/assert_fail.cc#L33

> It looks like it's version-dependent, and we may need to provide both variants, otherwise the issue will still be there for other libstdc++ versions.

Thanks for pointing this out. I'll add the non-constexpr version too.

Member

> This does fail, in fact. It's a "trick" to trigger a compilation failure in a constant-evaluated context: when the assertion fails, the code ends up calling a non-constexpr function during constant evaluation.

> What's new in this libstdc++ version is that, even when assertions are disabled, the assert checks are still performed in a constexpr context.

Would it be possible to create a simple reproducer on godbolt.org? I think I'm missing some bits of this puzzle.

Contributor Author

Maybe this helps to explain the intent of the empty __glibcxx_assert_fail function: https://godbolt.org/z/YfY6nec1K

This snippet tries to reflect what happens when runtime assertions are disabled but constexpr assertions are enabled.

  • In a non-constexpr context, the whole assertion gets optimized out, since std::is_constant_evaluated is always false there.
  • In a constexpr context (uncomment the constexpr on the declaration of Elt), the code ends up calling the "non_constexpr" function during constant evaluation, which raises an error.
  • If we change the index to be in bounds, the code compiles.
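
The three cases above can be illustrated with a small host-only sketch. It is not the Godbolt snippet linked above: non_constexpr_fail, SKETCH_ASSERT and Array are hypothetical stand-ins for std::__glibcxx_assert_fail(), __glibcxx_assert(cond) and std::array, and the GCC/Clang builtin __builtin_is_constant_evaluated() is assumed in place of std::is_constant_evaluated so that the code also builds before C++20.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for std::__glibcxx_assert_fail():
// deliberately NOT constexpr, and empty.
inline void non_constexpr_fail() {}

// Hypothetical stand-in for __glibcxx_assert(cond) with runtime assertions
// disabled: the check is only live during constant evaluation.
#define SKETCH_ASSERT(cond)                                   \
  do {                                                        \
    if (__builtin_is_constant_evaluated() && !bool(cond))     \
      non_constexpr_fail();                                   \
  } while (false)

// Minimal array type whose operator[] carries the assertion.
template <typename T, std::size_t N> struct Array {
  T data[N];
  constexpr const T &operator[](std::size_t i) const {
    SKETCH_ASSERT(i < N);
    return data[i];
  }
};

constexpr Array<int, 4> A{{1, 2, 3, 4}};

// In bounds: constant evaluation never reaches non_constexpr_fail().
constexpr int Elt = A[2];
static_assert(Elt == 3, "in-bounds constexpr access is a constant expression");

// Out of bounds in a constant-evaluated context is rejected, because the
// evaluation would have to call the non-constexpr function:
// constexpr int Bad = A[7];  // error: not a constant expression

// At run time the builtin yields false, so the check is skipped entirely.
int runtime_lookup(std::size_t i) {
  assert(i < 4);  // the sketch itself stays in bounds
  return A[i];
}
```

In device code the same mechanism breaks for a different reason: the empty function is host-only, so the call is rejected even when the index is in bounds, which is what the __device__ overload in this patch addresses.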

Contributor Author

I think I found a related issue (which I think you've already worked on): https://bugs.llvm.org/show_bug.cgi?id=50383

Member

Yeah, the standard libraries were never intended to be used on a GPU. It's surprising that they work as well as they do with relatively few hacks on our side.

@jmmartinez force-pushed the glibcxx_assert branch 2 times, most recently from b433add to 2136a05, April 29, 2025 14:48

github-actions bot commented Apr 29, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@jmmartinez
Contributor Author

I've added a verbose version of the assertion that uses __builtin_printf (I could not figure out a way to correctly emit something to stderr from device code in a generic way without including cstdio).

#ifdef _GLIBCXX_VERBOSE_ASSERT
__attribute__((device, noreturn)) inline void
__glibcxx_assert_fail(const char *file, int line, const char *function,
                      const char *condition) noexcept {
  if (file && function && condition)
    __builtin_printf("%s:%d: %s: Assertion '%s' failed.\n", file, line,
                     function, condition);
  else if (function)
    __builtin_printf("%s: Undefined behavior detected.\n", function);
  __builtin_abort();
}
#endif

An example from our test with an out-of-bounds access to a std::array (the first line is from the assert, the rest comes from the exception raised):

...
/home/juamarti/swdev/518041/_gcc/_install/lib/gcc/x86_64-pc-linux-gnu/15.0.1/../../../../include/c++/15.0.1/array:210: reference std::array<float, 16>::operator[](size_type) [_Tp = float, _Nm = 16]: Assertion '__n < this->size()' failed.
Kernel Name: _Z15vectoradd_floatPfPKfS1_
VGPU=0x5bae1d0e41e0 SWq=0x7929b05da000, HWq=0x792897b00000, id=1
        Dispatch Header = 0xb02 (type=2, barrier=1, acquire=1, release=1), setup=0
        grid=[65536, 1, 1], workgroup=[16, 1, 1]
        private_seg_size=1920, group_seg_size=0
        kernel_obj=0x7929aea90ac0, kernarg_address=0x0x792895d00000
        completion_signal=0x0, correlation_id=0
        rptr=4, wptr=6
:0:rocdevice.cpp            :3620: 8952765750346 us:  Callback: Queue 0x792897b00000 aborting with error : HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception. code: 0x1016
Aborted (core dumped)
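
To make the two message formats easier to see, here is a host-only sketch that mirrors the branch logic of the verbose overload quoted above. The helper assert_message is hypothetical and not part of the patch; it returns the text instead of calling __builtin_printf and __builtin_abort, so each branch can be inspected without terminating the process.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical host-side helper: same message selection as the verbose
// __glibcxx_assert_fail() overload, but the text is returned to the caller.
inline std::string assert_message(const char *file, int line,
                                  const char *function,
                                  const char *condition) {
  char buf[512];
  buf[0] = '\0';  // stays empty when neither branch applies
  if (file && function && condition)
    std::snprintf(buf, sizeof(buf), "%s:%d: %s: Assertion '%s' failed.\n",
                  file, line, function, condition);
  else if (function)
    std::snprintf(buf, sizeof(buf), "%s: Undefined behavior detected.\n",
                  function);
  return std::string(buf);
}

// Small self-check: with no function name, no message is produced.
inline void self_check() {
  assert(assert_message(nullptr, 0, nullptr, nullptr).empty());
}
```

With all four arguments present the result has the same shape as the first line of the log above; with only a function name it falls back to the shorter "Undefined behavior detected." form.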

@jmmartinez jmmartinez requested a review from Artem-B April 29, 2025 14:57
Member

@Artem-B Artem-B left a comment

LGTM in principle.

Now the question is -- how do we test it? There are multiple libstdc++ library versions in the wild and we must not break any of them. We do have some testing on CUDA test bots (which I've just discovered to be silently broken for a while now), but they do not cover recent libstdc++ versions.
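
One small building block for that kind of testing, offered only as a sketch: a test can report which standard library it was actually compiled against, so results from different bots can be told apart. The function name stdlib_description is hypothetical; _GLIBCXX_RELEASE, __GLIBCXX__ and _LIBCPP_VERSION are macros the libraries define themselves, and this sketch assumes _GLIBCXX_RELEASE is absent on older libstdc++ releases.

```cpp
#include <cassert>
#include <string>  // any standard header pulls in the library's config macros

// Hypothetical helper: describe the standard library seen at compile time.
inline std::string stdlib_description() {
#if defined(_GLIBCXX_RELEASE)
  // Major release number of libstdc++ (for example 15).
  return "libstdc++ " + std::to_string(_GLIBCXX_RELEASE);
#elif defined(__GLIBCXX__)
  // Older libstdc++: only the release datestamp is available.
  return "libstdc++ datestamp " + std::to_string(__GLIBCXX__);
#elif defined(_LIBCPP_VERSION)
  return "libc++ " + std::to_string(_LIBCPP_VERSION);
#else
  return "unknown standard library";
#endif
}

// Sanity check a test could run before exercising the wrapper header.
inline void check_description_available() {
  assert(!stdlib_description().empty());
}
```

This only identifies the library in use; it does not by itself exercise the wrapper header against each version.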
