RFC: First approach to add target specific intrinsics for gfx90a targets #1796

sbalint98 · 2025-05-02T15:29:34Z

This MR is an initial, experimental approach to adding target-specific SSCP builtins. In particular, the HIP unsafe atomics are exposed through the hipsycl::sycl::detail::__acpp_unsafe_atomic_fetch_add function. It can be used by calling into AdaptiveCpp's detail namespace as follows:

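q.parallel_for(a.size(), [=](sycl::id<1> idx){
    constexpr auto global_address_space = hipsycl::sycl::access::address_space::global_space;
    constexpr auto global_memory_scope = hipsycl::sycl::memory_scope::device;
    // Assumption: relaxed_memory_order refers to the relaxed SYCL memory order.
    constexpr auto relaxed_memory_order = hipsycl::sycl::memory_order::relaxed;
    // Every work item atomically adds 4.5f to the first element of dev_a.
    hipsycl::sycl::detail::__acpp_unsafe_atomic_fetch_add<global_address_space>(&dev_a[0], 4.5f, relaxed_memory_order, global_memory_scope);
});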
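
For context on what the unsafe variant is meant to avoid: when a target has no native floating-point atomic add, a fetch-add on a float is commonly emulated with a compare-exchange loop. The following is a minimal host-side sketch of that fallback in standard C++, not AdaptiveCpp code. The names cas_fetch_add and accumulate_demo are hypothetical and only serve as illustration; it is an assumption that the hardware fadd builtin replaces a loop of this shape.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: emulates a floating-point fetch-add with a
// compare-exchange loop. Returns the value observed before the addition.
inline float cas_fetch_add(std::atomic<float>& target, float value) {
  float expected = target.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak refreshes `expected` with the current
  // value, so the sum is recomputed on the next iteration.
  while (!target.compare_exchange_weak(expected, expected + value,
                                       std::memory_order_relaxed,
                                       std::memory_order_relaxed)) {
  }
  return expected;
}

// Hypothetical demo: adds `increment` to a zero-initialized accumulator
// `count` times and returns the final value.
inline float accumulate_demo(int count, float increment) {
  assert(count >= 0);
  std::atomic<float> sum{0.0f};
  for (int i = 0; i < count; ++i) {
    cas_fetch_add(sum, increment);
  }
  return sum.load(std::memory_order_relaxed);
}
```

Under contention the loop may retry many times, which is presumably why a single hardware instruction is attractive on targets that provide one.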

illuhad · 2025-05-05T15:34:22Z

Why do we need a new bitcode file? Could we not just implement the unsafe atomic add in the existing one with some JIT reflection?

sbalint98 · 2025-05-06T09:58:38Z

Unfortunately, clang chokes on these builtins if no appropriate -mcpu is specified.

Compiling without specifying the gfx90a target architecture results in the following error:

/home/soproni/Projects/AdaptiveCpp/src/libkernel/sscp/amdgpu/atomic_gfx90a.cpp:18:10: error: '__builtin_amdgcn_global_atomic_fadd_f64' needs target feature gfx90a-insts
   18 |   return __builtin_amdgcn_global_atomic_fadd_f64(ptr, x);
      |

As far as I can tell, this is due to the frontend checking the sub-target compatibility of the builtin here:
https://github.com/intel/llvm/blob/sycl/clang/lib/CodeGen/CodeGenFunction.cpp#L3190

There are some exceptions made when compiling with --hipstdpar; however, passing this flag when compiling the bitcode library results in an error about amdgcn not being a valid target for host compilation. Even after adding -Xclang -fcuda-is-device to the compilation arguments, the same issue persists. I am not sure what --hipstdpar changes that results in this new behavior. Do you think it makes sense to dig further into the LLVM source to figure out whether we could trick the frontend into emitting IR for these builtins?

Additionally, the --hipstdpar flag was only merged into LLVM 18.1.

RFC: First version to add target specific intriniscs for gfx90a targets

91723ae
