[HLSL] Run finalize linkage pass for all targets #134260

Open
wants to merge 1 commit into
base: main

Conversation

s-perron
Contributor

@s-perron s-perron commented Apr 3, 2025

HLSL has three levels of visibility for functions. See section 3.6 of
the [HLSL spec](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf)
for details.

1. Functions marked `static` have internal linkage. These functions are
   marked as internal in Clang.
2. Functions marked as `export` have program linkage. Clang
   marks these as external and adds the `hlsl_export` attribute.
3. Functions that are not qualified in any way have external linkage.
   Clang marks these as external without adding the `hlsl_export`
   attribute.
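
The three cases above can be summarized as a small lookup. This is a minimal sketch for illustration only: `HLSLQualifier`, `IRLinkage`, `EmittedFunction`, and `emitFor` are hypothetical names, not Clang's real data structures or API.

```cpp
// Hypothetical model types; these are NOT Clang's real classes.
enum class HLSLQualifier { Static, Export, None };
enum class IRLinkage { Internal, External };

struct EmittedFunction {
  IRLinkage linkage;      // linkage Clang gives the function in the IR
  bool hasHLSLExportAttr; // whether the hlsl_export attribute is attached
};

// Mirrors the three visibility levels listed in the description.
inline EmittedFunction emitFor(HLSLQualifier q) {
  switch (q) {
  case HLSLQualifier::Static: // internal linkage
    return {IRLinkage::Internal, false};
  case HLSLQualifier::Export: // program linkage
    return {IRLinkage::External, true};
  case HLSLQualifier::None:   // external linkage
    return {IRLinkage::External, false};
  }
  return {IRLinkage::External, false}; // not reached; silences warnings
}
```

Note that cases 2 and 3 differ only in the attribute, which is why a consumer of the IR would otherwise have to inspect `hlsl_export` to tell them apart.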

Linking translation units into programs should happen before entering
the backend. Individual backends should not be concerned with
HLSL-specific linking rules.

If that is correct, then the linkage of the functions should be modified
so that backends do not need to be aware of `hlsl_export`.

The DXILFinalizeLinkage pass in the DirectX backend does this already.
This PR moves DXILFinalizeLinkage out of the DirectX backend into
`llvm/lib/Transforms/HLSL` and has Clang run it for all backends.
The pass is also renamed to HLSLFinalizeLinkage.
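
As a rough illustration of what "finalizing" linkage means here, the sketch below models a module as a plain vector of functions. It is not the LLVM pass and does not use the LLVM API. It assumes that the rule is "an external function without `hlsl_export` becomes internal", and that shader entry points must stay external; both are assumptions made for the example, as are the names `Function`, `Linkage`, and `finalizeLinkage`. Any dead-code cleanup is left out.

```cpp
#include <string>
#include <vector>

enum class Linkage { Internal, External };

// Hypothetical stand-in for an IR function.
struct Function {
  std::string name;
  Linkage linkage;
  bool hlslExport;   // carries the hlsl_export attribute
  bool isEntryPoint; // assumption: entry points must remain external
};

// Demotes every external function that is neither exported nor an
// entry point to internal linkage. Returns true if anything changed.
bool finalizeLinkage(std::vector<Function> &module) {
  bool changed = false;
  for (Function &f : module) {
    if (f.hlslExport || f.isEntryPoint)
      continue; // must stay visible outside the module
    if (f.linkage == Linkage::External) {
      f.linkage = Linkage::Internal;
      changed = true;
    }
  }
  return changed;
}
```

After such a step, linkage alone tells a backend which functions are visible outside the module. Under this model an unqualified function loses its external linkage, which would also explain why the tests in the diff below add `export` to the functions they check.
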
@s-perron s-perron requested a review from Keenuts April 3, 2025 15:27
@llvmbot llvmbot added the clang, clang:codegen, backend:DirectX, HLSL, and llvm:transforms labels Apr 3, 2025
@s-perron s-perron requested review from bogner and llvm-beanz April 3, 2025 15:27
@llvmbot
Member

llvmbot commented Apr 3, 2025

@llvm/pr-subscribers-hlsl
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-backend-directx

Author: Steven Perron (s-perron)

Patch is 47.62 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/134260.diff

24 Files Affected:

  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+6)
  • (modified) clang/lib/CodeGen/CMakeLists.txt (+1)
  • (modified) clang/test/CodeGenHLSL/builtins/D3DCOLORtoUBYTE4.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/builtins/and.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/asfloat.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/asint.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/asint16.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/asuint.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/asuint16.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/clip.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/builtins/distance.hlsl (+8-8)
  • (modified) clang/test/CodeGenHLSL/builtins/fmod.hlsl (+8-8)
  • (modified) clang/test/CodeGenHLSL/builtins/hlsl_resource_t.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/length.hlsl (+8-8)
  • (modified) clang/test/CodeGenHLSL/builtins/reflect.hlsl (+8-8)
  • (modified) clang/test/CodeGenHLSL/builtins/smoothstep.hlsl (+8-8)
  • (modified) clang/test/CodeGenHLSL/builtins/splitdouble.hlsl (+5-5)
  • (modified) clang/test/CodeGenHLSL/inline-functions.hlsl (+1-1)
  • (renamed) llvm/include/llvm/Transforms/HLSL/HLSLFinalizeLinkage.h (+10-10)
  • (modified) llvm/lib/Target/DirectX/CMakeLists.txt (-1)
  • (modified) llvm/lib/Target/DirectX/DirectXTargetMachine.cpp (-2)
  • (modified) llvm/lib/Transforms/CMakeLists.txt (+1)
  • (added) llvm/lib/Transforms/HLSL/CMakeLists.txt (+18)
  • (renamed) llvm/lib/Transforms/HLSL/HLSLFinalizeLinkage.cpp (+4-21)
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 7557cb8408921..4967c1f11dd2e 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -56,6 +56,7 @@
 #include "llvm/Target/TargetOptions.h"
 #include "llvm/TargetParser/SubtargetFeature.h"
 #include "llvm/TargetParser/Triple.h"
+#include "llvm/Transforms/HLSL/HLSLFinalizeLinkage.h"
 #include "llvm/Transforms/HipStdPar/HipStdPar.h"
 #include "llvm/Transforms/IPO/EmbedBitcodePass.h"
 #include "llvm/Transforms/IPO/LowerTypeTests.h"
@@ -1115,6 +1116,11 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
   if (CodeGenOpts.LinkBitcodePostopt)
     MPM.addPass(LinkInModulesPass(BC));
 
+  if (LangOpts.HLSL && !CodeGenOpts.DisableLLVMPasses) {
+    // Passes required by HLSL for every backend.
+    MPM.addPass(HLSLFinalizeLinkage());
+  }
+
   // Add a verifier pass if requested. We don't have to do this if the action
   // requires code generation because there will already be a verifier pass in
   // the code-generation pipeline.
diff --git a/clang/lib/CodeGen/CMakeLists.txt b/clang/lib/CodeGen/CMakeLists.txt
index ebe2fbd7db295..65a23291029b4 100644
--- a/clang/lib/CodeGen/CMakeLists.txt
+++ b/clang/lib/CodeGen/CMakeLists.txt
@@ -14,6 +14,7 @@ set(LLVM_LINK_COMPONENTS
   FrontendOpenMP
   FrontendOffloading
   HIPStdPar
+  HLSL
   IPO
   IRPrinter
   IRReader
diff --git a/clang/test/CodeGenHLSL/builtins/D3DCOLORtoUBYTE4.hlsl b/clang/test/CodeGenHLSL/builtins/D3DCOLORtoUBYTE4.hlsl
index 990f0aa910f30..d376efa9307db 100644
--- a/clang/test/CodeGenHLSL/builtins/D3DCOLORtoUBYTE4.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/D3DCOLORtoUBYTE4.hlsl
@@ -3,7 +3,7 @@
 // RUN:   -emit-llvm -O1 -o - | FileCheck %s --check-prefixes=CHECK
 
 // CHECK-LABEL: D3DCOLORtoUBYTE4
-int4 test_D3DCOLORtoUBYTE4(float4 p1) {
+export int4 test_D3DCOLORtoUBYTE4(float4 p1) {
   // CHECK: %[[SCALED:.*]] = fmul [[FMFLAGS:.*]][[FLOAT_TYPE:<4 x float>]] %{{.*}}, splat (float 0x406FE01000000000)
   // CHECK: %[[CONVERTED:.*]] = fptoui [[FLOAT_TYPE]] %[[SCALED]] to [[INT_TYPE:<4 x i32>]]
   // CHECK: %[[SHUFFLED:.*]] = shufflevector [[INT_TYPE]] %[[CONVERTED]], [[INT_TYPE]] poison, <4 x i32> <i32 2, i32 1, i32 0, i32 3>
diff --git a/clang/test/CodeGenHLSL/builtins/and.hlsl b/clang/test/CodeGenHLSL/builtins/and.hlsl
index b77889cd9ae70..7c008f8eee469 100644
--- a/clang/test/CodeGenHLSL/builtins/and.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/and.hlsl
@@ -9,7 +9,7 @@
 // CHECK-NEXT:    [[HLSL_AND:%.*]] = and i1 [[X]], [[Y]]
 // CHECK-NEXT:    ret i1 [[HLSL_AND]]
 //
-bool test_and_scalar(bool x, bool y) {
+export bool test_and_scalar(bool x, bool y) {
   return and(x, y);
 }
 
@@ -19,7 +19,7 @@ bool test_and_scalar(bool x, bool y) {
 // CHECK-NEXT:    [[HLSL_AND:%.*]] = and <2 x i1> [[X]], [[Y]]
 // CHECK-NEXT:    ret <2 x i1> [[HLSL_AND]]
 //
-bool2 test_and_bool2(bool2 x, bool2 y) {
+export bool2 test_and_bool2(bool2 x, bool2 y) {
   return and(x, y);
 }
 
@@ -29,7 +29,7 @@ bool2 test_and_bool2(bool2 x, bool2 y) {
 // CHECK-NEXT:    [[HLSL_AND:%.*]] = and <3 x i1> [[X]], [[Y]]
 // CHECK-NEXT:    ret <3 x i1> [[HLSL_AND]]
 //
-bool3 test_and_bool3(bool3 x, bool3 y) {
+export bool3 test_and_bool3(bool3 x, bool3 y) {
   return and(x, y);
 }
 
@@ -39,7 +39,7 @@ bool3 test_and_bool3(bool3 x, bool3 y) {
 // CHECK-NEXT:    [[HLSL_AND:%.*]] = and <4 x i1> [[X]], [[Y]]
 // CHECK-NEXT:    ret <4 x i1> [[HLSL_AND]]
 //
-bool4 test_and_bool4(bool4 x, bool4 y) {
+export bool4 test_and_bool4(bool4 x, bool4 y) {
   return and(x, y);
 }
 
@@ -51,7 +51,7 @@ bool4 test_and_bool4(bool4 x, bool4 y) {
 // CHECK-NEXT:    [[HLSL_AND:%.*]] = and <4 x i1> [[TOBOOL]], [[TOBOOL1]]
 // CHECK-NEXT:    ret <4 x i1> [[HLSL_AND]]
 //
-bool4 test_and_int4(int4 x, int4 y) {
+export bool4 test_and_int4(int4 x, int4 y) {
   return and(x, y);
 }
 
@@ -63,6 +63,6 @@ bool4 test_and_int4(int4 x, int4 y) {
 // CHECK-NEXT:    [[HLSL_AND:%.*]] = and <4 x i1> [[TOBOOL]], [[TOBOOL1]]
 // CHECK-NEXT:    ret <4 x i1> [[HLSL_AND]]
 //
-bool4 test_and_float4(float4 x, float4 y) {
+export bool4 test_and_float4(float4 x, float4 y) {
   return and(x, y);
 }
diff --git a/clang/test/CodeGenHLSL/builtins/asfloat.hlsl b/clang/test/CodeGenHLSL/builtins/asfloat.hlsl
index 59fc15fa60b1e..f991eb8f78e61 100644
--- a/clang/test/CodeGenHLSL/builtins/asfloat.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/asfloat.hlsl
@@ -2,39 +2,39 @@
 
 // CHECK: define {{.*}}test_uint{{.*}}(i32 {{.*}} [[VAL:%.*]]){{.*}} 
 // CHECK: bitcast i32 [[VAL]] to float
-float test_uint(uint p0) {
+export float test_uint(uint p0) {
   return asfloat(p0);
 }
 
 // CHECK: define {{.*}}test_int{{.*}}(i32 {{.*}} [[VAL:%.*]]){{.*}} 
 // CHECK: bitcast i32 [[VAL]] to float
-float test_int(int p0) {
+export float test_int(int p0) {
   return asfloat(p0);
 }
 
 // CHECK: define {{.*}}test_float{{.*}}(float {{.*}} [[VAL:%.*]]){{.*}} 
 // CHECK-NOT: bitcast
 // CHECK: ret float [[VAL]]
-float test_float(float p0) {
+export float test_float(float p0) {
   return asfloat(p0);
 }
 
 // CHECK: define {{.*}}test_vector_uint{{.*}}(<4 x i32> {{.*}} [[VAL:%.*]]){{.*}} 
 // CHECK: bitcast <4 x i32> [[VAL]] to <4 x float>
 
-float4 test_vector_uint(uint4 p0) {
+export float4 test_vector_uint(uint4 p0) {
   return asfloat(p0);
 }
 
 // CHECK: define {{.*}}test_vector_int{{.*}}(<4 x i32> {{.*}} [[VAL:%.*]]){{.*}} 
 // CHECK: bitcast <4 x i32> [[VAL]] to <4 x float>
-float4 test_vector_int(int4 p0) {
+export float4 test_vector_int(int4 p0) {
   return asfloat(p0);
 }
 
 // CHECK: define {{.*}}test_vector_float{{.*}}(<4 x float> {{.*}} [[VAL:%.*]]){{.*}} 
 // CHECK-NOT: bitcast
 // CHECK: ret <4 x float> [[VAL]]
-float4 test_vector_float(float4 p0) {
+export float4 test_vector_float(float4 p0) {
   return asfloat(p0);
 }
diff --git a/clang/test/CodeGenHLSL/builtins/asint.hlsl b/clang/test/CodeGenHLSL/builtins/asint.hlsl
index e1d80df5015c9..a5d3320a3f580 100644
--- a/clang/test/CodeGenHLSL/builtins/asint.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/asint.hlsl
@@ -3,39 +3,39 @@
 // CHECK: define {{.*}}test_int{{.*}}(i32 {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK-NOT: bitcast
 // CHECK: ret i32 [[VAL]]
-int test_int(int p0) {
+export int test_int(int p0) {
   return asint(p0);
 }
 
 // CHECK: define {{.*}}test_uint{{.*}}(i32 {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK-NOT: bitcast
 // CHECK: ret i32 [[VAL]]
-int test_uint(uint p0) {
+export int test_uint(uint p0) {
   return asint(p0);
 }
 
 // CHECK: define {{.*}}test_float{{.*}}(float {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK: bitcast float [[VAL]] to i32
-int test_float(float p0) {
+export int test_float(float p0) {
   return asint(p0);
 }
 
 // CHECK: define {{.*}}test_vector_int{{.*}}(<4 x i32> {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK-NOT: bitcast
 // CHECK: ret <4 x i32> [[VAL]]
-int4 test_vector_int(int4 p0) {
+export int4 test_vector_int(int4 p0) {
   return asint(p0);
 }
 
 // CHECK: define {{.*}}test_vector_uint{{.*}}(<4 x i32> {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK-NOT: bitcast
 // CHECK: ret <4 x i32> [[VAL]]
-int4 test_vector_uint(uint4 p0) {
+export int4 test_vector_uint(uint4 p0) {
   return asint(p0);
 }
 
 // CHECK: define {{.*}}test_vector_float{{.*}}(<4 x float> {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK: bitcast <4 x float> [[VAL]] to <4 x i32>
-int4 test_vector_float(float4 p0) {
+export int4 test_vector_float(float4 p0) {
   return asint(p0);
 }
diff --git a/clang/test/CodeGenHLSL/builtins/asint16.hlsl b/clang/test/CodeGenHLSL/builtins/asint16.hlsl
index 1d35125bfb8cc..63f99ec5b53ad 100644
--- a/clang/test/CodeGenHLSL/builtins/asint16.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/asint16.hlsl
@@ -5,7 +5,7 @@
 //CHECK-NOT: bitcast
 //CHECK: entry:
 //CHECK-NEXT: ret i16 [[VAL]]
-int16_t test_int(int16_t p0)
+export int16_t test_int(int16_t p0)
 {
     return asint16(p0);
 }
@@ -15,7 +15,7 @@ int16_t test_int(int16_t p0)
 //CHECK-NOT:bitcast
 //CHECK: entry:
 //CHECK-NEXT: ret i16 [[VAL]]
-int16_t test_uint(uint16_t p0)
+export int16_t test_uint(uint16_t p0)
 {
     return asint16(p0);
 }
@@ -24,7 +24,7 @@ int16_t test_uint(uint16_t p0)
 //CHECK-SAME: {{.*}}(half {{.*}} [[VAL:%.*]]){{.*}}
 //CHECK: [[RES:%.*]] = bitcast half [[VAL]] to i16
 //CHECK-NEXT : ret i16 [[RES]]
-int16_t test_half(half p0)
+export int16_t test_half(half p0)
 {
     return asint16(p0);
 }
@@ -34,7 +34,7 @@ int16_t test_half(half p0)
 //CHECK-NOT: bitcast
 //CHECK: entry:
 //CHECK-NEXT: ret <4 x i16> [[VAL]]
-int16_t4 test_vector_int(int16_t4 p0)
+export int16_t4 test_vector_int(int16_t4 p0)
 {
     return asint16(p0);
 }
@@ -44,7 +44,7 @@ int16_t4 test_vector_int(int16_t4 p0)
 //CHECK-NOT: bitcast
 //CHECK-NEXT: entry:
 //CHECK-NEXT: ret <4 x i16> [[VAL]]
-int16_t4 test_vector_uint(uint16_t4 p0)
+export int16_t4 test_vector_uint(uint16_t4 p0)
 {
     return asint16(p0);
 }
@@ -53,7 +53,7 @@ int16_t4 test_vector_uint(uint16_t4 p0)
 //CHECK-SAME: {{.*}}(<4 x half> {{.*}} [[VAL:%.*]]){{.*}}
 //CHECK: [[RES:%.*]] = bitcast <4 x half> [[VAL]] to <4 x i16>
 //CHECK-NEXT: ret <4 x i16> [[RES]]
-int16_t4 fn(half4 p1)
+export int16_t4 fn(half4 p1)
 {
     return asint16(p1);
 }
diff --git a/clang/test/CodeGenHLSL/builtins/asuint.hlsl b/clang/test/CodeGenHLSL/builtins/asuint.hlsl
index 252a434ccce0d..a56271e4ad033 100644
--- a/clang/test/CodeGenHLSL/builtins/asuint.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/asuint.hlsl
@@ -3,39 +3,39 @@
 // CHECK: define {{.*}}test_uint{{.*}}(i32 {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK-NOT: bitcast
 // CHECK: ret i32 [[VAL]]
-uint test_uint(uint p0) {
+export uint test_uint(uint p0) {
   return asuint(p0);
 }
 
 // CHECK: define {{.*}}test_int{{.*}}(i32 {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK-NOT: bitcast
 // CHECK: ret i32 [[VAL]]
-uint test_int(int p0) {
+export uint test_int(int p0) {
   return asuint(p0);
 }
 
 // CHECK: define {{.*}}test_float{{.*}}(float {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK: bitcast float [[VAL]] to i32
-uint test_float(float p0) {
+export uint test_float(float p0) {
   return asuint(p0);
 }
 
 // CHECK: define {{.*}}test_vector_uint{{.*}}(<4 x i32> {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK-NOT: bitcast
 // CHECK: ret <4 x i32> [[VAL]]
-uint4 test_vector_uint(uint4 p0) {
+export uint4 test_vector_uint(uint4 p0) {
   return asuint(p0);
 }
 
 // CHECK: define {{.*}}test_vector_int{{.*}}(<4 x i32> {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK-NOT: bitcast
 // CHECK: ret <4 x i32> [[VAL]]
-uint4 test_vector_int(int4 p0) {
+export uint4 test_vector_int(int4 p0) {
   return asuint(p0);
 }
 
 // CHECK: define {{.*}}test_vector_float{{.*}}(<4 x float> {{.*}} [[VAL:%.*]]){{.*}}
 // CHECK: bitcast <4 x float> [[VAL]] to <4 x i32>
-uint4 test_vector_float(float4 p0) {
+export uint4 test_vector_float(float4 p0) {
   return asuint(p0);
 }
diff --git a/clang/test/CodeGenHLSL/builtins/asuint16.hlsl b/clang/test/CodeGenHLSL/builtins/asuint16.hlsl
index 3ed7de9dffbe5..8b698d6a983e1 100644
--- a/clang/test/CodeGenHLSL/builtins/asuint16.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/asuint16.hlsl
@@ -5,7 +5,7 @@
 //CHECK-NOT: bitcast
 //CHECK: entry:
 //CHECK: ret i16 [[VAL]]
-uint16_t test_int(int16_t p0)
+export uint16_t test_int(int16_t p0)
 {
     return asuint16(p0);
 }
@@ -15,7 +15,7 @@ uint16_t test_int(int16_t p0)
 //CHECK-NOT: bitcast
 //CHECK: entry:
 //CHECK-NEXT: ret i16 [[VAL]]
-uint16_t test_uint(uint16_t p0)
+export uint16_t test_uint(uint16_t p0)
 {
     return asuint16(p0);
 }
@@ -24,7 +24,7 @@ uint16_t test_uint(uint16_t p0)
 //CHECK-SAME: {{.*}}(half {{.*}} [[VAL:%.*]]){{.*}}
 //CHECK: [[RES:%.*]] = bitcast half [[VAL]] to i16
 //CHECK-NEXT: ret i16 [[RES]]
-uint16_t test_half(half p0)
+export uint16_t test_half(half p0)
 {
     return asuint16(p0);
 }
@@ -34,7 +34,7 @@ uint16_t test_half(half p0)
 //CHECK-NOT: bitcast
 //CHECK: entry:
 //CHECK-NEXT: ret <4 x i16> [[VAL]]
-uint16_t4 test_vector_int(int16_t4 p0)
+export uint16_t4 test_vector_int(int16_t4 p0)
 {
     return asuint16(p0);
 }
@@ -44,7 +44,7 @@ uint16_t4 test_vector_int(int16_t4 p0)
 //CHECK-NOT: bitcast
 //CHECK: entry:
 //CHECK-NEXT: ret <4 x i16> [[VAL]]
-uint16_t4 test_vector_uint(uint16_t4 p0)
+export uint16_t4 test_vector_uint(uint16_t4 p0)
 {
     return asuint16(p0);
 }
@@ -53,7 +53,7 @@ uint16_t4 test_vector_uint(uint16_t4 p0)
 //CHECK-SAME: {{.*}}(<4 x half> {{.*}} [[VAL:%.*]]){{.*}}
 //CHECK: [[RES:%.*]] = bitcast <4 x half> [[VAL]] to <4 x i16>
 //CHECK-NEXT: ret <4 x i16> [[RES]]
-uint16_t4 fn(half4 p1)
+export uint16_t4 fn(half4 p1)
 {
     return asuint16(p1);
 }
diff --git a/clang/test/CodeGenHLSL/builtins/clip.hlsl b/clang/test/CodeGenHLSL/builtins/clip.hlsl
index 5a1753766a8a1..155a38de0a35f 100644
--- a/clang/test/CodeGenHLSL/builtins/clip.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/clip.hlsl
@@ -2,7 +2,7 @@
 // RUN: %clang_cc1 -finclude-default-header -triple spirv-vulkan-pixel %s -fnative-half-type -emit-llvm -o - | FileCheck %s --check-prefix=SPIRV
 
 
-void test_scalar(float Buf) {
+export void test_scalar(float Buf) {
   // CHECK:      define void @{{.*}}test_scalar{{.*}}(float {{.*}} [[VALP:%.*]])
   // CHECK:      [[LOAD:%.*]] = load float, ptr [[VALP]].addr, align 4
   // CHECK-NEXT: [[FCMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn olt float [[LOAD]], 0.000000e+00
@@ -20,7 +20,7 @@ void test_scalar(float Buf) {
   clip(Buf);
 }
 
-void test_vector4(float4 Buf) {
+export void test_vector4(float4 Buf) {
   // CHECK:      define void @{{.*}}test_vector{{.*}}(<4 x float> {{.*}} [[VALP:%.*]])
   // CHECK:      [[LOAD:%.*]] = load <4 x float>, ptr [[VALP]].addr, align 16
   // CHECK-NEXT: [[FCMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn olt <4 x float> [[LOAD]], zeroinitializer
diff --git a/clang/test/CodeGenHLSL/builtins/distance.hlsl b/clang/test/CodeGenHLSL/builtins/distance.hlsl
index e830903261c8c..437fbec095126 100644
--- a/clang/test/CodeGenHLSL/builtins/distance.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/distance.hlsl
@@ -20,7 +20,7 @@
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.fabs.f16(half [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[ELT_ABS_I]]
 //
-half test_distance_half(half X, half Y) { return distance(X, Y); }
+export half test_distance_half(half X, half Y) { return distance(X, Y); }
 
 // CHECK-LABEL: define noundef nofpclass(nan inf) half @_Z19test_distance_half2Dv2_DhS_(
 // CHECK-SAME: <2 x half> noundef nofpclass(nan inf) [[X:%.*]], <2 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
@@ -37,7 +37,7 @@ half test_distance_half(half X, half Y) { return distance(X, Y); }
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v2f16(<2 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
-half test_distance_half2(half2 X, half2 Y) { return distance(X, Y); }
+export half test_distance_half2(half2 X, half2 Y) { return distance(X, Y); }
 
 // CHECK-LABEL: define noundef nofpclass(nan inf) half @_Z19test_distance_half3Dv3_DhS_(
 // CHECK-SAME: <3 x half> noundef nofpclass(nan inf) [[X:%.*]], <3 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
@@ -54,7 +54,7 @@ half test_distance_half2(half2 X, half2 Y) { return distance(X, Y); }
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v3f16(<3 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
-half test_distance_half3(half3 X, half3 Y) { return distance(X, Y); }
+export half test_distance_half3(half3 X, half3 Y) { return distance(X, Y); }
 
 // CHECK-LABEL: define noundef nofpclass(nan inf) half @_Z19test_distance_half4Dv4_DhS_(
 // CHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[X:%.*]], <4 x half> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
@@ -71,7 +71,7 @@ half test_distance_half3(half3 X, half3 Y) { return distance(X, Y); }
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef half @llvm.spv.length.v4f16(<4 x half> [[SUB_I]])
 // SPVCHECK-NEXT:    ret half [[SPV_LENGTH_I]]
 //
-half test_distance_half4(half4 X, half4 Y) { return distance(X, Y); }
+export half test_distance_half4(half4 X, half4 Y) { return distance(X, Y); }
 
 // CHECK-LABEL: define noundef nofpclass(nan inf) float @_Z19test_distance_floatff(
 // CHECK-SAME: float noundef nofpclass(nan inf) [[X:%.*]], float noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
@@ -87,7 +87,7 @@ half test_distance_half4(half4 X, half4 Y) { return distance(X, Y); }
 // SPVCHECK-NEXT:    [[ELT_ABS_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.fabs.f32(float [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[ELT_ABS_I]]
 //
-float test_distance_float(float X, float Y) { return distance(X, Y); }
+export float test_distance_float(float X, float Y) { return distance(X, Y); }
 
 // CHECK-LABEL: define noundef nofpclass(nan inf) float @_Z20test_distance_float2Dv2_fS_(
 // CHECK-SAME: <2 x float> noundef nofpclass(nan inf) [[X:%.*]], <2 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
@@ -104,7 +104,7 @@ float test_distance_float(float X, float Y) { return distance(X, Y); }
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v2f32(<2 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
-float test_distance_float2(float2 X, float2 Y) { return distance(X, Y); }
+export float test_distance_float2(float2 X, float2 Y) { return distance(X, Y); }
 
 // CHECK-LABEL: define noundef nofpclass(nan inf) float @_Z20test_distance_float3Dv3_fS_(
 // CHECK-SAME: <3 x float> noundef nofpclass(nan inf) [[X:%.*]], <3 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
@@ -121,7 +121,7 @@ float test_distance_float2(float2 X, float2 Y) { return distance(X, Y); }
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v3f32(<3 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
-float test_distance_float3(float3 X, float3 Y) { return distance(X, Y); }
+export float test_distance_float3(float3 X, float3 Y) { return distance(X, Y); }
 
 // CHECK-LABEL: define noundef nofpclass(nan inf) float @_Z20test_distance_float4Dv4_fS_(
 // CHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[X:%.*]], <4 x float> noundef nofpclass(nan inf) [[Y:%.*]]) local_unnamed_addr #[[ATTR0]] {
@@ -138,4 +138,4 @@ float test_distance_float3(float3 X, float3 Y) { return distance(X, Y); }
 // SPVCHECK-NEXT:    [[SPV_LENGTH_I:%.*]] = tail call reassoc nnan ninf nsz arcp afn noundef float @llvm.spv.length.v4f32(<4 x float> [[SUB_I]])
 // SPVCHECK-NEXT:    ret float [[SPV_LENGTH_I]]
 //
-float test_distance_float4(float4 X, float4 Y) { return distance(X, Y); }
+export float test_distance_float4(float4 X, float4 Y) { return distance(X, Y); }
diff --git a/clang/test/CodeGenHLSL/builtins/fmod.hlsl b/clang/test/CodeGenHLSL/builtins/fmod.hlsl
index 7ecc5854b3988..76b1ddcd2b50a 100644
--- a/clang/test/CodeGenHLSL/builtins/fmod.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/fmod.hlsl
@@ -47,7 +47,7 @@
 // CHECK: define [[FNATTRS]] [[TYPE]] @
 // CHECK: %fmod.i = frem reassoc nnan ninf nsz arcp afn [[TYPE]]
 // CHECK: ret [[TYPE]] %fmod.i
-half test_fmod_half(half p0, half p1) { return fmod(p0, p1); }
+export half test_fmod_half(half p0, half p1) { return fmod(p0, p1); }
 
 // DXCHECK: define [[FNATTRS]] <2 x [[TYPE]]> @
 // DXCHECK: %div1.i = fdiv reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]> %{{.*}}, %{{.*}}
@@ -61,7 +61,7 @@ half test_fmod_half(half p0, half p1) { return fmod(p0, p1); }
 // CHECK: define [[FNATTRS]] <2 x [[TYPE]]> @
 // CHECK: %fmod.i = frem reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]>
 // CHECK: ret <2 x [[TYPE]]> %fmod.i
-half2 test_fmod_half2(half2 p0, half2 p1) { return fmod(p0, p1); }
+export half2 test_fmod_half2(half2 p0, half2 p1) { return fmod(p0, p1); }
 
 // DXCHECK: define [[FNATTRS]] <3 x [[TYPE]]> @
 // DXCHECK: %div1.i = fdiv reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]> %{{.*}}, %{{.*}}
@@ -75,7 +75,7 @@ half2 test_fmod_half2(half2 p0, half2 p1) { return fmod(p0, p1); }
 // CHECK: define [[FNATTRS]] <3 x [[TYPE]]> @
 // CHECK: %fmod.i = frem reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]>
 // CHECK: ret <3 x [[TYPE]]> %fmod.i
-half3 test_fmod_half3(hal...
[truncated]

Collaborator

@efriedma-quic efriedma-quic left a comment

I'm not sure I understand why we're changing the linkage of functions in an IR pass, instead of just setting the correct linkage in the first place. Is there some form of linker involved? If there is, should the linker fix the linkage?

By convention, passes which are required for correctness are part of the backend, so you can't accidentally skip running them. (This isn't 100%, but we try when possible.) If this needs to be target-independent, you can stick it in the target-independent TargetMachine code that computes the pass pipeline. Adding it in clang means every other frontend needs an equivalent change.

@s-perron
Contributor Author

s-perron commented Apr 8, 2025

I'm following a design that was put in place by others, so I cannot fully answer all of the questions. However, I'll do my best.

> Is there some form of linker involved?

No linker is involved yet, but I believe the long-term plan is to have some type of linker. When that is available, this pass will become part of it.

> By convention, passes which are required for correctness are part of the backend, so you can't accidentally skip running them.

I want it to be part of clang because it is the HLSL language that is causing the problem. I would prefer to keep the backend language agnostic as much as possible.

The SPIR-V backend adds the export linkage decoration to functions with external linkage in LLVM. This is the expected behaviour for OpenCL. However, it is incorrect for HLSL. Without this change, the backend will have to know that the LLVM IR came from HLSL and look for the `hlsl_export` attribute. I don't want to do that.
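
To make the trade-off concrete, here is a minimal sketch of the two options for a backend. It does not reflect the SPIR-V backend's actual code; `needsExportDecoration`, `needsExportDecorationLanguageAware`, and the enums are hypothetical names used only for this illustration.

```cpp
enum class Linkage { Internal, External };
enum class SourceLanguage { OpenCL, HLSL };

// Language-agnostic rule: decorate every function with external linkage.
// This is only right for HLSL if linkage was already finalized.
bool needsExportDecoration(Linkage linkage) {
  return linkage == Linkage::External;
}

// The alternative described above: the backend must know the source
// language and look at the hlsl_export attribute itself.
bool needsExportDecorationLanguageAware(Linkage linkage,
                                        SourceLanguage lang,
                                        bool hasHLSLExportAttr) {
  if (linkage != Linkage::External)
    return false;
  if (lang == SourceLanguage::HLSL)
    return hasHLSLExportAttr; // unqualified HLSL functions are not exported
  return true;
}
```

With the first rule, an unqualified HLSL function that still has external linkage would be decorated as exported. If the frontend pipeline has already made it internal, the same rule gives the intended answer without any HLSL knowledge in the backend.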

@efriedma-quic
Collaborator

The idea that a symbol should be externally visible before linking, and not externally visible afterwards, isn't new: many platforms have some form of symbol visibility. I'm not sure why you don't want to express that in LLVM IR... having an implicit HLSL-specific rule like this just makes it harder to understand what's happening.
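
For readers unfamiliar with the precedent being referred to, the sketch below shows symbol visibility as exposed by GCC and Clang on ELF-style platforms. It is an analogy only and not something proposed in this thread; the function names are made up. A hidden function keeps external linkage, so other translation units in the same linked image can call it, but it is not exported from the resulting shared object. In LLVM IR this appears as the `hidden` visibility style on the definition.

```cpp
// Exported from the shared object: default visibility.
__attribute__((visibility("default"))) int exported_api(int x);

// External linkage, so other translation units in the same image can
// reference it, but hidden from consumers of the shared object.
__attribute__((visibility("hidden"))) int shared_helper(int x) {
  return x + 1;
}

int exported_api(int x) { return shared_helper(x) * 2; }
```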

@s-perron
Contributor Author

> The idea that a symbol should be externally visible before linking, and not externally visible afterwards, isn't new: many platforms have some form of symbol visibility. I'm not sure why you don't want to express that in LLVM IR... having an implicit HLSL-specific rule like this just makes it harder to understand what's happening.

That is something for @llvm-beanz to answer. I'm just working with what was already designed.

@s-perron s-perron closed this Apr 30, 2025
@s-perron s-perron deleted the handle_wrapper branch April 30, 2025 13:53
@s-perron s-perron restored the handle_wrapper branch April 30, 2025 17:01
@s-perron s-perron reopened this Apr 30, 2025
bogner added a commit to bogner/llvm-project that referenced this pull request May 1, 2025
…sics

This moves the responsibility for cleaning up dead intrinsics from
DXILFinalizeLinkage to DXILOpLowering, and moves DXILFinalizeLinkage back to
its pre-llvm#136244 place in the pipeline. Doing this avoids issues with DXIL
passes running on obviously dead code, and makes it more clear what
DXILFinalizeLinkage is really doing.

This also helps with the story for llvm#134260, as cleaning up dead intrinsics
doesn't make sense if this becomes a more generic pass.

Note that test/CodeGen/DirectX/remove-dead-intriniscs.ll already covers most of
the testing here. It'd be nice to have something that catches the regression
from changing the pass ordering but I couldn't come up with anything that
wouldn't be incredibly fragile.

Fixes llvm#138180.
bogner added a commit that referenced this pull request May 2, 2025
…sics (#138199)

This moves the responsibility for cleaning up dead intrinsics from
DXILFinalizeLinkage to DXILOpLowering, and moves DXILFinalizeLinkage
back to its pre-#136244 place in the pipeline. Doing this avoids issues
with DXIL passes running on obviously dead code, and makes it more clear
what DXILFinalizeLinkage is really doing.

This also helps with the story for #134260, as cleaning up dead
intrinsics doesn't make sense if this becomes a more generic pass.

Note that test/CodeGen/DirectX/remove-dead-intriniscs.ll already covers
most of the testing here. It'd be nice to have something that catches
the regression from changing the pass ordering but I couldn't come up
with anything that wouldn't be incredibly fragile.

Fixes #138180.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…sics (llvm#138199)
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…sics (llvm#138199)
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…sics (llvm#138199)
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
…sics (llvm#138199)
Labels
backend:DirectX, clang:codegen, clang, HLSL, llvm:transforms
Projects
Status: Closed
3 participants