Thanks to visit codestin.com
Credit goes to github.com

Skip to content

[clang] Restrict the use of scalar types in vector builtins #119423

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jan 29, 2025

Conversation

frasercrmck
Copy link
Contributor

@frasercrmck frasercrmck commented Dec 10, 2024

This commit restricts the use of scalar types in vector math builtins, particularly the __builtin_elementwise_* builtins.

Previously, small scalar integer types would be promoted to int, as per the usual conversions. This would silently do the wrong thing for certain operations, such as add_sat, popcount, bitreverse, and others. Similarly, since unsigned integer types were promoted to int, something like add_sat(unsigned char, unsigned char) would perform a signed operation.

With this patch, promotable scalar integer types are not promoted to int, and are kept intact. If any of the types differ in the binary and ternary builtins, an error is issued. Similarly an error is issued if builtins are supplied integer types of different signs. Mixing enums of different types in binary/ternary builtins now consistently raises an error in all language modes.

This brings the behaviour surrounding scalar types more in line with that of vector types. No change is made to vector types, which are both not promoted and whose element types must match.

Fixes #84047.

RFC: https://discourse.llvm.org/t/rfc-change-behaviour-of-elementwise-builtins-on-scalar-integer-types/83725

@frasercrmck frasercrmck added the clang:frontend Language frontend issues, e.g. anything involving "Sema" label Dec 10, 2024
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Dec 10, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 10, 2024

@llvm/pr-subscribers-hlsl

@llvm/pr-subscribers-clang

Author: Fraser Cormack (frasercrmck)

Changes

These builtins would unconditionally perform the usual arithmetic conversions on promotable scalar integer arguments. This meant in practice that char and short arguments were promoted to int, and the operation was truncated back down afterwards. This in effect silently replaced a saturating add/sub with a regular add/sub, which is not intuitive (or intended) behaviour.

With this patch, promotable scalar integer types are not promoted to int, but are kept intact. If the types differ, the smaller integer is promoted to the larger one. The signedness of the operation matches the larger integer type.

No change is made to vector types, which are both not promoted and whose element types must match.


Full diff: https://github.com/llvm/llvm-project/pull/119423.diff

2 Files Affected:

  • (modified) clang/lib/Sema/SemaChecking.cpp (+38)
  • (modified) clang/test/CodeGen/builtins-elementwise-math.c (+96-2)
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index a248a6b53b0d06..6107eb854fb95e 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -2765,6 +2765,44 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
   // types only.
   case Builtin::BI__builtin_elementwise_add_sat:
   case Builtin::BI__builtin_elementwise_sub_sat: {
+    if (checkArgCount(TheCall, 2))
+      return ExprError();
+    ExprResult LHS = TheCall->getArg(0);
+    ExprResult RHS = TheCall->getArg(1);
+    QualType LHSType = LHS.get()->getType().getUnqualifiedType();
+    QualType RHSType = RHS.get()->getType().getUnqualifiedType();
+    // If both LHS/RHS are promotable integer types, do not perform the usual
+    // conversions - we must keep the saturating operation at the correct
+    // bitwidth.
+    if (Context.isPromotableIntegerType(LHSType) &&
+        Context.isPromotableIntegerType(RHSType)) {
+      // First, convert each argument to an r-value.
+      ExprResult ResLHS = DefaultFunctionArrayLvalueConversion(LHS.get());
+      if (ResLHS.isInvalid())
+        return ExprError();
+      LHS = ResLHS.get();
+
+      ExprResult ResRHS = DefaultFunctionArrayLvalueConversion(RHS.get());
+      if (ResRHS.isInvalid())
+        return ExprError();
+      RHS = ResRHS.get();
+
+      LHSType = LHS.get()->getType().getUnqualifiedType();
+      RHSType = RHS.get()->getType().getUnqualifiedType();
+
+      // If the two integer types are not of equal order, cast the smaller
+      // integer one to the larger one
+      if (int Order = Context.getIntegerTypeOrder(LHSType, RHSType); Order == 1)
+        RHS = ImpCastExprToType(RHS.get(), LHSType, CK_IntegralCast);
+      else if (Order == -1)
+        LHS = ImpCastExprToType(LHS.get(), RHSType, CK_IntegralCast);
+
+      TheCall->setArg(0, LHS.get());
+      TheCall->setArg(1, RHS.get());
+      TheCall->setType(LHS.get()->getType().getUnqualifiedType());
+      break;
+    }
+
     if (BuiltinElementwiseMath(TheCall))
       return ExprError();
 
diff --git a/clang/test/CodeGen/builtins-elementwise-math.c b/clang/test/CodeGen/builtins-elementwise-math.c
index 7f6b5f26eb9307..4ac6fe18c4d5a3 100644
--- a/clang/test/CodeGen/builtins-elementwise-math.c
+++ b/clang/test/CodeGen/builtins-elementwise-math.c
@@ -68,7 +68,10 @@ void test_builtin_elementwise_add_sat(float f1, float f2, double d1, double d2,
                                       long long int i2, si8 vi1, si8 vi2,
                                       unsigned u1, unsigned u2, u4 vu1, u4 vu2,
                                       _BitInt(31) bi1, _BitInt(31) bi2,
-                                      unsigned _BitInt(55) bu1, unsigned _BitInt(55) bu2) {
+                                      unsigned _BitInt(55) bu1, unsigned _BitInt(55) bu2,
+                                      char c1, char c2, unsigned char uc1,
+                                      unsigned char uc2, short s1, short s2,
+                                      unsigned short us1, unsigned short us2) {
   // CHECK:      [[I1:%.+]] = load i64, ptr %i1.addr, align 8
   // CHECK-NEXT: [[I2:%.+]] = load i64, ptr %i2.addr, align 8
   // CHECK-NEXT: call i64 @llvm.sadd.sat.i64(i64 [[I1]], i64 [[I2]])
@@ -114,6 +117,50 @@ void test_builtin_elementwise_add_sat(float f1, float f2, double d1, double d2,
 
   // CHECK: store i64 98, ptr %i1.addr, align 8
   i1 = __builtin_elementwise_add_sat(1, 'a');
+
+  // CHECK:      [[C1:%.+]] = load i8, ptr %c1.addr, align 1
+  // CHECK-NEXT: [[C2:%.+]] = load i8, ptr %c2.addr, align 1
+  // CHECK-NEXT: call i8 @llvm.sadd.sat.i8(i8 [[C1]], i8 [[C2]])
+  c1 = __builtin_elementwise_add_sat(c1, c2);
+
+  // CHECK:      [[UC1:%.+]] = load i8, ptr %uc1.addr, align 1
+  // CHECK-NEXT: [[UC2:%.+]] = load i8, ptr %uc2.addr, align 1
+  // CHECK-NEXT: call i8 @llvm.uadd.sat.i8(i8 [[UC1]], i8 [[UC2]])
+  uc1 = __builtin_elementwise_add_sat(uc1, uc2);
+
+  // CHECK:      [[S1:%.+]] = load i16, ptr %s1.addr, align 2
+  // CHECK-NEXT: [[S2:%.+]] = load i16, ptr %s2.addr, align 2
+  // CHECK-NEXT: call i16 @llvm.sadd.sat.i16(i16 [[S1]], i16 [[S2]])
+  s1 = __builtin_elementwise_add_sat(s1, s2);
+
+  // CHECK:      [[US1:%.+]] = load i16, ptr %us1.addr, align 2
+  // CHECK-NEXT: [[US2:%.+]] = load i16, ptr %us2.addr, align 2
+  // CHECK-NEXT: call i16 @llvm.uadd.sat.i16(i16 [[US1]], i16 [[US2]])
+  us1 = __builtin_elementwise_add_sat(us1, us2);
+
+  // CHECK:      [[C1:%.+]] = load i8, ptr %c1.addr, align 1
+  // CHECK-NEXT: [[C1EXT:%.+]] = sext i8 [[C1]] to i16
+  // CHECK-NEXT: [[S1:%.+]] = load i16, ptr %s1.addr, align 2
+  // CHECK-NEXT: call i16 @llvm.sadd.sat.i16(i16 [[C1EXT]], i16 [[S1]])
+  s1 = __builtin_elementwise_add_sat(c1, s1);
+
+  // CHECK:      [[S1:%.+]] = load i16, ptr %s1.addr, align 2
+  // CHECK-NEXT: [[C1:%.+]] = load i8, ptr %c1.addr, align 1
+  // CHECK-NEXT: [[C1EXT:%.+]] = sext i8 [[C1]] to i16
+  // CHECK-NEXT: call i16 @llvm.sadd.sat.i16(i16 [[S1]], i16 [[C1EXT]])
+  s1 = __builtin_elementwise_add_sat(s1, c1);
+
+  // CHECK:      [[UC1:%.+]] = load i8, ptr %uc1.addr, align 1
+  // CHECK-NEXT: [[UC1EXT:%.+]] = zext i8 [[UC1]] to i16
+  // CHECK-NEXT: [[S1:%.+]] = load i16, ptr %s1.addr, align 2
+  // CHECK-NEXT: call i16 @llvm.sadd.sat.i16(i16 [[UC1EXT]], i16 [[S1]])
+  s1 = __builtin_elementwise_add_sat(uc1, s1);
+
+  // CHECK:      [[US1:%.+]] = load i16, ptr %us1.addr, align 2
+  // CHECK-NEXT: [[C1:%.+]] = load i8, ptr %c1.addr, align 1
+  // CHECK-NEXT: [[C1EXT:%.+]] = sext i8 [[C1]] to i16
+  // CHECK-NEXT: call i16 @llvm.uadd.sat.i16(i16 [[US1]], i16 [[C1EXT]])
+  us1 = __builtin_elementwise_add_sat(us1, c1);
 }
 
 void test_builtin_elementwise_sub_sat(float f1, float f2, double d1, double d2,
@@ -121,7 +168,10 @@ void test_builtin_elementwise_sub_sat(float f1, float f2, double d1, double d2,
                                       long long int i2, si8 vi1, si8 vi2,
                                       unsigned u1, unsigned u2, u4 vu1, u4 vu2,
                                       _BitInt(31) bi1, _BitInt(31) bi2,
-                                      unsigned _BitInt(55) bu1, unsigned _BitInt(55) bu2) {
+                                      unsigned _BitInt(55) bu1, unsigned _BitInt(55) bu2,
+                                      char c1, char c2, unsigned char uc1,
+                                      unsigned char uc2, short s1, short s2,
+                                      unsigned short us1, unsigned short us2) {
   // CHECK:      [[I1:%.+]] = load i64, ptr %i1.addr, align 8
   // CHECK-NEXT: [[I2:%.+]] = load i64, ptr %i2.addr, align 8
   // CHECK-NEXT: call i64 @llvm.ssub.sat.i64(i64 [[I1]], i64 [[I2]])
@@ -167,6 +217,50 @@ void test_builtin_elementwise_sub_sat(float f1, float f2, double d1, double d2,
 
   // CHECK: store i64 -96, ptr %i1.addr, align 8
   i1 = __builtin_elementwise_sub_sat(1, 'a');
+
+  // CHECK:      [[C1:%.+]] = load i8, ptr %c1.addr, align 1
+  // CHECK-NEXT: [[C2:%.+]] = load i8, ptr %c2.addr, align 1
+  // CHECK-NEXT: call i8 @llvm.ssub.sat.i8(i8 [[C1]], i8 [[C2]])
+  c1 = __builtin_elementwise_sub_sat(c1, c2);
+
+  // CHECK:      [[UC1:%.+]] = load i8, ptr %uc1.addr, align 1
+  // CHECK-NEXT: [[UC2:%.+]] = load i8, ptr %uc2.addr, align 1
+  // CHECK-NEXT: call i8 @llvm.usub.sat.i8(i8 [[UC1]], i8 [[UC2]])
+  uc1 = __builtin_elementwise_sub_sat(uc1, uc2);
+
+  // CHECK:      [[S1:%.+]] = load i16, ptr %s1.addr, align 2
+  // CHECK-NEXT: [[S2:%.+]] = load i16, ptr %s2.addr, align 2
+  // CHECK-NEXT: call i16 @llvm.ssub.sat.i16(i16 [[S1]], i16 [[S2]])
+  s1 = __builtin_elementwise_sub_sat(s1, s2);
+
+  // CHECK:      [[US1:%.+]] = load i16, ptr %us1.addr, align 2
+  // CHECK-NEXT: [[US2:%.+]] = load i16, ptr %us2.addr, align 2
+  // CHECK-NEXT: call i16 @llvm.usub.sat.i16(i16 [[US1]], i16 [[US2]])
+  us1 = __builtin_elementwise_sub_sat(us1, us2);
+
+  // CHECK:      [[C1:%.+]] = load i8, ptr %c1.addr, align 1
+  // CHECK-NEXT: [[C1EXT:%.+]] = sext i8 [[C1]] to i16
+  // CHECK-NEXT: [[S1:%.+]] = load i16, ptr %s1.addr, align 2
+  // CHECK-NEXT: call i16 @llvm.ssub.sat.i16(i16 [[C1EXT]], i16 [[S1]])
+  s1 = __builtin_elementwise_sub_sat(c1, s1);
+
+  // CHECK:      [[S1:%.+]] = load i16, ptr %s1.addr, align 2
+  // CHECK-NEXT: [[C1:%.+]] = load i8, ptr %c1.addr, align 1
+  // CHECK-NEXT: [[C1EXT:%.+]] = sext i8 [[C1]] to i16
+  // CHECK-NEXT: call i16 @llvm.ssub.sat.i16(i16 [[S1]], i16 [[C1EXT]])
+  s1 = __builtin_elementwise_sub_sat(s1, c1);
+
+  // CHECK:      [[UC1:%.+]] = load i8, ptr %uc1.addr, align 1
+  // CHECK-NEXT: [[UC1EXT:%.+]] = zext i8 [[UC1]] to i16
+  // CHECK-NEXT: [[S1:%.+]] = load i16, ptr %s1.addr, align 2
+  // CHECK-NEXT: call i16 @llvm.ssub.sat.i16(i16 [[UC1EXT]], i16 [[S1]])
+  s1 = __builtin_elementwise_sub_sat(uc1, s1);
+
+  // CHECK:      [[US1:%.+]] = load i16, ptr %us1.addr, align 2
+  // CHECK-NEXT: [[C1:%.+]] = load i8, ptr %c1.addr, align 1
+  // CHECK-NEXT: [[C1EXT:%.+]] = sext i8 [[C1]] to i16
+  // CHECK-NEXT: call i16 @llvm.usub.sat.i16(i16 [[US1]], i16 [[C1EXT]])
+  us1 = __builtin_elementwise_sub_sat(us1, c1);
 }
 
 void test_builtin_elementwise_maximum(float f1, float f2, double d1, double d2,

@frasercrmck frasercrmck requested review from fhahn and RKSimon December 11, 2024 11:50
@frasercrmck
Copy link
Contributor Author

frasercrmck commented Dec 11, 2024

I just realised this is also a problem with the bitreverse and popcount builtins on signed types for the same reason. Sign-extending char to int and performing either of these operations then truncating the result back is not the same operation as performing it on the smaller type directly.

@RKSimon
Copy link
Collaborator

RKSimon commented Dec 11, 2024

I thought bitreverse/popcount are only supposed to work with unsigned types?

@frasercrmck
Copy link
Contributor Author

I thought bitreverse/popcount are only supposed to work with unsigned types?

Good question. The documentation doesn't mention they're only for signed types, as abs does, for example. So if there is such a restriction, it's not documented, nor checked in the code.

Tangentially, in OpenCL (where I'm coming from) there's no restriction on popcount to be only on unsigned types.

@efriedma-quic
Copy link
Collaborator

Even with this fix, the behavior with mixed types still seems really confusing, especially if you mix signed/unsigned inputs. Can we address that somehow?

@frasercrmck
Copy link
Contributor Author

Even with this fix, the behavior with mixed types still seems really confusing, especially if you mix signed/unsigned inputs. Can we address that somehow?

I totally agree.

The (elementwise) builtins won't let you mix ext_vector_types of different signs or element sizes, so maybe we should extend logic that to all scalars too? Ban mixed signs? Ban mixed sizes entirely? This would be a larger change that would probably need documenting - an RFC? Worth noting perhaps that maybe the current vector behaviour depends on the type of vectors you use (OpenCL, AltiVec, GCC, NEON, SVE) - I haven't checked that.

I note also that the documentation says that For scalar types, consider the operation applied to a vector with a single element.. This is misleading, as the usual implicit conversions don't behave the same way for scalars as they do for vectors.

CC @arsenm - might be of interest to you.

@efriedma-quic
Copy link
Collaborator

If people aren't using the edge cases, they won't notice if we start error'ing out. If they are, it's pretty easy to adapt the code to build with both old and new compilers. A quick GitHub search shows exactly one user of the builtins outside of clang itself, and that code uses a vector input.

Maybe makes sense to RFC it for wider exposure, but I doubt anyone will object.

We should probably write a release note for this in any case.

@frasercrmck
Copy link
Contributor Author

RFC: https://discourse.llvm.org/t/rfc-change-behaviour-of-elementwise-builtins-on-scalar-integer-types/83725. I don't expect anyone to say anything at all, tbh, especially during the holidays.

If people aren't using the edge cases, they won't notice if we start error'ing out. If they are, it's pretty easy to adapt the code to build with both old and new compilers. A quick GitHub search shows exactly one user of the builtins outside of clang itself, and that code uses a vector input.

That's useful context, thank you

We should probably write a release note for this in any case.

Makes sense. I'll implement the RFC in the new year. We might run into further questions about what to do with enums and other things currently handled for us by the usual unary conversions. I think we'll have to roll our own solution as I haven't seen anything suitable. My knowledge of Sema isn't great, though.

@frasercrmck frasercrmck force-pushed the builtin-elementwise-sat branch from 98d796a to eace867 Compare December 18, 2024 17:21
@frasercrmck frasercrmck requested a review from Endilll as a code owner December 18, 2024 17:21
@llvmbot llvmbot added the HLSL HLSL Language Support label Dec 18, 2024
@frasercrmck frasercrmck changed the title [clang] Fix sub-integer __builtin_elementwise_(add|sub)_sat [clang] Restrict use of scalar types in vector builtins Dec 18, 2024
@@ -6,7 +6,7 @@
// CHECK: %conv2 = fptrunc double %hlsl.dot to float
// CHECK: ret float %conv2
float builtin_bool_to_float_type_promotion ( float p0, bool p1 ) {
return __builtin_hlsl_dot ( p0, p1 );
return __builtin_hlsl_dot ( (double)p0, (double)p1 );
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@farzonl I've quickly hacked this in place to satisfy the tests. The HLSL builtins use the same Sema APIs as the ones I'm interested in changing. Is the old behaviour something that needs preserved?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the checks it looks like you have just made the double more explicit. Codegen doesn't look different. So I'm fine with this change.

Also most of the type promotion rules for HLSL are enforced on the HLSL header. We are just using the builins as aliases for the cases we need to invoke an intrinsic.

Note to myself though these tests don't look correct and should probably be removed. DXC would have promoted bool to a float. Short of setting the builtin as CustomTypeChecking I don't think that can be achieved here. But again it does not matter because the promotion rules are handled on the hlsl dot api which only aliases to the builtin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reply. I think that make sense, thank you.

Note though that I had to make a change to what I think is a user-facing pow builtin here, so this PR might still impact HLSL users directly.

// Don't promote integer types
if (QualType Ty = E->getType(); S.getASTContext().isPromotableIntegerType(Ty))
return S.DefaultFunctionArrayLvalueConversion(E);
return S.UsualUnaryConversions(E);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm following correctly, this specifically skips promoting integers... but it allows promoting floats? I think I'd prefer to factor out the unary-float conversions into a separate function, and call that here, for clarity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, pretty much. UsualUnaryConversions does DefaultFunctionArrayLvalueConversion first, then floating-point specific logic. However, it also handles isPromotableBitFields.

In your other comment you ask about enums, and perhaps bitfields also falls into that category? Do we really need to support these types in vector maths/arithmetic builtins? There are tests for them, so I preserved the existing logic (though seemingly not for bitfields) but I see the argument for removing that support.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we perhaps extend UsualUnaryConversions with a default-true bool flag to control the handling of integers/bitfields? I can see that it might be exposing too many internal details to users of UsualUnaryConversions, so I'm not sure. I don't know the best practices of clang/sema design here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "usual unary conversions" is not a term used by any specification, so there isn't any real meaning attached to it. I think I'd prefer not to use a boolean, though, because operation you want here is different in a pretty subtle way.

I missed that this also impacts bitfields.

Whether we support enums here probably doesn't really matter, as long as we're consistent. Bitfields are unfortunately weird, due to the way promotion works; we might want to reject due to the potential ambiguity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, makes sense. I've added Sema::UsualUnaryFPConversions and am using that where appropriate. I didn't want to fold in the use of DefaultFunctionArrayLvalueConversion into that even though that would have saved a bit of duplicated code.

Regarding bitfields: with the current version of this patch (i.e., without the implicit promotion) we are calling the builtin according to the type you specify in the bitfield:

struct StructWithBitfield {
  int i : 4;
  char c: 4;
  short s : 4;
};

__builtin_elementwise_abs(s.i) calls llvm.abs.i32, s.c calls llvm.abs.i8, s.s llvm.abs.i16, etc. I wonder if this is consistent enough behaviour to permit them?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little weird that the resulting type can be larger than what integer promotion would produce (without this patch, struct S { long l : 4; }; S s; static_assert(sizeof(__builtin_elementwise_abs(s.l)) == 4);). But maybe that's fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have strong opinions either way. I've added some CodeGen tests for bitfields so at least the current behaviour is tested.

if (A.isInvalid() || B.isInvalid())
return true;
checkEnumArithmeticConversions(TheCall->getArg(0), TheCall->getArg(1),
TheCall->getExprLoc(), ACK_Comparison);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changning whether enums are allowed based on the language mode seems unintuitive and unnecessary. Either allow them, or forbid them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify what you mean here, sorry? I assume you mean just in the context of these vector builtins. This call looks like it's just checking the following:

  // C++2a [expr.arith.conv]p1:
  //   If one operand is of enumeration type and the other operand is of a
  //   different enumeration type or a floating-point type, this behavior is
  //   deprecated ([depr.arith.conv.enum]).
  //
  // Warn on this in all language modes. Produce a deprecation warning in C++20.
  // Eventually we will presumably reject these cases (in C++23 onwards?).

So it's not really whether "enums are allowed", but whether you're allowed to mix different enums, or enums with floating-point values, in binary builtins.

I agree the inconsistency here is not desirable, especially as we're not doing the same for the ternary elementwise builtins.

Should we just forbid this in all binary/ternary vector elementwise builtins?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I've forbidden mixed enum types and have added some tests for them.

These builtins would unconditionally perform the usual arithmetic
conversions on promotable scalar integer arguments. This meant in
practice that char and short arguments were promoted to int, and the
operation was truncated back down afterwards. This in effect silently
replaced a saturating add/sub with a regular add/sub, which is not
intuitive (or intended) behaviour.

With this patch, promotable scalar integer types are not promoted to
int, but are kept intact. If the types differ, the smaller integer is
promoted to the larger one. The signedness of the operation matches the
larger integer type.

No change is made to vector types, which are both not promoted and whose
element types must match.
@frasercrmck frasercrmck force-pushed the builtin-elementwise-sat branch from eace867 to ad8ccba Compare January 6, 2025 12:37
@fhahn
Copy link
Contributor

fhahn commented Jan 6, 2025

Would be good to also update the documentation for the builtins the clarify the behavior.

@frasercrmck
Copy link
Contributor Author

Would be good to also update the documentation for the builtins the clarify the behavior.

I've added some words to that effect. Is that what you had in mind?

@Endilll Endilll removed their request for review January 15, 2025 18:37
@frasercrmck
Copy link
Contributor Author

ping

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - but we should have a frontend specialist approve

@frasercrmck
Copy link
Contributor Author

@efriedma-quic are you happy with this PR?

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@frasercrmck frasercrmck changed the title [clang] Restrict use of scalar types in vector builtins [clang] Restrict the use of scalar types in vector builtins Jan 29, 2025
@frasercrmck frasercrmck merged commit 1ac3665 into llvm:main Jan 29, 2025
9 checks passed
@frasercrmck frasercrmck deleted the builtin-elementwise-sat branch January 29, 2025 09:41
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 29, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-4 while building clang at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/13711

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: Analysis/live-stmts.cpp' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/clang -cc1 -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/build/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -w -analyzer-checker=debug.DumpLiveExprs /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Analysis/live-stmts.cpp 2>&1   | /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Analysis/live-stmts.cpp
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/clang -cc1 -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/build/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -w -analyzer-checker=debug.DumpLiveExprs /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Analysis/live-stmts.cpp
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Analysis/live-stmts.cpp
�[1m/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Analysis/live-stmts.cpp:239:16: �[0m�[0;1;31merror: �[0m�[1mCHECK-EMPTY: is not on the line after the previous match
�[0m// CHECK-EMPTY:
�[0;1;32m               ^
�[0m�[1m<stdin>:180:1: �[0m�[0;1;30mnote: �[0m�[1m'next' match was here
�[0m
�[0;1;32m^
�[0m�[1m<stdin>:177:1: �[0m�[0;1;30mnote: �[0m�[1mprevious match ended here
�[0m
�[0;1;32m^
�[0m�[1m<stdin>:178:1: �[0m�[0;1;30mnote: �[0m�[1mnon-matching line after previous match is here
�[0mImplicitCastExpr 0x13f811978 '_Bool' <LValueToRValue>
�[0;1;32m^
�[0m
Input file: <stdin>
Check file: /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/Analysis/live-stmts.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
�[1m�[0m�[0;1;30m           1: �[0m�[1m�[0;1;46m �[0m
�[0;1;30m           2: �[0m�[1m�[0;1;46m�[0m[ B0 (live expressions at block exit) ]�[0;1;46m �[0m
�[0;1;32mcheck:21      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
�[0m�[0;1;30m           3: �[0m�[1m�[0;1;46m�[0m �[0m
�[0;1;32mempty:22      ^
�[0m�[0;1;30m           4: �[0m�[1m�[0;1;46m�[0m �[0m
�[0;1;32mempty:23      ^
�[0m�[0;1;30m           5: �[0m�[1m�[0;1;46m�[0m[ B1 (live expressions at block exit) ]�[0;1;46m �[0m
�[0;1;32mcheck:24      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
�[0m�[0;1;30m           6: �[0m�[1m�[0;1;46m�[0m �[0m
�[0;1;32mempty:25      ^
�[0m�[0;1;30m           7: �[0m�[1m�[0;1;46m�[0m �[0m
�[0;1;32mempty:26      ^
�[0m�[0;1;30m           8: �[0m�[1m�[0;1;46m�[0m[ B2 (live expressions at block exit) ]�[0;1;46m �[0m
�[0;1;32mcheck:27      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
�[0m�[0;1;30m           9: �[0m�[1m�[0;1;46m�[0m �[0m
�[0;1;32mempty:28      ^
�[0m�[0;1;30m          10: �[0m�[1m�[0;1;46m�[0mDeclRefExpr 0x12e80b2e0 'int' lvalue ParmVar 0x12f10f270 'y' 'int'�[0;1;46m �[0m
�[0;1;32mnext:29       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
�[0m�[0;1;30m          11: �[0m�[1m�[0;1;46m�[0m �[0m
�[0;1;32mempty:30      ^
�[0m�[0;1;30m          12: �[0m�[1m�[0;1;46m�[0mDeclRefExpr 0x12e80b300 'int' lvalue ParmVar 0x12f10f2f0 'z' 'int'�[0;1;46m �[0m
...

@frasercrmck
Copy link
Contributor Author

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-4 while building clang at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/13711
Here is the relevant piece of the build log for the reference

I don't believe this failure was caused by this PR. The test looks flaky to me. The log seems to show that the order of the output isn't deterministic

         153: [ B2 (live expressions at block exit) ] 
check:221     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         154:  
empty:222     ^
         155: IntegerLiteral 0x11e810600 'int' 0 
         156:  
         157: IntegerLiteral 0x11e810620 'int' 1 
         158:  
         159: ImplicitCastExpr 0x13f811978 '_Bool' <LValueToRValue> 
check:223     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         160: `-DeclRefExpr 0x13f811938 '_Bool' lvalue ParmVar 0x13f8117b8 'b' '_Bool' 
check:224     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         161:  
empty:225     ^
         162: ImplicitCastExpr 0x13f811990 '_Bool' <LValueToRValue> 
check:226     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         163: `-DeclRefExpr 0x13f811958 '_Bool' lvalue ParmVar 0x13f8117b8 'b' '_Bool' 
check:227     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         164:  
empty:228     ^
         165: BinaryOperator 0x13f8119a8 '_Bool' '||' 
check:229     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         166: |-ImplicitCastExpr 0x13f811978 '_Bool' <LValueToRValue> 
check:230     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         167: | `-DeclRefExpr 0x13f811938 '_Bool' lvalue ParmVar 0x13f8117b8 'b' '_Bool' 
check:231     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         168: `-ImplicitCastExpr 0x13f811990 '_Bool' <LValueToRValue> 
check:232     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         169:  `-DeclRefExpr 0x13f811958 '_Bool' lvalue ParmVar 0x13f8117b8 'b' '_Bool' 
check:233      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         170:  
empty:234     ^
         171:  
         172: [ B3 (live expressions at block exit) ] 

Whereas the checks are clearly:

// CHECK: [ B2 (live expressions at block exit) ]
// CHECK-EMPTY:
// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
// CHECK-EMPTY:
// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
// CHECK-EMPTY:
// CHECK: BinaryOperator {{.*}} '_Bool' '||'
// CHECK: |-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
// CHECK: | `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
// CHECK: `-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
// CHECK:   `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
// CHECK-EMPTY:
// CHECK: IntegerLiteral {{.*}} 'int' 0
// CHECK-EMPTY:
// CHECK: IntegerLiteral {{.*}} 'int' 1
// CHECK-EMPTY:
// CHECK-EMPTY:

Note how the position of the two IntegerLiteral statements isn't as expected, and this seems to trip up FileCheck.

fhahn pushed a commit to swiftlang/llvm-project that referenced this pull request Apr 8, 2025
)

This commit restricts the use of scalar types in vector math builtins,
particularly the `__builtin_elementwise_*` builtins.

Previously, small scalar integer types would be promoted to `int`, as
per the usual conversions. This would silently do the wrong thing for
certain operations, such as `add_sat`, `popcount`, `bitreverse`, and
others. Similarly, since unsigned integer types were promoted to `int`,
something like `add_sat(unsigned char, unsigned char)` would perform a
*signed* operation.

With this patch, promotable scalar integer types are not promoted to
int, and are kept intact. If any of the types differ in the binary and
ternary builtins, an error is issued. Similarly an error is issued if
builtins are supplied integer types of different signs. Mixing enums of
different types in binary/ternary builtins now consistently raises an
error in all language modes.

This brings the behaviour surrounding scalar types more in line with
that of vector types. No change is made to vector types, which are both
not promoted and whose element types must match.

Fixes llvm#84047.

RFC:
https://discourse.llvm.org/t/rfc-change-behaviour-of-elementwise-builtins-on-scalar-integer-types/83725
fhahn added a commit to swiftlang/llvm-project that referenced this pull request Apr 8, 2025
[clang] Restrict the use of scalar types in vector builtins (llvm#119423)


rdar://117484572
@damyanp damyanp moved this to Closed in HLSL Support Apr 25, 2025
fhahn pushed a commit to fhahn/llvm-project that referenced this pull request May 8, 2025
)

This commit restricts the use of scalar types in vector math builtins,
particularly the `__builtin_elementwise_*` builtins.

Previously, small scalar integer types would be promoted to `int`, as
per the usual conversions. This would silently do the wrong thing for
certain operations, such as `add_sat`, `popcount`, `bitreverse`, and
others. Similarly, since unsigned integer types were promoted to `int`,
something like `add_sat(unsigned char, unsigned char)` would perform a
*signed* operation.

With this patch, promotable scalar integer types are not promoted to
int, and are kept intact. If any of the types differ in the binary and
ternary builtins, an error is issued. Similarly an error is issued if
builtins are supplied integer types of different signs. Mixing enums of
different types in binary/ternary builtins now consistently raises an
error in all language modes.

This brings the behaviour surrounding scalar types more in line with
that of vector types. No change is made to vector types, which are both
not promoted and whose element types must match.

Fixes llvm#84047.

RFC:
https://discourse.llvm.org/t/rfc-change-behaviour-of-elementwise-builtins-on-scalar-integer-types/83725
fhahn pushed a commit to fhahn/llvm-project that referenced this pull request May 12, 2025
)

This commit restricts the use of scalar types in vector math builtins,
particularly the `__builtin_elementwise_*` builtins.

Previously, small scalar integer types would be promoted to `int`, as
per the usual conversions. This would silently do the wrong thing for
certain operations, such as `add_sat`, `popcount`, `bitreverse`, and
others. Similarly, since unsigned integer types were promoted to `int`,
something like `add_sat(unsigned char, unsigned char)` would perform a
*signed* operation.

With this patch, promotable scalar integer types are not promoted to
int, and are kept intact. If any of the types differ in the binary and
ternary builtins, an error is issued. Similarly an error is issued if
builtins are supplied integer types of different signs. Mixing enums of
different types in binary/ternary builtins now consistently raises an
error in all language modes.

This brings the behaviour surrounding scalar types more in line with
that of vector types. No change is made to vector types, which are both
not promoted and whose element types must match.

Fixes llvm#84047.

RFC:
https://discourse.llvm.org/t/rfc-change-behaviour-of-elementwise-builtins-on-scalar-integer-types/83725
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category HLSL HLSL Language Support
Projects
Status: Closed
Development

Successfully merging this pull request may close these issues.

Unexpected codegen for __builtin_elementwise_bitreverse on unsigned char and short
9 participants