Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

@ecatmur
Copy link
Owner

@ecatmur ecatmur commented Jul 17, 2022

Background: https://quuxplusone.github.io/blog/2020/01/22/expression-list-in-functional-cast/

Implement as warning flag Wfunctional-cast=0-5

0: no warnings
1: warn on functional cast as reinterpret_cast to pointer or reference type
2: warn on functional cast as const_cast, or static_cast plus const_cast
3: warn on functional cast as downcast (this can change the meaning of the program)
4. warn on functional cast as reinterpret_cast to other type (e.g. uintptr_t)
5. warn on functional cast as other static_cast

examples:
1: using P = char*; P(0x12345678)
2: using P = char*; P("test")
3: struct B {} b; struct D : B { D(B&); }; using R = D&; R(b)
4: intptr_t("test")
5: enum class E {}; E(10)

example for #3:

struct B {};
struct D : B { int i = 99; };
using R = D const&;
D f(B const& b) { R r(b); return r; }
D g(B const& b) { return R(b); }

compiles to (gcc 12.1 -std=c++20 -O3):

f(B const&):
        mov     eax, 99
        ret
g(B const&):
        mov     eax, DWORD PTR [rdi]
        ret

Defaults/shortcuts:

  • -Wfunctional-cast -> 4
  • -Wall -> 4
  • -Wextra -> 5

ecatmur and others added 10 commits March 14, 2022 12:16
-initialization

For now, this is a hard error, only in C++23 mode.

Rationale:

Consider:

    #include <cstdint>
    std::uintptr_t f();
    std::uintptr_t f() { return std::uintptr_t("hello"); }

Even though this is clearly unsafe and quite possibly unintended, there is
no combination of flags on any of the major 4 compilers that will induce
them to emit a warning.  Not even '-Wold-style-cast' will help, since this
is not an old-style (i.e. C-style) cast; it is a functional cast (that is
*interpreted* as a C-style cast).

Similarly, https://cplusplus.github.io/LWG/issue3528 is a welcome fix for
an insidious bug.  However, the resolution (requiring is_constructible,
i.e., that direct-initialization be well-formed) goes beyond the
justification for the issue (since there are conversions allowed by
static_cast that are not available to direct-initialization), leaves open
the possibility that there could be code where direct-initialization is
well-formed but has *different* semantics to those selected by cast
notation, and leaves users vulnerable to committing the same error in their
own libraries.

For example, `static_cast<int&&>(declval<int&>())` is valid, but `int&&
t(declval<int&>());` is not; likewise base-to-derived pointer and
reference conversions, enumeration conversions, and so forth.

Instead, we propose deprecating the cast notation interpretation of the
functional notation entirely; that is, we wish to strike the first
sentence of [expr.type.conv]/2.  Per this change, functional notation
will be fully consistent with direct-initialization, except that it will
also be able to construct void prvalues (but only with an initializer of
() or {}).

This patch implements said deprecation as a hard error in C++23 mode,
and fixes compiler, library and testcase code as necessary by replacing
offending functional notation with the weakest possible cast, excepting
that in some testcases it has not been possible to determine this so the
(C-style) cast notation is used.

https://eel.is/c++draft/expr.compound#expr.type.conv-2.sentence-1
0: no warnings
1: warn on functional cast as reinterpret_cast
2: warn on functional cast as const_cast, or static_cast plus const_cast
3: warn on functional cast as downcast (this can change the meaning of the program)
4: warn on functional cast as other static_cast

examples:
1: intptr_t("test")
2: using P = char*; P("test")
3: struct B {} b; struct D : B { D(B&); }; using R = D&; R(b)
4: enum class E {}; E(10)
@ecatmur ecatmur marked this pull request as draft July 17, 2022 21:03
ecatmur pushed a commit that referenced this pull request Jul 17, 2022
Consider

  struct A {
    int x;
    int y = x;
  };

  struct B {
    int x = 0;
    int y = A{x}.y; // #1
  };

where for #1 we end up with

  {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}

that is, two PLACEHOLDER_EXPRs for different types on the same level in
a {}.  This crashes because our CONSTRUCTOR_PLACEHOLDER_BOUNDARY mechanism to
avoid replacing unrelated PLACEHOLDER_EXPRs cannot deal with it.

Here's why we wound up with those PLACEHOLDER_EXPRs: When we're performing
cp_parser_late_parsing_nsdmi for "int y = A{x}.y;" we use finish_compound_literal
on type=A, compound_literal={((struct B *) this)->x}.  When digesting this
initializer, we call get_nsdmi which creates a PLACEHOLDER_EXPR for A -- we don't
have any object to refer to yet.  After digesting, we have

  {.x=((struct B *) this)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}

and since we've created a PLACEHOLDER_EXPR inside it, we marked the whole ctor
CONSTRUCTOR_PLACEHOLDER_BOUNDARY.  f_c_l creates a TARGET_EXPR and returns

  TARGET_EXPR <D.2384, {.x=((struct B *) this)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}>

Then we get to

  B b = {};

and call store_init_value, which digests the {}, which produces

  {.x=NON_LVALUE_EXPR <0>, .y=(TARGET_EXPR <D.2395, {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}>).y}

lookup_placeholder in constexpr won't find an object to replace the
PLACEHOLDER_EXPR for B, because ctx->object will be D.2395 of type A, and we
cannot search outward from D.2395 to find 'b'.

The call to replace_placeholders in store_init_value will not do anything:
we've marked the inner { } CONSTRUCTOR_PLACEHOLDER_BOUNDARY, and it's only
a sub-expression, so replace_placeholders does nothing, so the <P_E struct B>
stays even though now is the perfect time to replace it because we have an
object for it: 'b'.

Later, in cp_gimplify_init_expr the *expr_p is

  D.2395 = {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}

where D.2395 is of type A, but we crash because we hit <P_E struct B>, which
has a different type.

My idea was to replace <P_E struct A> with D.2384 after creating the
TARGET_EXPR because that means we have an object we can refer to.
Then clear CONSTRUCTOR_PLACEHOLDER_BOUNDARY because we no longer have
a PLACEHOLDER_EXPR in the {}.  Then store_init_value will be able to
replace <P_E struct B> with 'b', and we should be good to go.  We must
be careful not to break guaranteed copy elision, so this replacement
happens in digest_nsdmi_init where we can see the whole initializer,
and avoid replacing any placeholders in TARGET_EXPRs used in the context
of initialization/copy elision.  This is achieved via the new function
called potential_prvalue_result_of.

While fixing this problem, I found PR105550, thus the FIXMEs in the
tests.

	PR c++/100252

gcc/cp/ChangeLog:

	* typeck2.cc (potential_prvalue_result_of): New.
	(replace_placeholders_for_class_temp_r): New.
	(digest_nsdmi_init): Call it.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1y/nsdmi-aggr14.C: New test.
	* g++.dg/cpp1y/nsdmi-aggr15.C: New test.
	* g++.dg/cpp1y/nsdmi-aggr16.C: New test.
	* g++.dg/cpp1y/nsdmi-aggr17.C: New test.
	* g++.dg/cpp1y/nsdmi-aggr18.C: New test.
	* g++.dg/cpp1y/nsdmi-aggr19.C: New test.
ecatmur pushed a commit that referenced this pull request Jul 17, 2022
As explained in r11-4959-gde6f64f9556ae3, the atom cache assumes two
equivalent expressions (according to cp_tree_equal) must use the same
template parameters (according to find_template_parameters).  This
assumption turned out to not hold for TARGET_EXPR, which was addressed
by that commit.

But this assumption apparently doesn't hold for PARM_DECL either:
find_template_parameters walks its DECL_CONTEXT but cp_tree_equal by
default doesn't consider DECL_CONTEXT unless comparing_specializations
is set.  Thus in the first testcase below, the atomic constraints of #1
and #2 are equivalent according to cp_tree_equal, but according to
find_template_parameters the former uses T and the latter uses both T
and U (surprisingly).

We could fix this assumption violation by setting comparing_specializations
in the atom_hasher, which would make cp_tree_equal return false for the
two atoms, but that seems overly pessimistic here.  Ideally the atoms
should continue being considered equivalent and we instead fix
find_template_paremeters to return just T for #2's atom.

To that end this patch makes for_each_template_parm_r stop walking the
DECL_CONTEXT of a PARM_DECL.  This should be safe to do because
tsubst_copy / tsubst_decl only substitutes the TREE_TYPE of a PARM_DECL
and doesn't bother substituting the DECL_CONTEXT, thus the only relevant
template parameters are those used in its type.  any_template_parm_r is
currently responsible for walking its TREE_TYPE, but I suppose it now makes
sense for for_each_template_parm_r to do so instead.

In passing this patch also makes for_each_template_parm_r stop walking
the DECL_CONTEXT of a VAR_/FUNCTION_DECL since doing so after walking
DECL_TI_ARGS is redundant, I think.

I experimented with not walking DECL_CONTEXT for CONST_DECL, but the
second testcase below demonstrates it's necessary to walk it.

	PR c++/105797

gcc/cp/ChangeLog:

	* pt.cc (for_each_template_parm_r) <case FUNCTION_DECL, VAR_DECL>:
	Don't walk DECL_CONTEXT.
	<case PARM_DECL>: Likewise.  Walk TREE_TYPE.
	<case CONST_DECL>: Simplify.
	(any_template_parm_r) <case PARM_DECL>: Don't walk TREE_TYPE.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/concepts-decltype4.C: New test.
	* g++.dg/cpp2a/concepts-memfun3.C: New test.
ecatmur pushed a commit that referenced this pull request Jul 17, 2022
(In reply to Uroš Bizjak from comment #1)
> Instruction does not accept memory operand for operand 3:
>
> (define_insn_and_split
> "*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_ltint"
>   [(set (match_operand:<ssebytemode> 0 "register_operand" "=Yr,*x,x")
> 	(unspec:<ssebytemode>
> 	  [(match_operand:<ssebytemode> 1 "register_operand" "0,0,x")
> 	   (match_operand:<ssebytemode> 2 "vector_operand" "YrBm,*xBm,xm")
> 	   (subreg:<ssebytemode>
> 	     (lt:VI48_AVX
> 	       (match_operand:VI48_AVX 3 "register_operand" "Yz,Yz,x")
> 	       (match_operand:VI48_AVX 4 "const0_operand")) 0)]
> 	  UNSPEC_BLENDV))]
>
> The problematic insn is:
>
> (define_insn_and_split "*avx_cmp<mode>3_ltint_not"
>  [(set (match_operand:VI48_AVX  0 "register_operand")
>        (vec_merge:VI48_AVX
> 	 (match_operand:VI48_AVX 1 "vector_operand")
> 	 (match_operand:VI48_AVX 2 "vector_operand")
> 	 (unspec:<avx512fmaskmode>
> 	   [(subreg:VI48_AVX
> 	    (not:<ssebytemode>
> 	      (match_operand:<ssebytemode> 3 "vector_operand")) 0)
> 	    (match_operand:VI48_AVX 4 "const0_operand")
> 	    (match_operand:SI 5 "const_0_to_7_operand")]
> 	    UNSPEC_PCMP)))]
>
> which gets split to the above pattern.
>
> In the preparation statements we have:
>
>   if (!MEM_P (operands[3]))
>     operands[3] = force_reg (<ssebytemode>mode, operands[3]);
>   operands[3] = lowpart_subreg (<MODE>mode, operands[3], <ssebytemode>mode);
>
> Which won't fly when operand 3 is memory operand...
>

gcc/ChangeLog:

	PR target/105953
	* config/i386/sse.md (*avx_cmp<mode>3_ltint_not): Force_reg
	operands[3].

gcc/testsuite/ChangeLog:

	* g++.target/i386/pr105953.C: New test.
ecatmur pushed a commit that referenced this pull request Jul 17, 2022
This patch implements C++23 P2255R2, which adds two new type traits to
detect reference binding to a temporary.  They can be used to detect code
like

  std::tuple<const std::string&> t("meow");

which is incorrect because it always creates a dangling reference, because
the std::string temporary is created inside the selected constructor of
std::tuple, and not outside it.

There are two new compiler builtins, __reference_constructs_from_temporary
and __reference_converts_from_temporary.  The former is used to simulate
direct- and the latter copy-initialization context.  But I had a hard time
finding a test where there's actually a difference.  Under DR 2267, both
of these are invalid:

  struct A { } a;
  struct B { explicit B(const A&); };
  const B &b1{a};
  const B &b2(a);

so I had to peruse [over.match.ref], and eventually realized that the
difference can be seen here:

  struct G {
    operator int(); // #1
    explicit operator int&&(); // #2
  };

int&& r1(G{}); // use #2 (no temporary)
int&& r2 = G{}; // use #1 (a temporary is created to be bound to int&&)

The implementation itself was rather straightforward because we already
have the conv_binds_ref_to_prvalue function.  The main function here is
ref_xes_from_temporary.
I've changed the return type of ref_conv_binds_directly to tristate, because
previously the function didn't distinguish between an invalid conversion and
one that binds to a prvalue.  Since it no longer returns a bool, I removed
the _p suffix.

The patch also adds the relevant class and variable templates to <type_traits>.

	PR c++/104477

gcc/c-family/ChangeLog:

	* c-common.cc (c_common_reswords): Add
	__reference_constructs_from_temporary and
	__reference_converts_from_temporary.
	* c-common.h (enum rid): Add RID_REF_CONSTRUCTS_FROM_TEMPORARY and
	RID_REF_CONVERTS_FROM_TEMPORARY.

gcc/cp/ChangeLog:

	* call.cc (ref_conv_binds_directly_p): Rename to ...
	(ref_conv_binds_directly): ... this.  Add a new bool parameter.  Change
	the return type to tristate.
	* constraint.cc (diagnose_trait_expr): Handle
	CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	* cp-tree.h: Include "tristate.h".
	(enum cp_trait_kind): Add CPTK_REF_CONSTRUCTS_FROM_TEMPORARY
	and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	(ref_conv_binds_directly_p): Rename to ...
	(ref_conv_binds_directly): ... this.
	(ref_xes_from_temporary): Declare.
	* cxx-pretty-print.cc (pp_cxx_trait_expression): Handle
	CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	* method.cc (ref_xes_from_temporary): New.
	* parser.cc (cp_parser_primary_expression): Handle
	RID_REF_CONSTRUCTS_FROM_TEMPORARY and RID_REF_CONVERTS_FROM_TEMPORARY.
	(cp_parser_trait_expr): Likewise.
	(warn_for_range_copy): Adjust to call ref_conv_binds_directly.
	* semantics.cc (trait_expr_value): Handle
	CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	(finish_trait_expr): Likewise.

libstdc++-v3/ChangeLog:

	* include/std/type_traits (reference_constructs_from_temporary,
	reference_converts_from_temporary): New class templates.
	(reference_constructs_from_temporary_v,
	reference_converts_from_temporary_v): New variable templates.
	(__cpp_lib_reference_from_temporary): Define for C++23.
	* include/std/version (__cpp_lib_reference_from_temporary): Define for
	C++23.
	* testsuite/20_util/variable_templates_for_traits.cc: Test
	reference_constructs_from_temporary_v and
	reference_converts_from_temporary_v.
	* testsuite/20_util/reference_from_temporary/value.cc: New test.
	* testsuite/20_util/reference_from_temporary/value2.cc: New test.
	* testsuite/20_util/reference_from_temporary/version.cc: New test.

gcc/testsuite/ChangeLog:

	* g++.dg/ext/reference_constructs_from_temporary1.C: New test.
	* g++.dg/ext/reference_converts_from_temporary1.C: New test.
@ecatmur ecatmur changed the title Wfunctional cast Add warning options -Wfunctional-cast=n Jul 19, 2022
ecatmur pushed a commit that referenced this pull request Aug 19, 2022
In my previous patches I've been extending our std::move warnings,
but this tweak actually dials it down a little bit.  As reported in
bug 89780, it's questionable to warn about expressions in templates
that were type-dependent, but aren't anymore because we're instantiating
the template.  As in,

  template <typename T>
  Dest withMove() {
    T x;
    return std::move(x);
  }

  template Dest withMove<Dest>(); // #1
  template Dest withMove<Source>(); // #2

Saying that the std::move is pessimizing for #1 is not incorrect, but
it's not useful, because removing the std::move would then pessimize #2.
So the user can't really win.  At the same time, disabling the warning
just because we're in a template would be going too far, I still want to
warn for

  template <typename>
  Dest withMove() {
    Dest x;
    return std::move(x);
  }

because the std::move therein will be pessimizing for any instantiation.

So I'm using the suppress_warning machinery to that effect.
Problem: I had to add a new group to nowarn_spec_t, otherwise
suppressing the -Wpessimizing-move warning would disable a whole bunch
of other warnings, which we really don't want.

	PR c++/89780

gcc/cp/ChangeLog:

	* pt.cc (tsubst_copy_and_build) <case CALL_EXPR>: Maybe suppress
	-Wpessimizing-move.
	* typeck.cc (maybe_warn_pessimizing_move): Don't issue warnings
	if they are suppressed.
	(check_return_expr): Disable -Wpessimizing-move when returning
	a dependent expression.

gcc/ChangeLog:

	* diagnostic-spec.cc (nowarn_spec_t::nowarn_spec_t): Handle
	OPT_Wpessimizing_move and OPT_Wredundant_move.
	* diagnostic-spec.h (nowarn_spec_t): Add NW_REDUNDANT enumerator.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/Wpessimizing-move3.C: Remove dg-warning.
	* g++.dg/cpp0x/Wredundant-move2.C: Likewise.
@ecatmur
Copy link
Owner Author

ecatmur commented Sep 14, 2022

improvements (h/t Mathias Stearn):

  • move reinterpret_cast<uintptr> to level 4 (or 4a), reserve level 1 for reinterpret_cast to pointer/reference types
  • check whether it warns on vector<Foo*>().emplace(aBarPtr) (hopefully -isystem doesn't get in the way)

ecatmur pushed a commit that referenced this pull request Oct 9, 2022
To improve compile times, the C++ library could use compiler built-ins
rather than implementing std::is_convertible (and _nothrow) as class
templates.  This patch adds the built-ins.  We already have
__is_constructible and __is_assignable, and the nothrow forms of those.

Microsoft (and clang, for compatibility) also provide an alias called
__is_convertible_to.  I did not add it, but it would be trivial to do
so.

I noticed that our __is_assignable doesn't implement the "Access checks
are performed as if from a context unrelated to either type" requirement,
therefore std::is_assignable / __is_assignable give two different results
here:

  class S {
    operator int();
    friend void g(); // #1
  };

  void
  g ()
  {
    // #1 doesn't matter
    static_assert(std::is_assignable<int&, S>::value, "");
    static_assert(__is_assignable(int&, S), "");
  }

This is not a problem if __is_assignable is not meant to be used by
the users.

This patch doesn't make libstdc++ use the new built-ins, but I had to
rename a class otherwise its name would clash with the new built-in.

	PR c++/106784

gcc/c-family/ChangeLog:

	* c-common.cc (c_common_reswords): Add __is_convertible and
	__is_nothrow_convertible.
	* c-common.h (enum rid): Add RID_IS_CONVERTIBLE and
	RID_IS_NOTHROW_CONVERTIBLE.

gcc/cp/ChangeLog:

	* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONVERTIBLE
	and CPTK_IS_NOTHROW_CONVERTIBLE.
	* cp-objcp-common.cc (names_builtin_p): Handle RID_IS_CONVERTIBLE
	RID_IS_NOTHROW_CONVERTIBLE.
	* cp-tree.h (enum cp_trait_kind): Add CPTK_IS_CONVERTIBLE and
	CPTK_IS_NOTHROW_CONVERTIBLE.
	(is_convertible): Declare.
	(is_nothrow_convertible): Likewise.
	* cxx-pretty-print.cc (pp_cxx_trait_expression): Handle
	CPTK_IS_CONVERTIBLE and CPTK_IS_NOTHROW_CONVERTIBLE.
	* method.cc (is_convertible): New.
	(is_nothrow_convertible): Likewise.
	* parser.cc (cp_parser_primary_expression): Handle RID_IS_CONVERTIBLE
	and RID_IS_NOTHROW_CONVERTIBLE.
	(cp_parser_trait_expr): Likewise.
	* semantics.cc (trait_expr_value): Handle CPTK_IS_CONVERTIBLE and
	CPTK_IS_NOTHROW_CONVERTIBLE.
	(finish_trait_expr): Likewise.

libstdc++-v3/ChangeLog:

	* include/std/type_traits: Rename __is_nothrow_convertible to
	__is_nothrow_convertible_lib.
	* testsuite/20_util/is_nothrow_convertible/value_ext.cc: Likewise.

gcc/testsuite/ChangeLog:

	* g++.dg/ext/has-builtin-1.C: Enhance to test __is_convertible and
	__is_nothrow_convertible.
	* g++.dg/ext/is_convertible1.C: New test.
	* g++.dg/ext/is_convertible2.C: New test.
	* g++.dg/ext/is_nothrow_convertible1.C: New test.
	* g++.dg/ext/is_nothrow_convertible2.C: New test.
split out reinterpret_cast to non-pointer non-reference, and insert it at level 4, bumping static_cast to level 5.

1. reinterpret_cast to pointer/reference
2. const_cast
3. downcast
4. reinterpret_cast to e.g. intptr_t
5. static_cast
@ecatmur
Copy link
Owner Author

ecatmur commented Oct 11, 2022

Defaults:

  • -Wfunctional-cast -> 4
  • -Wconversion -> 3
  • -Wall -> 4
  • -Wextra -> 5

@ecatmur
Copy link
Owner Author

ecatmur commented Oct 18, 2022

Next: add named versions of the levels (credit: Arthur O'Dwyer):

-Wfunctional-reinterpret-cast, -Wfunctional-const-cast, -Wfunctional-downcast, -Wfunctional-reference-cast, perhaps. The difference between 1 and 4 is tricky.

And then -Wfunctional-static-cast, -Wfunctional-cast implying all of the above, and -Wdangerous-functional-cast implying 1,2,3 but not 5 and IMO not 4 either.

@ecatmur
Copy link
Owner Author

ecatmur commented Nov 24, 2022

Also ref. clang-tidy google-readability-casting (although that doesn't have fine-grained control to distinguish between less- and more-safe uses); cf. https://reviews.llvm.org/D136156

ecatmur pushed a commit that referenced this pull request Feb 4, 2023
While looking at PR 105549, which is about fixing the ABI break
introduced in GCC 9.1 in parameter alignment with bit-fields, we
noticed that the GCC 9.1 warning is not emitted in all the cases where
it should be.  This patch fixes that and the next patch in the series
fixes the GCC 9.1 break.

We split this into two patches since patch #2 introduces a new ABI
break starting with GCC 13.1.  This way, patch #1 can be back-ported
to release branches if needed to fix the GCC 9.1 warning issue.

The main idea is to add a new global boolean that indicates whether
we're expanding the start of a function, so that aarch64_layout_arg
can emit warnings for callees as well as callers.  This removes the
need for aarch64_function_arg_boundary to warn (with its incomplete
information).  However, in the first patch there are still cases where
we emit warnings were we should not; this is fixed in patch #2 where
we can distinguish between GCC 9.1 and GCC.13.1 ABI breaks properly.

The fix in aarch64_function_arg_boundary (replacing & with &&) looks
like an oversight of a previous commit in this area which changed
'abi_break' from a boolean to an integer.

We also take the opportunity to fix the comment above
aarch64_function_arg_alignment since the value of the abi_break
parameter was changed in a previous commit, no longer matching the
description.

2022-11-28  Christophe Lyon  <[email protected]>
	    Richard Sandiford  <[email protected]>

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_function_arg_alignment): Fix
	comment.
	(aarch64_layout_arg): Factorize warning conditions.
	(aarch64_function_arg_boundary): Fix typo.
	* function.cc (currently_expanding_function_start): New variable.
	(expand_function_start): Handle
	currently_expanding_function_start.
	* function.h (currently_expanding_function_start): Declare.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: New test.
	* gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: New
	test.
	* gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: New test.
	* gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: New
	test.
	* gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: New test.
	* gcc.target/aarch64/bitfield-abi-warning.h: New test.
	* g++.target/aarch64/bitfield-abi-warning-align16-O2.C: New test.
	* g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: New
	test.
	* g++.target/aarch64/bitfield-abi-warning-align32-O2.C: New test.
	* g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: New
	test.
	* g++.target/aarch64/bitfield-abi-warning-align8-O2.C: New test.
	* g++.target/aarch64/bitfield-abi-warning.h: New test.
ecatmur pushed a commit that referenced this pull request Feb 4, 2023
This recognises the patterns of the form:
  while (n & 1) { n >>= 1 }

Unfortunately there are currently two issues relating to this patch.

Firstly, simplify_using_initial_conditions does not recognise that
	(n != 0) and ((n & 1) == 0) implies that ((n >> 1) != 0).

This preconditions arise following the loop copy-header pass, and the
assumptions returned by number_of_iterations_exit_assumptions then
prevent final value replacement from using the niter result.

I'm not sure what is the best way to fix this - one approach could be to
modify simplify_using_initial_conditions to handle this sort of case,
but it seems that it basically wants the information that ranger could
give anway, so would something like that be a better option?

The second issue arises in the vectoriser, which is able to determine
that the niter->assumptions are always true.
When building with -march=armv8.4-a+sve -S -O3, we get this codegen:

foo (unsigned int b) {
    int c = 0;

    if (b == 0)
      return PREC;

    while (!(b & (1 << (PREC - 1)))) {
        b <<= 1;
        c++;
    }

    return c;
}

foo:
.LFB0:
        .cfi_startproc
        cmp     w0, 0
        cbz     w0, .L6
        blt     .L7
        lsl     w1, w0, 1
        clz     w2, w1
        cmp     w2, 14
        bls     .L8
        mov     x0, 0
        cntw    x3
        add     w1, w2, 1
        index   z1.s, #0, #1
        whilelo p0.s, wzr, w1
.L4:
        add     x0, x0, x3
        mov     p1.b, p0.b
        mov     z0.d, z1.d
        whilelo p0.s, w0, w1
        incw    z1.s
        b.any   .L4
        add     z0.s, z0.s, #1
        lastb   w0, p1, z0.s
        ret
        .p2align 2,,3
.L8:
        mov     w0, 0
        b       .L3
        .p2align 2,,3
.L13:
        lsl     w1, w1, 1
.L3:
        add     w0, w0, 1
        tbz     w1, gcc-mirror#31, .L13
        ret
        .p2align 2,,3
.L6:
        mov     w0, 32
        ret
        .p2align 2,,3
.L7:
        mov     w0, 0
        ret
        .cfi_endproc

In essence, the vectoriser uses the niter information to determine
exactly how many iterations of the loop it needs to run. It then uses
SVE whilelo instructions to run this number of iterations. The original
loop counter is also vectorised, despite only being used in the final
iteration, and then the final value of this counter is used as the
return value (which is the same as the number of iterations it computed
in the first place).

This vectorisation is obviously bad, and I think it exposes a latent
bug in the vectoriser, rather than being an issue caused by this
specific patch.

gcc/ChangeLog:

	* tree-ssa-loop-niter.cc (number_of_iterations_cltz): New.
	(number_of_iterations_bitcount): Add call to the above.
	(number_of_iterations_exit_assumptions): Add EQ_EXPR case for
	c[lt]z idiom recognition.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/cltz-max.c: New test.
	* gcc.dg/tree-ssa/clz-char.c: New test.
	* gcc.dg/tree-ssa/clz-int.c: New test.
	* gcc.dg/tree-ssa/clz-long-long.c: New test.
	* gcc.dg/tree-ssa/clz-long.c: New test.
	* gcc.dg/tree-ssa/ctz-char.c: New test.
	* gcc.dg/tree-ssa/ctz-int.c: New test.
	* gcc.dg/tree-ssa/ctz-long-long.c: New test.
	* gcc.dg/tree-ssa/ctz-long.c: New test.
ecatmur pushed a commit that referenced this pull request Feb 4, 2023
Here the ahead-of-time overload set pruning in finish_call_expr is
unintentionally returning a CALL_EXPR whose (pruned) callee is wrapped
in an ADDR_EXPR, despite the original callee not being wrapped in an
ADDR_EXPR.  This ends up causing a bogus declaration mismatch error in
the below testcase because the call to min in #1 gets expressed as a
CALL_EXPR of ADDR_EXPR of FUNCTION_DECL, whereas the level-lowered call
to min in #2 gets expressed instead as a CALL_EXPR of FUNCTION_DECL.

This patch fixes this by stripping the spurious ADDR_EXPR appropriately.
Thus the first call to min now also gets expressed as a CALL_EXPR of
FUNCTION_DECL, matching the behavior before r12-6075-g2decd2cabe5a4f.

	PR c++/107461

gcc/cp/ChangeLog:

	* semantics.cc (finish_call_expr): Strip ADDR_EXPR from
	the selected callee during overload set pruning.

gcc/testsuite/ChangeLog:

	* g++.dg/template/call9.C: New test.
ecatmur pushed a commit that referenced this pull request May 17, 2023
Currently on xstormy16 SImode shifts by a single bit require two
instructions, and shifts by other non-zero integer immediate constants
require five instructions.  This patch implements the obvious optimization
that shifts by two bits can be done in four instructions, by using two
single-bit sequences.

Hence, ashift_2 was previously generated as:
        mov r7,r2 | shl r2,#2 | shl r3,#2 | shr r7,gcc-mirror#14 | or r3,r7
        ret
and with this patch we now generate:
        shl r2,#1 | rlc r3,#1 | shl r2,#1 | rlc r3,#1
        ret

2023-04-23  Roger Sayle  <[email protected]>

gcc/ChangeLog
	* config/stormy16/stormy16.cc (xstormy16_output_shift): Implement
	SImode shifts by two by performing a single bit SImode shift twice.

gcc/testsuite/ChangeLog
	* gcc.target/xstormy16/shiftsi.c: New test case.
ecatmur pushed a commit that referenced this pull request May 17, 2023
This patch contains some minor tweak to xstormy16's machine description
most significantly providing a pattern for HImode rotate left by a single
bit that requires only two instructions.

unsigned short foo(unsigned short x)
{
  return (x << 1) | (x >> 15);
}

currently with -O2 generates:
foo:    mov r7,r2
        shr r7,gcc-mirror#15
        shl r2,#1
        or r2,r7
        ret

with this patch, GCC now generates:
foo:	shl r2,#1 | adc r2,#0
        ret

Additionally neghi2 is converted to a define_insn (so that the RTL
optimizers see the negation semantics), and HImode rotations by
8-bits can now be recognized and implemented using swpb.

2023-04-29  Roger Sayle  <[email protected]>

gcc/ChangeLog
	* config/stormy16/stormy16.md (neghi2): Convert from a define_expand
	to a define_insn.
	(*rotatehi_1): New define_insn for efficient 2 insn sequence.
	(*rotatehi_8, *rotaterthi_8): New define_insn to emit a swpb.

gcc/testsuite/ChangeLog
	* gcc.target/xstormy16/neghi2.c: New test case.
	* gcc.target/xstormy16/rotatehi-1.c: Likewise.
ecatmur pushed a commit that referenced this pull request May 17, 2023
I noticed that for member class templates of a class template we were
unnecessarily substituting both the template and its type.  Avoiding that
duplication speeds compilation of this silly testcase from ~12s to ~9s on my
laptop.  It's unlikely to make a difference on any real code, but the
simplification is also nice.

We still need to clear CLASSTYPE_USE_TEMPLATE on the partial instantiation
of the template class, but it makes more sense to do that in
tsubst_template_decl anyway.

  #define NC(X)					\
    template <class U> struct X##1;		\
    template <class U> struct X##2;		\
    template <class U> struct X##3;		\
    template <class U> struct X##4;		\
    template <class U> struct X##5;		\
    template <class U> struct X##6;
  #define NC2(X) NC(X##a) NC(X##b) NC(X##c) NC(X##d) NC(X##e) NC(X##f)
  #define NC3(X) NC2(X##A) NC2(X##B) NC2(X##C) NC2(X##D) NC2(X##E)
  template <int I> struct A
  {
    NC3(am)
  };
  template <class...Ts> void sink(Ts...);
  template <int...Is> void g()
  {
    sink(A<Is>()...);
  }
  template <int I> void f()
  {
    g<__integer_pack(I)...>();
  }
  int main()
  {
    f<1000>();
  }

gcc/cp/ChangeLog:

	* pt.cc (instantiate_class_template): Skip the RECORD_TYPE
	of a class template.
	(tsubst_template_decl): Clear CLASSTYPE_USE_TEMPLATE.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants