Progress on toolchain security features

By Jake Edge
November 12, 2024

Over the years, there has been steady progress in adding security features to compilers and other tools to assist with hardening the Linux kernel (and, of course, other programs). In something of a tradition in the toolchains track at the Linux Plumbers Conference, Kees Cook and Qing Zhao have led a session on that progress and further plans; this year, they were joined by Justin Stitt (YouTube video).

Rust

Cook said that he would begin by talking about Rust, rather than sprinkling it throughout the talk, as he has in the past. It seemed easier, he said, to handle all of the Rust information on a single slide (slides). It is important to maintain parity between the security features of the GCC, Clang, and Rust compilers in order to avoid cross-language attacks. To that end, the arm64 software-based shadow call stack for Rust is getting close to being ready to merge into the kernel. Kernel control-flow-integrity (CFI) support for Rust is also in progress; it is forward-edge protection against subverting indirect calls.

There are several things that have not really been looked at yet, including zeroing registers used in calls and making structure layout randomization work with Rust. Cook said that the __counted_by() attribute, which is used to provide bounds information for flexible arrays, needs some investigation to see how it interacts with Rust code. He thinks that the information provided by __counted_by() will already be represented in the Rust bindings, so there will not be a need for any explicit handling on the Rust side, but that should be confirmed.

While Rust has native arithmetic-overflow handling, there is still not parity with the behavior of the C code. When an overflow occurs, the undefined behavior sanitizer (UBSAN) gives a different result than Rust does; "it would be nice to have one result".

Counted by

Moving on, the "big news from the last year" was all of the work done for __counted_by(), which identifies a structure member that is tracking the size of a flexible array in the structure. Once the support for the attribute landed in GCC and Clang, annotations needed to be added to the kernel, which has been done for 391 structures. People are adding more of these all the time, he said, and he hopes that it is becoming the default for any new flexible arrays.

Something that was "kind of a footnote" in last year's update (YouTube video) has been fixed in both GCC and Clang. Due to some odd differences between the C language specification and the compilers, unions could contain flexible arrays, as long as they specified [0] for the size, because of GCC and Clang extensions to handle that case. "Sort of accidentally", flexible arrays are not supposed to be allowed in unions at all, according to the specification, though. Meanwhile, removing the zero from the size specification (leaving just []), in order to modernize the code, has been part of the flexible-array cleanup work, but doing so in unions would not compile because of the specification. Now, modern flexible-array declarations will be accepted by the extensions, which "will simplify some really horrible hacks" in the kernel to work around the problem.

Next up was a slide showing the progress on getting flags for setting the stack-protector-guard location added to GCC and Clang for five different architectures; this will allow having different stack canaries for each process. He had planned to talk about the lack of progress for Clang on the RISC-V and PowerPC architectures over the last four years, as all of the other architecture-compiler combinations have been completed, but he had "accidentally nerd-sniped" Keith Packard into fixing those holes, so that work is now in progress. "My goal is to not have to show this slide next year", he said with a chuckle.

Control-flow integrity

There is work needed on the forward-edge CFI support, since no progress has been made on that over the last year. The support for hardware CFI protection is basically done, and has been for a while, but the more fine-grained, per-function-prototype software-based CFI protection needs to be added to GCC. Zhao said that she had talked with some RISC-V developers at GNU Tools Cauldron who are working on GCC support, starting from the arm64 patch set that Cook had mentioned; some of that work may be applicable beyond RISC-V. Packard suggested that the arm32 pointer authentication code (PAC) extension could be used for CFI protection on that architecture; Cook was not opposed, but arm32 is not a major area of focus for him.

Similarly, backward-edge CFI has stalled since last year. The hardware support exists for both x86 and arm64. Getting x86 hardware support for shadow call stacks to be used internally by the kernel looks difficult, though Peter Zijlstra said that the Intel Flexible Return and Event Delivery (FRED) feature might provide a mechanism for that. So far, Cook said, there has been no work on creating a software-based hash-checking scheme for backward-edge CFI, similar to KCFI that is used for forward-edge protection.

Pointers and bounds

There is work needed in order to add __counted_by() for general pointer values; both GCC and Clang have started working on support. It is somewhat related to the -fbounds-safety proposal from Apple that he would go into more detail about later in the session.

Zhao came forward to talk about the work being done on pointer bounds in GCC. There are two main cases, pointers inside structures, where the bound is contained in another structure field, and pointers passed as arguments, where the bound is also passed as an argument. The second case is already handled in GCC by the access attribute, so GCC developers will focus on the first case. She has discussed adding __counted_by() for pointers in structures with the GCC maintainers, who are amenable to that approach. Nick Alcock said that access is not currently used to generate warnings for exceeding the bounds, however; it is, instead, "a promise to the optimizer" that the bounds will not be exceeded, which is somewhat different. Zhao agreed that a different attribute might be needed for checking the bounds on pointer arguments.

Cook returned to talk about a problem that has been found as the __counted_by() annotations have been added to the kernel. In many cases, the structures holding a flexible array are allocated at run time, but there is a need to ensure that the field being used for bounds checking gets initialized at the same time that the allocation is done. He does not like to have manually repeated information that must be kept in sync in two places in the kernel, so he is working with the GCC and Clang developers to get another to extension to __counted_by() so that the allocators can set the counter without knowing which field, if any, is being used for bounds.

The __builtin_counted_by_ref() intrinsic function will return a pointer to the field used in __counted_by() or NULL if there is no bounds-checking annotation. That allows wrappers to be written for allocators that will initialize the count if there is one at the time the allocation is done. So the __counted_by() annotations can be made "without also having to go and check all of the allocation sites to make sure that the counter has been set before we are accessing the array". That means a wrapper can be used for all structures that have flexible arrays and when __counted_by() gets added to the structure, "it magically gets the size added as well"; both GCC and Clang are working on adding the feature.

Zhao said that the GCC developers have been discussing the return type for __builtin_counted_by_ref(), in particular for the case where it returns NULL because there is no __counted_by() information. Originally the idea was to return a NULL size_t pointer, but it was decided that a (void *)NULL would ease the use of the feature, since some counting fields may not be size_t. That has implications for the example wrapper that he showed, Cook said, because the void pointer cannot be used to set the count when there is a __counted_by() attribute. So "a little bit more trickery" needs to be added, which has been done, and works; it "generates better code but it's a little ugly to read".

The Apple -fbounds-safety proposal that Cook mentioned earlier has "a huge number of things covering all aspects of gaps in C's bounds safety", he said, including annotations for arrays that are counted by elements or bytes, as well as those terminated by some constant (e.g. NUL-terminated strings), and more. The proposal came up when the request was made to the Clang developers for a way to annotate flexible arrays. It is much more ambitious than just that; Cook thinks the kernel and GCC will want to adopt more of those annotations, but he has been focusing on the low-hanging fruit.

There is also work needed to clarify the -Warray-bounds warnings issued by GCC, so that the flag can be used in kernel builds. The kernel "unintentionally constructs code that the compiler sees as obviously incorrect", but the warnings do not really help clarify what the problem is. Zhao said that due to GCC value tracking (which is done for optimization purposes), the array-bounds checking can find real problems in the code, but the current diagnostics make them look like false positives. She has done some work to make the warnings more understandable for developers.

Arithmetic overflow

There is a question about what to do for unexpected arithmetic wraparound (or overflow) in the kernel, Stitt said. If you filter CVEs based on overflow and wraparound problems, there are multiple entries for the kernel; "if we could turn on the overflow sanitizers, then we're increasing protection". But the kernel essentially makes signed integer overflow into defined behavior with the -fno-strict-overflow flag, so UBSAN does not flag overflows. In Clang 19, the signed-integer-overflow sanitizer has been changed so that it works with -fwrapv (which is enabled by -fno-strict-overflow), though Zijlstra called that a bug. Stitt said that one could argue that it is a bug, since overflow is not an undefined behavior for the kernel, but the change was made with an eye toward detecting unexpected overflows, which often cause problems.

The kernel has "what I'll call 'overflow-dependent code patterns'", such as code that is explicitly checking for overflow (e.g. a + b < a) in order to handle it in some fashion. The sanitizer will complain about that, even though it is not a real problem, so there are "idiom exclusions" that have been added to Clang 19 to ignore certain types of code. For example, -1UL "will always overflow", so it would cause a complaint, making the sanitizer useless for the kernel; the Clang overflow pattern exclusions will tell the sanitizer to step out of the way for three specific patterns.

Zijlstra said that he would like to see a qualifier akin to const or volatile that could be attached to variables that should not wrap. Stitt said he agreed but that the compiler developers are moving away from that solution. Cook noted that he and Zijlstra have a fundamental disagreement on how to get coverage for unexpected overflow; he thinks "we have to mark the expected places where we're wrapping so that all the rest will get caught", though Zijlstra disagrees with that approach.

One way to proceed that works around the (somewhat conflicting) objections from compiler and kernel developers would be to treat certain types, size_t for example, differently in the sanitizer. Zijlstra did not like turning certain types into "magic" types; having a qualifier would allow specifying that behavior more widely. Cook would like to try to make some progress given the current constraints and noted that the type filtering is not mutually exclusive with adding a qualifier later once the usefulness of the feature can be shown. Stitt pointed out that the Clang feature (possibly slated for Clang 20) is not hard-coded in the compiler; it will, instead, be some kind of configuration for the build system. So it will not appear at the source-code level, but will still be controlled by the kernel community.

Track organizer José Marchesi noted that the unexpected overflow items were listed as "needed" for GCC, so he wondered if they had all been added to the project's Bugzilla. Cook said that those items had not been added due to "some existing fundamental disagreements about the word 'undefined', because 'undefined' has a very well defined meaning", he said to chuckles. Some of that needs to be resolved before progress can be made for GCC. He would also like to get some proof of the feature's usefulness, which will also help smooth the path.

Stitt said that GCC still needs an unsigned-integer-overflow sanitizer. Zhao said that she has raised the overflow/sanitizer problems in the GCC community, so the developers are aware of the issues. The major problem is with the idiom exclusions; she talked about it at Cauldron and there was a lot of resistance to the feature. She agrees: "I don't like it, I think it's a hack". She has some ideas for other ways to approach the problem, but needs to think some more on them. Cook closed the session by noting that the purpose of these sessions is to get the discussion going; he does not claim that the right solutions have been found, but hopes to make some progress on addressing the known problem areas.

[ I would like to thank LWN's travel sponsor, the Linux Foundation, for travel assistance to Vienna for the Linux Plumbers Conference. ]

Index entries for this article
Kernel	Security/Kernel hardening
Security	Linux kernel/Hardening
Conference	Linux Plumbers Conference/2024

to post comments

Idiom exclusion is really so important

Posted Nov 13, 2024 8:06 UTC (Wed) by error27 (subscriber, #8346) [Link] (21 responses)

Whenever there is a compiler warning, hopefully, the warnings are mostly useful but there are always going to be some which are not useful. The idiom exclusion feature cuts down on the non-useful warnings.

For example, it's mostly useful to have a warning about unsigned comparisons with zero. But when the code looks has both and upper and lower bound like this: "if (x < 0 || x >= 10) {", then the warning is not useful. The code works 100% as intended.

There are different strategies you could use to silence the warning. In the unsigned example, you could just delete the "x < 0" comparison. But that's work which provides no benefit and arguably makes the code slightly less readable.

For the integer overflows, presumably you would add a annotation like "if (wrap_ok(x + y) < x)". And actually I don't hate that too much... But probably other people do, and it's also a lot of work to do it retroactively over a giant project like the kernel. It could be avoided if we use idiom exclusions.

I understand why the GCC developers don't like idiom exclusions but compiler warnings are already a mostly random collection of warnings and they already miss 95% of bugs. If the code looks fine, then it's okay to not print a warning.

Idiom exclusion is really so important

Posted Nov 13, 2024 9:31 UTC (Wed) by pm215 (subscriber, #98099) [Link]

Coverity has a particularly irritating version of that "unsigned comparison with zero in a bounds check" warning, where it will do so when the type being bounds checked is an enum. Unless I have got confused (not unlikely where C standard minutiae are concerned :-)) enums with no negative values may be signed or unsigned as the compiler chooses, so for portability the greater-than-or-equal-to-zero check is necessary, but Coverity gripes about it because the implementation happens to have used an unsigned type for the enum...

Idiom exclusion is really so important

Posted Nov 13, 2024 9:44 UTC (Wed) by magfr (subscriber, #16052) [Link] (19 responses)

I have a hand time understanding why

x+y<x

is considered easier to read and better than

INT_MAX-y<=x

Sure, the former is type agnostic and naïvely generates smaller code but I would expect both to end up checking the flag result of x+y.

Idiom exclusion is really so important

Posted Nov 13, 2024 10:31 UTC (Wed) by intelfx (subscriber, #130118) [Link] (7 responses)

> I have a hand time understanding why
>
> x+y<x
>
> is considered easier to read and better than
>
> INT_MAX-y<=x

It is considered better in the eyes of the language lawyers. Which, apparently, is enough to trump any kind of readability concerns.

I'd ask a different question, though. With the amount of compiler extensions already part of "kernel C" (some of which are even in this article), I'm having a hard time understanding reasons for not adding a `__builtin_wraps(expr)` kind of thing that explicitly evaluates `expr` under `-fwrapv` rules and tests that for overflow.

Surely an `if (__builtin_wraps(a + b))` is more readable than both `if (a + b < a)` and `if (INT_MAX - b <= a)`?

Idiom exclusion is really so important

Posted Nov 13, 2024 10:36 UTC (Wed) by pm215 (subscriber, #98099) [Link] (5 responses)

The compilers have had that for a while: see __builtin_add_overflow_p in https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Built...

Idiom exclusion is really so important

Posted Nov 13, 2024 10:40 UTC (Wed) by intelfx (subscriber, #130118) [Link] (4 responses)

Nope. Sorry, these builtins are a just a planetary abomination. The identity of the arithmetic operation is completely erased, making it absolutely in-obvious that it's in fact an arithmetic operation rather than a random function call.

It needs to be a `__some_builtin(expr)`, where `expr` is a **regular** arithmetic expression using **regular** + and - symbols.

Idiom exclusion is really so important

Posted Nov 13, 2024 10:50 UTC (Wed) by pm215 (subscriber, #98099) [Link] (2 responses)

I certainly wouldn't use the naked builtin directly, but that's true of most GCC builtins. But I'll absolutely take "looks like a function call and has 'add' as a word rather than '+' as a symbol, but is extremely obviously doing an overflow check" over "has '+' in the expression but is ridiculously opaque about what it is actually doing", especially when the thing it is doing is important for correctness and often for security.

No objections if you want to get nicer facilities added to the compiler, but in the interim I'll use the ones we have. And if I did have to work with a compiler where I had to use "x + y < x" I would wrap it in a function so I could give it a clearer name...

Idiom exclusion is really so important

Posted Nov 13, 2024 11:09 UTC (Wed) by intelfx (subscriber, #130118) [Link] (1 responses)

> But I'll absolutely take "looks like a function call and has 'add' as a word rather than '+' as a symbol, but is extremely obviously doing an overflow check" over "has '+' in the expression but is ridiculously opaque about what it is actually doing"

Who was saying anything about being opaque?

I just proposed a `__builtin_overflows(expr)` thing which makes it equally obvious that it is a test for overflow. **And** that it is an arithmetic expression.

All those builtins disguising arithmetic operations as function calls — no. Just no. The arithmetics are infix for a reason. This is C, not some kind of a Lisp or a reverse Forth.

Idiom exclusion is really so important

Posted Nov 13, 2024 11:20 UTC (Wed) by pm215 (subscriber, #98099) [Link]

You seemed to me to be proposing that until this suggestion of yours is implemented in compilers we should continue to open-code "if (x + y < x)". That open-coding is what I am describing as opaque.

Your idea is clearly nicer than the existing builtins, but we don't have it yet and in the best case won't have it widely available for years, so the question of what is most legible and least bug prone given current compiler facilities still matters, I think.

Idiom exclusion is really so important

Posted Nov 13, 2024 11:01 UTC (Wed) by error27 (subscriber, #8346) [Link]

Yeah. That would be nice. My idea is that it would be scope based. The compiler would trap any integer overflows and goto the label. I've left off \ characters and probably made typos but you get the idea:

#define saturate(math...) (__attribute__(overflow overflowed_label) {
size_t result;
bool overflow = false;
result = (math);
goto done;

overflowed_label:
overflow = true;
done:
overflow ? SIZE_MAX : result;
})

Idiom exclusion is really so important

Posted Nov 13, 2024 10:37 UTC (Wed) by intelfx (subscriber, #130118) [Link]

Ah, apologies, I apparently misread that. My brain thought that you just missed a "not" somewhere in there because I didn't realize someone could be asking _that_ question.

Well, `if (a + b < a)` preserves the identity of participating arithmetic operations. When you see a `if (a + b < a)`, it is immediately obvious that you are dealing with a sum of `a` and `b`, and if you are familiar with wrapping arithmetic, it's also almost immediately parsable as checking for overflow.

On the other hand, `if (INT_MAX - a <= b)` does not preserve neither the identity of the operation, nor the identity of operands. You need significant brain time to mentally carry over the operands to one side, invert the signs and realize that we are talking about `a + b`, not some weird arithmetic expression with unrelated `a`, `b` and a constant.

Personally, it's a no-brainer which one is clearer.

Idiom exclusion is really so important

Posted Nov 13, 2024 10:32 UTC (Wed) by pm215 (subscriber, #98099) [Link] (2 responses)

Personally I think something like add_would_overflow(x, y) is easier to read than both and less susceptible to "looks almost like the idiom but isn't quite right" errors. (And you can implement it via the compiler builtins.)

Idiom exclusion is really so important

Posted Nov 13, 2024 10:54 UTC (Wed) by adobriyan (subscriber, #30858) [Link] (1 responses)

Half of the time you want to both add and bail. Another half is to only check because operation in question will be done much later
and passing the result is cumbersome or not necessary.

The _right_ way is to not invent new idioms (which every project will do differently) but to use __builtin_add_overflow().

T len;
if (__builtin_add_overflow(a, b, &len)) { return -E; };
[use len]

or (because clang doesn't do BAO_p)

if (__builtin_add_overflow(a, b, &(int[1]){})) { return -E; };

Those who fear that BAO does addition differently and will make people break glass and cut hands may want to lobby for a warning if BAO is done on different types (especially with both different signedness _and_ length).

BAO is cool for checking if result fits into some other type:

if (__builtin_add_overflow(a, 0, &(T[1]){})

Just use BAO.

Idiom exclusion is really so important

Posted Nov 27, 2024 23:58 UTC (Wed) by jwakely (subscriber, #60262) [Link]

>The _right_ way is to not invent new idioms (which every project will do differently) but to use __builtin_add_overflow().

It's in C23 as chk_add so instead of inventing your own, write chk_add as a simple wrapper around bao and then everybody agrees on a standard API.

N.B. chk_add has the output pointer first, not last. Otherwise it's the same.

Idiom exclusion is really so important

Posted Nov 14, 2024 16:13 UTC (Thu) by anton (subscriber, #25547) [Link] (7 responses)

I would expect both to end up checking the flag result of x+y.

Looking at <[email protected]>, gcc-10 generates the "naïve" code (on AMD64 9 bytes for "b<b=1" and 14 bytes for "b==LONG_MIN"), while gcc-12 generates the longer code from both idioms. Neither checks the flag result. You have to explicitly use __builtin_sub_overflow(b,1,&c) to get that, and then both versions generate a 6-byte sequence using add instead of the 5-byte sequence using dec. Whatever goes on in the minds of the gcc maintainers, there seem to be things on their agenda that are more important than generating short and fast code.

Idiom exclusion is really so important

Posted Nov 14, 2024 16:51 UTC (Thu) by pizza (subscriber, #46) [Link] (6 responses)

> Whatever goes on in the minds of the gcc maintainers, there seem to be things on their agenda that are more important than generating short and fast code.

As the saying goes, "patches welcome"

Idiom exclusion is really so important

Posted Nov 15, 2024 8:53 UTC (Fri) by anton (subscriber, #25547) [Link] (5 responses)

Are you speaking for the gcc maintainers?

If not, what makes you think so? What makes you think that they would welcome a patch that reverts the change that pessimises the "b<b-1" case rather than exercising one of the justification mechanisms for their deeds, as they have done in other cases? Or simply ignoring the issue as in Bug 93811 where I provided working code (not a patch, though).

If yes, why should I do for free what most of you are paid to do? Especially given your lack of fulfilling your part of the deal, as discussed above.

Idiom exclusion is really so important

Posted Nov 15, 2024 12:15 UTC (Fri) by pizza (subscriber, #46) [Link] (4 responses)

> Are you speaking for the gcc maintainers?

I'm speaking as a maintainer of completely unrelated free software.

> If not, what makes you think so? What makes you think that they would welcome a patch that reverts the change that pessimises the "b<b-1" case rather than exercising one of the justification mechanisms for their deeds, as they have done in other cases? Or simply ignoring the issue as in Bug 93811 where I provided working code (not a patch, though).

What makes you think that complaining (and disparaging the GCC maintainers) on LWN is remotely productive or useful?

Everyone involved with any free software project has far, far more things on their to-do lists than they can possibly accomplish; consequently they prioritize the things they care about.

If the GCC maintainers' priorities differ from yours, it is incumbent upon *you* do step up and contribute in some way (eg code/patches, funding, or advocacy). Either way, disparaging the maintainers (and/or other active contributors) isn't going to accomplish anything productive.

> If yes, why should I do for free what most of you are paid to do? Especially given your lack of fulfilling your part of the deal, as discussed above.

There is no "deal" here, or at least not one with *you*.

...The ones that pay the maintainers (if they are being paid at all -- most aren't!) get to decide the priorities.

You don't like that? Patches welcome. Heck, you can even keep that patch entirely to yourself if the GCC maintainers don't like it.

"Patches welcome" means "Shut up!"

Posted Nov 15, 2024 13:07 UTC (Fri) by anton (subscriber, #25547) [Link] (3 responses)

What makes you think that complaining (and disparaging the GCC maintainers) on LWN is remotely productive or useful?

My comment was productive or useful by correcting the expectation of magfr.

Everyone involved with any free software project has far, far more things on their to-do lists than they can possibly accomplish; consequently they prioritize the things they care about.

Certainly. And, as I wrote, some gcc maintainer found the time to recognize the "b<b-1" idiom, and found the time to transform it into longer code even when gcc is invoked with -Os. So what makes you think that submitting a patch that reverts that change would achieve anything remotely productive or useful?

If the GCC maintainers' priorities differ from yours, it is incumbent upon *you* do step up and contribute in some way (eg code/patches, funding, or advocacy).

You think that I should provide a patch for a project maintained by people who will ignore the patch because their priorities differ from mine? Why should that be remotely productive or useful?

You think I should provide funding or advocacy for a project where the maintainers' priorities differ from mine? I can see that the gcc maintainers would like that, but why should I?

There is no "deal" here, or at least not one with *you*.

In that case, what did you want to say with "patches welcome"? As used by you it apparently means "I don't like what you are writing and I wish you would shut up, so I suggest a wild-goose chase". Your last paragraph makes that even clearer.

"Patches welcome" means "Shut up!"

Posted Nov 15, 2024 14:35 UTC (Fri) by pizza (subscriber, #46) [Link] (2 responses)

> Certainly. And, as I wrote, some gcc maintainer found the time to recognize the "b<b-1" idiom, and found the time to transform it into longer code even when gcc is invoked with -Os. So what makes you think that submitting a patch that reverts that change would achieve anything remotely productive or useful?

Because there may be very good intentional reasons for that change in GCC? Or maybe this is an unintentional side effect of some other fix/improvement and can itself be fixed incrementally with sufficient elbow grease?

(I can say from experience that micro-optimizations like that rarely make a meaningful difference in code size/performance. Unless you're at the point of truly counting/shaving bytes, in which case you are going to have a real budget to throw at this)

> You think that I should provide a patch for a project maintained by people who will ignore the patch because their priorities differ from mine? Why should that be remotely productive or useful?

You would create the patch because it makes things that much better for *you*. Sharing/pushing it upstream is entirely optional.

> You think I should provide funding or advocacy for a project where the maintainers' priorities differ from mine? I can see that the gcc maintainers would like that, but why should I?

I meant advocate for *your* use case/problems, not advocate for the project in general. (And "advocate" means "persuade the audience", not "crap all over them")

But in a more general sense, very little out there is ever fully aligned with one's personal priorities. Welcome to life.

> In that case, what did you want to say with "patches welcome"? As used by you it apparently means "I don't like what you are writing and I wish you would shut up, so I suggest a wild-goose chase". Your last paragraph makes that even clearer.

...You're the one expecting other folks to care about and perform (potentially very specialized) work on your behalf because *someone else* is paying them to work on other use cases. That's some pretty serious entitlement.

"Patches welcome" means "Shut up!"

Posted Nov 15, 2024 18:41 UTC (Fri) by anton (subscriber, #25547) [Link] (1 responses)

I can say from experience that micro-optimizations like that rarely make a meaningful difference in code size/performance.

In that case, why have they made the change at all? Or many other changes? The usual claim is that each transformation by itself rarely makes any meaningful difference, but they add a whole lot of them, and together they usually result in smaller/faster code. And it's certain that you cannot generate smaller code by compiling an individual idiom into larger code.

You would create the patch because it makes things that much better for *you*. Sharing/pushing it upstream is entirely optional.

I would not know how. I distribute free software, i.e., as source code. How would a patch that only I use make things better for me, much less "much better"? Because I can then answer bug reports with "works for me"? That's not how I prefer to work.

You're the one expecting other folks to care about and perform (potentially very specialized) work on your behalf because *someone else* is paying them to work on other use cases.

What makes you think so? Concerning gcc, I have mostly given up on the idea that they feel bound by the social contract where users find and report bugs and gcc maintainers fix them, so no, I don't expect them to do anything, and therefore I have mostly stopped reporting gcc bugs.

Concerning people who pay them, I doubt that Intel pays for gcc development with the intention that they implement transformations that make the code longer when using -Os.

"Patches welcome" means "Shut up!"

Posted Nov 16, 2024 10:46 UTC (Sat) by joib (subscriber, #8541) [Link]

As a nowadays inactive maintainer of a small corner of gcc, and speaking only for myself, largely I agree with what "pizza" has said in this subthread. GCC is a large project with a lot of stakeholders, and nobody gets exactly what they want. Those who put in more, be it in terms of funding developers, or individual contributors putting in elbow grease to scratch their own itches, get out more than those who contribute less.

For some specific points you made:

> In that case, why have they made the change at all? Or many other changes? The usual claim is that each transformation by itself rarely makes any meaningful difference, but they add a whole lot of them, and together they usually result in smaller/faster code. And it's certain that you cannot generate smaller code by compiling an individual idiom into larger code.

Nobody intentionally pessimizes code, unless it's to fix some correctness issue. There's obviously no conspiracy against people with passionate opinions about generating optimal code for "b < b - 1". Without looking into specifics, my guess would be that this is an unintentional side-effect of some other work, and the effect it caused was insignificant in size/performance on the benchmark suites that are used to track the development of the compiler, mainly SPECcpu, but also other similar application benchmarks. Those are ultimately what people tend to care about, not some particular micro-optimization (unless that particular micro-optimization affects the size and/or performance of some larger application).

That is not to say micro-optimizations aren't useful; they certainly are used for analyzing the impact of some particular optimization passes etc. But again, ultimately what matters to users is the size/performance of complete applications. In some cases it can mean a code transformation that pessimizes some particular aspect (like generating optimal code for "b < b - 1") but makes some other code faster, for an ultimately positive result on SPEC, gets merged.

> Concerning gcc, I have mostly given up on the idea that they feel bound by the social contract where users find and report bugs and gcc maintainers fix them, so no, I don't expect them to do anything, and therefore I have mostly stopped reporting gcc bugs.

Like so many other projects, GCC has many more bug reports than it has developers able to fix them. Thus bugs have to be triaged, and if it so happens that your favourite bug isn't considered important enough to get the attention of the very limited developer time, then unfortunately it means it will languish. If you're unable to convince the developers that your bug is very important and needs attention, then other ways around that conundrum are to fix it yourself or pay someone to do it, hence 'patches welcome'. To the extent that can be construed as "Shut up!", it's pointing out that complaining about poor code generation on comp.arch, lwn, or some other forum, fun as it may be as a way to pass your free time, is unlikely to cause other people to step up and fix your issue.

> Concerning people who pay them, I doubt that Intel pays for gcc development with the intention that they implement transformations that make the code longer when using -Os.

Very true, but it's also likely true that out of the near infinite ways the compiler can be improved, Intel doesn't consider generating optimal code for "b < b - 1" to be particularly high on the list of priorities.

No only somewhat different

Posted Nov 16, 2024 0:05 UTC (Sat) by SLi (subscriber, #53131) [Link]

> Nick Alcock said that access is not currently used to generate warnings for exceeding the bounds, however; it is, instead, "a promise to the optimizer" that the bounds will not be exceeded, which is somewhat different.

That doesn't sound like "somewhat different", but the exact opposite. The first asks the compiler to check; the second makes a promise to a compiler, allowing it to silently do rubbish if the promise doesn't hold. The first one turns (some) erroneous or undefined behavior into a warning. The second one turns correct-at-the-language-level behavior into UB.

The first one is good for diagnostics and checking. The second one is the opposite; it is good for optimization when you know something cannot happen and want the compiler to use that information as much as it can.