Missed optimization: inline functions, when operations can be done with smaller bit width #124714

@Explorer09

Description

#include <stdbool.h>
#include <stdint.h>

static inline uint64_t saturating_sub_u64(uint64_t a, uint64_t b) {
    return a > b ? a - b : 0;
}

uint32_t test1a(uint32_t a, uint32_t b) {
    return a > b ? a - b : 0;
}

uint32_t test1b(uint32_t a, uint32_t b) {
    return (uint64_t)a > (uint64_t)b ? (uint64_t)a - (uint64_t)b : (uint64_t)0;
}

uint32_t test1c(uint32_t a, uint32_t b) {
    return saturating_sub_u64(a, b);
}

Expected result: test1a, test1b and test1c compile to the same code.

Actual result: test1a and test1b compile to the same code, but test1c produces slightly larger code, with unnecessary zero-extension operations.

This can be tested in Compiler Explorer.

x86-64 clang 19.1.0 with the -Os option produces:

test1b:
        xorl    %eax, %eax
        subl    %esi, %edi
        cmovael %edi, %eax
        retq

test1c:
        movl    %edi, %ecx
        movl    %esi, %edx
        xorl    %eax, %eax
        subq    %rdx, %rcx
        cmovaeq %rcx, %rax
        retq
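
The narrowing should be safe here because both arguments are zero-extended 32-bit values: the 64-bit comparison gives the same answer as the 32-bit one, and whenever a > b the difference is at most 0xFFFFFFFF, so the upper 32 bits of the result are always zero. Below is a minimal sketch that checks this on a few boundary values. The names sat_sub_narrow, sat_sub_via_u64 and count_mismatches are hypothetical helpers introduced for illustration only, and the sample set is not exhaustive.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit helper, as in the report. */
static inline uint64_t saturating_sub_u64(uint64_t a, uint64_t b) {
    return a > b ? a - b : 0;
}

/* Direct 32-bit form (same body as test1a). */
static uint32_t sat_sub_narrow(uint32_t a, uint32_t b) {
    return a > b ? a - b : 0;
}

/* Same operation routed through the 64-bit helper (same body as test1c). */
static uint32_t sat_sub_via_u64(uint32_t a, uint32_t b) {
    return (uint32_t)saturating_sub_u64(a, b);
}

/* Hypothetical checker: counts sample pairs where the two forms disagree. */
static unsigned count_mismatches(void) {
    static const uint32_t samples[] = {
        0u, 1u, 2u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFEu, 0xFFFFFFFFu
    };
    const size_t n = sizeof samples / sizeof samples[0];
    unsigned mismatches = 0;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (sat_sub_narrow(samples[i], samples[j]) !=
                sat_sub_via_u64(samples[i], samples[j])) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

Calling count_mismatches() from a test harness should return 0 if the two forms agree on every sampled pair, which is the property the optimizer would need in order to emit the 32-bit subl/cmovael sequence for test1c.
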

Related: GCC bug 118679
When I reported the bug to GCC, I included another example for which GCC missed the optimization but Clang somehow optimized it correctly (see the max_u64 and test2 functions in that bug report). The saturating_sub_u64 case is the one that Clang misses.
