folkertdev
Contributor

@folkertdev folkertdev commented Jul 27, 2025

tracking issue: #44930

For this example

#![feature(c_variadic)]

#[unsafe(no_mangle)]
unsafe extern "C" fn variadic(a: f64, mut args: ...) -> f64 {
    let b = args.arg::<f64>();
    let c = args.arg::<f64>();

    a + b + c
}

We currently generate (via llvm):

variadic:
    sub     sp, sp, #12
    stmib   sp, {r2, r3}
    vmov    d0, r0, r1
    add     r0, sp, #4
    vldr    d1, [sp, #4]
    add     r0, r0, #15
    bic     r0, r0, #7
    vadd.f64        d0, d0, d1
    add     r1, r0, #8
    str     r1, [sp]
    vldr    d1, [r0]
    vadd.f64        d0, d0, d1
    vmov    r0, r1, d0
    add     sp, sp, #12
    bx      lr

LLVM is not doing a good job. In fact, it's well-known that LLVM's implementation of va_arg is kind of bad, and we implement it ourselves (based on clang) for many targets already. For arm, our own emit_ptr_va_arg saves 3 instructions.
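To make the comparison concrete: on arm the va_list is a single pointer into the area where the variadic arguments were saved, so reading an argument boils down to "align the pointer, load, bump the pointer". The sketch below models that idea in safe Rust over a byte slice. It is only an illustration, not rustc's emit_ptr_va_arg: the names `VaCursor`, `align_up`, `arg_f64` and `sum_three` are made up for this example, and it assumes a little-endian layout with the start of the slice 8-byte aligned.

```rust
/// Hypothetical model of a pointer-style `va_list`: a cursor into the bytes
/// where the caller's variadic arguments were saved.
struct VaCursor<'a> {
    area: &'a [u8],
    offset: usize,
}

/// Round `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

impl<'a> VaCursor<'a> {
    fn new(area: &'a [u8]) -> Self {
        VaCursor { area, offset: 0 }
    }

    /// Read the next `f64`: align the cursor, load 8 bytes, bump the cursor.
    /// Assumption: the start of `area` is itself 8-byte aligned.
    fn arg_f64(&mut self) -> f64 {
        let start = align_up(self.offset, 8);
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(&self.area[start..start + 8]);
        self.offset = start + 8;
        f64::from_le_bytes(bytes)
    }
}

/// Mirrors the `variadic` example, with the argument area passed explicitly.
fn sum_three(a: f64, area: &[u8]) -> f64 {
    let mut args = VaCursor::new(area);
    let b = args.arg_f64();
    let c = args.arg_f64();
    a + b + c
}
```

When the offsets are known at compile time, as they are here, the align and bump steps can fold into constant offsets from the stack pointer, leaving only the loads and adds.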

Next, it turns out it's important for LLVM to explicitly start and end the lifetime of the va_list. In #146059 I already end the lifetime, but when looking at this again, I noticed that it is important to also start it, see https://godbolt.org/z/EGqvKTTsK: failing to explicitly start the lifetime uses an extra register.

So, the combination of emit_ptr_va_arg with starting/ending the lifetime makes rustc emit exactly the instructions that clang generates:

variadic:
    sub     sp, sp, #12
    stmib   sp, {r2, r3}
    vmov    d16, r0, r1
    vldr    d17, [sp, #4]
    vadd.f64        d16, d16, d17
    vldr    d17, [sp, #12]
    vadd.f64        d16, d16, d17
    vmov    r0, r1, d16
    add     sp, sp, #12
    bx      lr

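The arguments to emit_ptr_va_arg are based on the clang implementation: https://github.com/llvm/llvm-project/blob/03dc2a41f3d9a500e47b513de5c5008c06860d65/clang/lib/CodeGen/Targets/ARM.cpp#L798-L844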

r? @workingjubilee (I can re-roll if your queue is too full, but you do seem like the right person here)

try-job: armhf-gnu

@rustbot
Collaborator

rustbot commented Jul 27, 2025

workingjubilee is currently at their maximum review capacity.
They may take a while to respond.

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 27, 2025
@rustbot
Collaborator

rustbot commented Jul 27, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

@folkertdev folkertdev added the F-c_variadic `#![feature(c_variadic)]` label Jul 27, 2025
jhpratt added a commit to jhpratt/rust that referenced this pull request Aug 21, 2025
…-multiple-abis-arm, r=RalfJung,davidtwco

c-variadic: multiple ABIs in the same program for arm

similar to rust-lang#144379, but for arm, requested in rust-lang#144066.

Quoting rust-lang/reference#1946 (comment)

> `"aapcs"` specifically refers to the soft-float ABI where floating-point values are passed in integer registers.

However for c-variadic functions, `aapcs` behaves the same as `C`:

https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#65parameter-passing

> A variadic function is always marshaled as for the base standard.

https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#7the-standard-variants

> This section applies only to non-variadic functions. For a variadic function the base standard is always used both for argument passing and result return.
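As an illustration of what "passed in integer registers" means for an `f64`: under the base standard the 64-bit value travels as two 32-bit words in a pair of core registers, and the callee reassembles it (the `vmov d0, r0, r1` at the top of the assembly in rust-lang#144549). A minimal sketch in plain Rust, assuming a little-endian target where the first register of the pair holds the low word; the function names are invented for this example.

```rust
/// Split an `f64` into the two 32-bit words that would travel in a pair of
/// core registers. Assumption: little-endian, so the first word is the low half.
fn f64_to_core_regs(value: f64) -> (u32, u32) {
    let bits = value.to_bits();
    (bits as u32, (bits >> 32) as u32)
}

/// Reassemble the value from its low and high words, which is the job a
/// `vmov dN, r0, r1` performs on the callee side.
fn core_regs_to_f64(lo: u32, hi: u32) -> f64 {
    f64::from_bits(((hi as u64) << 32) | lo as u64)
}
```

Splitting and recombining is lossless, since it only moves bits around and never converts the value.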

---

I also noticed that rustc currently emits more instructions than clang for c-variadic functions on arm, see https://godbolt.org/z/hMce9rnTh. I'll fix that separately. (edit: rust-lang#144549)

try-job: armhf-gnu
r? `@RalfJung`
rust-timer added a commit that referenced this pull request Aug 22, 2025
Rollup merge of #144541 - folkertdev:c-variadic-same-program-multiple-abis-arm, r=RalfJung,davidtwco

github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Aug 22, 2025
…-abis-arm, r=RalfJung,davidtwco

bors added a commit that referenced this pull request Sep 2, 2025
explicitly end the lifetime of `va_list`

tracking issue: #44930
split out from: #144549

The `va_list` is created in the compiler itself when the variable argument list `...` is desugared, and hence the lifetime end is not inserted automatically. The value can't outlive the function in which it was created, so it is correct to end the lifetime here. Ending the lifetime explicitly also appears to give slightly better codegen in #144549.

I also included a little drive-by improvement to not cast pointers to integers and back again.

r? codegen
@folkertdev folkertdev changed the title improve va_arg assembly on arm targets match clang's va_arg assembly on arm targets Sep 2, 2025
@folkertdev folkertdev force-pushed the va-arg-arm branch 2 times, most recently from fc1a674 to fc58e08 Compare September 2, 2025 22:16
@folkertdev
Contributor Author

Jubilee is busy, so

r? codegen

@rustbot rustbot assigned saethlin and unassigned workingjubilee Sep 2, 2025
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Sep 3, 2025
explicitly end the lifetime of `va_list`

github-actions bot pushed a commit to rust-lang/compiler-builtins that referenced this pull request Sep 4, 2025
explicitly end the lifetime of `va_list`

@saethlin
Member

saethlin commented Sep 8, 2025

I appreciate your thorough PR description. Thanks ❤️

@bors r+

@bors
Collaborator

bors commented Sep 8, 2025

📌 Commit fa9e1f9 has been approved by saethlin

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 8, 2025
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Sep 8, 2025
match clang's `va_arg` assembly on arm targets

rust-bors bot added a commit that referenced this pull request Sep 8, 2025
match clang's `va_arg` assembly on arm targets

try-job: arm-android
try-job: armhf-gnu
@rust-bors

rust-bors bot commented Sep 8, 2025

☀️ Try build successful (CI)
Build commit: c3920c0 (c3920c014ad1d112e73c7b175abe2fc91c026ac1, parent: beeb8e3af54295ba494c250e84ecda4c2c5d85ff)

@folkertdev
Contributor Author

@rustbot ready

Android uses different instructions (it looks like it's softfloat?), so the test is ignored on Android now.

It is technically possible to make this a minicore test, but it would leak a bunch of implementation details, so I'm not sure that's better. Here is what that would look like:

//@ add-core-stubs
//@ assembly-output: emit-asm
//@ compile-flags: -Copt-level=3
//@ only-arm
//@ ignore-android
//@ ignore-thumb
#![feature(no_core, lang_items, intrinsics)]
#![no_core]
#![crate_type = "lib"]
#![feature(c_variadic)]

extern crate minicore;
use minicore::*;

#[lang = "va_list"]
struct VaList<'a> {
    ptr: *mut u8,
    _marker: PhantomData<&'a u8>,
}

impl VaList<'_> {
    unsafe fn arg<T: VaArgSafe>(&mut self) -> T {
        va_arg(self)
    }
}

trait VaArgSafe {}

impl VaArgSafe for f64 {}

#[rustc_intrinsic]
pub unsafe fn va_arg<T: VaArgSafe>(ap: &mut VaList<'_>) -> T;

#[rustc_intrinsic]
pub const fn fadd_algebraic<T: Copy>(x: T, y: T) -> T;

// Check that the assembly that rustc generates matches what clang emits.

#[unsafe(no_mangle)]
unsafe extern "C" fn variadic(a: f64, mut args: ...) -> f64 {
    // CHECK-LABEL: variadic
    // CHECK: sub sp, sp

    // CHECK: vldr
    // CHECK: vadd.f64
    // CHECK: vldr
    // CHECK: vadd.f64
    let b = args.arg::<f64>();
    let c = args.arg::<f64>();
    fadd_algebraic(fadd_algebraic(a, b), c)

    // CHECK: add sp, sp
}
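For readers unfamiliar with the `// CHECK:` comments: they are FileCheck directives, and each plain `CHECK` pattern has to match somewhere after the previous match, so the test pins down the order of the `vldr`/`vadd.f64` pairs without fixing registers or offsets. A rough sketch of that ordering rule in Rust follows. It is a simplified model rather than FileCheck itself: `canonicalize` and `checks_in_order` are hypothetical helpers, patterns are plain substrings, and `CHECK-LABEL` is treated like an ordinary `CHECK`.

```rust
/// Collapse runs of spaces and tabs into a single space on every line,
/// roughly like FileCheck's default handling of horizontal whitespace.
fn canonicalize(text: &str) -> String {
    text.lines()
        .map(|line| line.split_whitespace().collect::<Vec<_>>().join(" "))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Return true if every pattern occurs in `asm`, each one starting after the
/// end of the previous match.
fn checks_in_order(asm: &str, patterns: &[&str]) -> bool {
    let asm = canonicalize(asm);
    let mut pos = 0;
    for &pattern in patterns {
        let pattern = canonicalize(pattern);
        match asm[pos..].find(pattern.as_str()) {
            Some(found) => pos += found + pattern.len(),
            None => return false,
        }
    }
    true
}
```

Because whitespace is collapsed first, a pattern such as `sub sp, sp` matches the column-aligned `sub     sp, sp, #12` in the compiler output.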

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 8, 2025
@saethlin
Member

Can't this test use integers instead? That should work fine on softfloat targets.

@folkertdev
Contributor Author

Using integers doesn't really help: Android must be using an older arm version than even arm-unknown-linux-gnueabi, or otherwise restricts instruction selection. In any case, testing only standard arm should be sufficient. It checks that we emit the right thing there, and the implementation is otherwise equivalent to the one in clang, so it's a reasonable assumption that we'll generate good code for Android and thumb as well.

https://godbolt.org/z/v6bsdGssq

sum_two:
        sub     sp, sp, #16
        stmib   sp, {r1, r2, r3}
        add     r0, sp, #4
        ldr     r2, [r0, #4]!
        add     r0, r2, r1
        add     sp, sp, #16
        bx      lr

versus

sum_two:
        sub     sp, sp, #12
        push    {r11, lr}
        mov     r11, sp
        sub     sp, sp, #4
        add     r0, r11, #8
        stm     r0, {r1, r2, r3}
        add     r0, r11, #8
        ldr     r2, [r0, #4]!
        add     r0, r2, r1
        mov     sp, r11
        pop     {r11, lr}
        add     sp, sp, #12
        bx      lr

@saethlin
Member

@bors r+ rollup=iffy

@bors
Collaborator

bors commented Sep 12, 2025

📌 Commit 94fbb21 has been approved by saethlin

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 12, 2025
@bors
Collaborator

bors commented Sep 12, 2025

⌛ Testing commit 94fbb21 with merge 967830a...

bors added a commit that referenced this pull request Sep 12, 2025
match clang's `va_arg` assembly on arm targets

@bors
Collaborator

bors commented Sep 12, 2025

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Sep 12, 2025
@rust-log-analyzer
Collaborator

A job failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)
    Updating crates.io index
warning: spurious network error (3 tries remaining): [28] Timeout was reached (Operation too slow. Less than 10 bytes/sec transferred the last 30 seconds)
warning: spurious network error (3 tries remaining): [28] Timeout was reached (Operation too slow. Less than 10 bytes/sec transferred the last 30 seconds)
warning: spurious network error (2 tries remaining): [28] Timeout was reached (Operation too slow. Less than 10 bytes/sec transferred the last 30 seconds)
warning: spurious network error (1 try remaining): [28] Timeout was reached (Operation too slow. Less than 10 bytes/sec transferred the last 30 seconds)
error: failed to get `askama_parser` as a dependency of package `askama_derive v0.14.0`
    ... which satisfies dependency `askama_derive = "=0.14.0"` (locked to 0.14.0) of package `askama v0.14.0`
    ... which satisfies dependency `askama = "^0.14"` (locked to 0.14.0) of package `citool v0.1.0 (D:\a\rust\rust\src\ci\citool)`

Caused by:
  download of as/ka/askama_parser failed

Caused by:
  failed to download from `https://index.crates.io/as/ka/askama_parser`

Caused by:
  [28] Timeout was reached (Operation too slow. Less than 10 bytes/sec transferred the last 30 seconds)
##[error]Process completed with exit code 101.
Post job cleanup.

@saethlin
Member

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 12, 2025
Zalathar added a commit to Zalathar/rust that referenced this pull request Sep 12, 2025
match clang's `va_arg` assembly on arm targets

bors added a commit that referenced this pull request Sep 12, 2025
Rollup of 16 pull requests

Successful merges:

 - #144549 (match clang's `va_arg` assembly on arm targets)
 - #145660 (initial implementation of the darwin_objc unstable feature)
 - #145895 (thread parking: fix docs and examples)
 - #146308 (support integer literals in `${concat()}`)
 - #146323 (check before test for hardware capabilites in bits 32~63 of usize)
 - #146332 (tidy: make behavior of extra-checks more uniform)
 - #146374 (Update `browser-ui-test` version to `0.22.2`)
 - #146413 (Improve suggestion in case a bare URL is surrounded by brackets)
 - #146426 (Bump miow to 0.60.1)
 - #146432 (Implement `Socket::take_error` for Hermit)
 - #146433 (rwlock tests: fix miri macos test regression)
 - #146435 (Change the default value of `gcc.download-ci-gcc` to `true`)
 - #146439 (fix cfg for poison test macro)
 - #146448 ([rustdoc] Correctly handle literal search on paths)
 - #146449 (Fix `libgccjit` symlink when we build GCC locally)
 - #146455 (test: remove an outdated normalization for rustc versions)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit that referenced this pull request Sep 12, 2025
Rollup of 15 pull requests

Successful merges:

 - #144549 (match clang's `va_arg` assembly on arm targets)
 - #145895 (thread parking: fix docs and examples)
 - #146308 (support integer literals in `${concat()}`)
 - #146323 (check before test for hardware capabilites in bits 32~63 of usize)
 - #146332 (tidy: make behavior of extra-checks more uniform)
 - #146374 (Update `browser-ui-test` version to `0.22.2`)
 - #146413 (Improve suggestion in case a bare URL is surrounded by brackets)
 - #146426 (Bump miow to 0.60.1)
 - #146432 (Implement `Socket::take_error` for Hermit)
 - #146433 (rwlock tests: fix miri macos test regression)
 - #146435 (Change the default value of `gcc.download-ci-gcc` to `true`)
 - #146439 (fix cfg for poison test macro)
 - #146448 ([rustdoc] Correctly handle literal search on paths)
 - #146449 (Fix `libgccjit` symlink when we build GCC locally)
 - #146455 (test: remove an outdated normalization for rustc versions)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 48d6841 into rust-lang:master Sep 12, 2025
11 of 12 checks passed
@rustbot rustbot added this to the 1.91.0 milestone Sep 12, 2025
rust-timer added a commit that referenced this pull request Sep 12, 2025
Rollup merge of #144549 - folkertdev:va-arg-arm, r=saethlin

match clang's `va_arg` assembly on arm targets

tracking issue: #44930

For this example

```rust
#![feature(c_variadic)]

#[unsafe(no_mangle)]
unsafe extern "C" fn variadic(a: f64, mut args: ...) -> f64 {
    let b = args.arg::<f64>();
    let c = args.arg::<f64>();

    a + b + c
}
```

We currently generate (via llvm):

```asm
variadic:
    sub     sp, sp, #12
    stmib   sp, {r2, r3}
    vmov    d0, r0, r1
    add     r0, sp, #4
    vldr    d1, [sp, #4]
    add     r0, r0, #15
    bic     r0, r0, #7
    vadd.f64        d0, d0, d1
    add     r1, r0, #8
    str     r1, [sp]
    vldr    d1, [r0]
    vadd.f64        d0, d0, d1
    vmov    r0, r1, d0
    add     sp, sp, #12
    bx      lr
```

LLVM is not doing a good job. In fact, it's well-known that LLVM's implementation of `va_arg` is kind of bad, and we implement it ourselves (based on clang) for many targets already. For arm, our own `emit_ptr_va_arg` saves 3 instructions.

Next, it turns out it's important for LLVM to explicitly start and end the lifetime of the `va_list`. In #146059 I already end the lifetime, but when looking at this again, I noticed that it is important to also start it, see https://godbolt.org/z/EGqvKTTsK: failing to explicitly start the lifetime uses an extra register.

So, the combination of `emit_ptr_va_arg` with starting/ending the lifetime makes rustc emit exactly the instructions that clang generates:

```asm
variadic:
    sub     sp, sp, #12
    stmib   sp, {r2, r3}
    vmov    d16, r0, r1
    vldr    d17, [sp, #4]
    vadd.f64        d16, d16, d17
    vldr    d17, [sp, #12]
    vadd.f64        d16, d16, d17
    vmov    r0, r1, d16
    add     sp, sp, #12
    bx      lr
```

The arguments to `emit_ptr_va_arg` are based on [the clang implementation](https://github.com/llvm/llvm-project/blob/03dc2a41f3d9a500e47b513de5c5008c06860d65/clang/lib/CodeGen/Targets/ARM.cpp#L798-L844).

r? `@workingjubilee` (I can re-roll if your queue is too full, but you do seem like the right person here)

try-job: armhf-gnu
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Sep 13, 2025
Rollup of 15 pull requests
