match clang's `va_arg` assembly on arm targets #144549
Some changes occurred in compiler/rustc_codegen_ssa
Rollup merge of #144541 - folkertdev:c-variadic-same-program-multiple-abis-arm, r=RalfJung,davidtwco

c-variadic: multiple ABIs in the same program for arm

similar to #144379, but for arm, requested in #144066.

Quoting rust-lang/reference#1946 (comment)

> `"aapcs"` specifically refers to the soft-float ABI where floating-point values are passed in integer registers. However for c-variadic functions, `aapcs` behaves the same as `C`:

https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#65parameter-passing

> A variadic function is always marshaled as for the base standard.

https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#7the-standard-variants

> This section applies only to non-variadic functions. For a variadic function the base standard is always used both for argument passing and result return.

---

I also noticed that rustc currently emits more instructions than clang for c-variadic functions on arm, see https://godbolt.org/z/hMce9rnTh. I'll fix that separately. (edit: #144549)

try-job: armhf-gnu

r? `@RalfJung`
explicitly end the lifetime of `va_list`

tracking issue: #44930

split out from: #144549

The `va_list` is created in the compiler itself when the variable argument list `...` is desugared, and hence the lifetime end is not inserted automatically. The value can't outlive the function in which it was created, so it is correct to end the lifetime here. Ending the lifetime explicitly also appears to give slightly better codegen in #144549.

I also included a little drive-by improvement to not cast pointers to integers and back again.

r? codegen
Jubilee is busy, so r? codegen
I appreciate your thorough PR description. Thanks ❤️

@bors r+
match clang's `va_arg` assembly on arm targets

tracking issue: #44930

For this example

```rust
#![feature(c_variadic)]

#[unsafe(no_mangle)]
unsafe extern "C" fn variadic(a: f64, mut args: ...) -> f64 {
    let b = args.arg::<f64>();
    let c = args.arg::<f64>();

    a + b + c
}
```

We currently generate (via llvm):

```asm
variadic:
    sub sp, sp, #12
    stmib sp, {r2, r3}
    vmov d0, r0, r1
    add r0, sp, #4
    vldr d1, [sp, #4]
    add r0, r0, #15
    bic r0, r0, #7
    vadd.f64 d0, d0, d1
    add r1, r0, #8
    str r1, [sp]
    vldr d1, [r0]
    vadd.f64 d0, d0, d1
    vmov r0, r1, d0
    add sp, sp, #12
    bx lr
```

LLVM is not doing a good job. In fact, it's well-known that LLVM's implementation of `va_arg` is kind of bad, and we implement it ourselves (based on clang) for many targets already. For arm, our own `emit_ptr_va_arg` saves 3 instructions.

Next, it turns out it's important for LLVM to explicitly start and end the lifetime of the `va_list`. In #146059 I already end the lifetime, but when looking at this again, I noticed that it is important to also start it, see https://godbolt.org/z/EGqvKTTsK: failing to explicitly start the lifetime uses an extra register.

So, the combination of `emit_ptr_va_arg` with starting/ending the lifetime makes rustc emit exactly the instructions that clang generates:

```asm
variadic:
    sub sp, sp, #12
    stmib sp, {r2, r3}
    vmov d16, r0, r1
    vldr d17, [sp, #4]
    vadd.f64 d16, d16, d17
    vldr d17, [sp, #12]
    vadd.f64 d16, d16, d17
    vmov r0, r1, d16
    add sp, sp, #12
    bx lr
```

The arguments to `emit_ptr_va_arg` are based on [the clang implementation](https://github.com/llvm/llvm-project/blob/03dc2a41f3d9a500e47b513de5c5008c06860d65/clang/lib/CodeGen/Targets/ARM.cpp#L798-L844).

r? `@workingjubilee` (I can re-roll if your queue is too full, but you do seem like the right person here)
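To make the difference between the two listings more concrete, here is a minimal sketch in plain Rust of the pointer arithmetic a pointer-style `va_arg` performs: round the cursor up to the argument's alignment, load the value, then advance past it. This is not rustc's `emit_ptr_va_arg`; the `VaCursor` type, its `arg_f64` method, and the byte buffer standing in for the saved registers and stack are all hypothetical names chosen for illustration.

```rust
/// Hypothetical model of a pointer-style `va_list`: a cursor into a byte
/// buffer that stands in for the saved argument registers and the stack.
struct VaCursor<'a> {
    mem: &'a [u8],
    offset: usize,
}

impl<'a> VaCursor<'a> {
    fn new(mem: &'a [u8], offset: usize) -> Self {
        VaCursor { mem, offset }
    }

    /// Read the next `f64`: align the cursor up to 8, load 8 bytes, then
    /// bump the cursor past the value. The masking step plays the role
    /// that `bic r0, r0, #7` appears to play in the first listing.
    fn arg_f64(&mut self) -> f64 {
        const ALIGN: usize = 8;
        const SIZE: usize = 8;
        let aligned = (self.offset + (ALIGN - 1)) & !(ALIGN - 1);
        let mut bytes = [0u8; SIZE];
        bytes.copy_from_slice(&self.mem[aligned..aligned + SIZE]);
        self.offset = aligned + SIZE;
        f64::from_le_bytes(bytes)
    }
}

/// Same shape as `variadic(a, ...)` in the example: `a + b + c`.
fn sum_three(a: f64, args: &mut VaCursor<'_>) -> f64 {
    let b = args.arg_f64();
    let c = args.arg_f64();
    a + b + c
}

/// Builds a buffer with two doubles stored back to back at offset 8
/// and sums them with a fixed first argument.
fn demo() -> f64 {
    let mut mem = [0u8; 24];
    mem[8..16].copy_from_slice(&2.0f64.to_le_bytes());
    mem[16..24].copy_from_slice(&3.0f64.to_le_bytes());
    let mut args = VaCursor::new(&mem, 8);
    sum_three(1.0, &mut args)
}
```

Calling `demo()` returns `6.0`. When the starting offset and the argument types are known statically, as in the example, the aligned offsets fold to constants, which is consistent with the plain `[sp, #4]` and `[sp, #12]` loads in the second listing.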
match clang's `va_arg` assembly on arm targets

try-job: arm-android
try-job: armhf-gnu
@rustbot ready

Android uses different instructions (it looks like it's softfloat?), so it gets ignored now. It is technically possible to make this a

```rust
//@ add-core-stubs
//@ assembly-output: emit-asm
//@ compile-flags: -Copt-level=3
//@ only-arm
//@ ignore-android
//@ ignore-thumb
#![feature(no_core, lang_items, intrinsics)]
#![no_core]
#![crate_type = "lib"]
#![feature(c_variadic)]

extern crate minicore;
use minicore::*;

#[lang = "va_list"]
struct VaList<'a> {
    ptr: *mut u8,
    _marker: PhantomData<&'a u8>,
}

impl VaList<'_> {
    unsafe fn arg<T: VaArgSafe>(&mut self) -> T {
        va_arg(self)
    }
}

trait VaArgSafe {}
impl VaArgSafe for f64 {}

#[rustc_intrinsic]
pub unsafe fn va_arg<T: VaArgSafe>(ap: &mut VaList<'_>) -> T;

#[rustc_intrinsic]
pub const fn fadd_algebraic<T: Copy>(x: T, y: T) -> T;

// Check that the assembly that rustc generates matches what clang emits.
#[unsafe(no_mangle)]
unsafe extern "C" fn variadic(a: f64, mut args: ...) -> f64 {
    // CHECK-LABEL: variadic
    // CHECK: sub sp, sp
    // CHECK: vldr
    // CHECK: vadd.f64
    // CHECK: vldr
    // CHECK: vadd.f64
    let b = args.arg::<f64>();
    let c = args.arg::<f64>();
    fadd_algebraic(fadd_algebraic(a, b), c)
    // CHECK: add sp, sp
}
```
Can't this test use integers instead? That should work fine on softfloat targets.
Using integers doesn't really help, android must be using some older arm version than even https://godbolt.org/z/v6bsdGssq

```asm
sum_two:
    sub sp, sp, #16
    stmib sp, {r1, r2, r3}
    add r0, sp, #4
    ldr r2, [r0, #4]!
    add r0, r2, r1
    add sp, sp, #16
    bx lr
```

versus

```asm
sum_two:
    sub sp, sp, #12
    push {r11, lr}
    mov r11, sp
    sub sp, sp, #4
    add r0, r11, #8
    stm r0, {r1, r2, r3}
    add r0, r11, #8
    ldr r2, [r0, #4]!
    add r0, r2, r1
    mov sp, r11
    pop {r11, lr}
    add sp, sp, #12
    bx lr
```
@bors r+ rollup=iffy
💔 Test failed - checks-actions

A job failed! Check out the build log.

@bors retry
Rollup of 16 pull requests

Successful merges:

- #144549 (match clang's `va_arg` assembly on arm targets)
- #145660 (initial implementation of the darwin_objc unstable feature)
- #145895 (thread parking: fix docs and examples)
- #146308 (support integer literals in `${concat()}`)
- #146323 (check before test for hardware capabilites in bits 32~63 of usize)
- #146332 (tidy: make behavior of extra-checks more uniform)
- #146374 (Update `browser-ui-test` version to `0.22.2`)
- #146413 (Improve suggestion in case a bare URL is surrounded by brackets)
- #146426 (Bump miow to 0.60.1)
- #146432 (Implement `Socket::take_error` for Hermit)
- #146433 (rwlock tests: fix miri macos test regression)
- #146435 (Change the default value of `gcc.download-ci-gcc` to `true`)
- #146439 (fix cfg for poison test macro)
- #146448 ([rustdoc] Correctly handle literal search on paths)
- #146449 (Fix `libgccjit` symlink when we build GCC locally)
- #146455 (test: remove an outdated normalization for rustc versions)

r? `@ghost`
`@rustbot` modify labels: rollup

Rollup of 15 pull requests

Successful merges:

- #144549 (match clang's `va_arg` assembly on arm targets)
- #145895 (thread parking: fix docs and examples)
- #146308 (support integer literals in `${concat()}`)
- #146323 (check before test for hardware capabilites in bits 32~63 of usize)
- #146332 (tidy: make behavior of extra-checks more uniform)
- #146374 (Update `browser-ui-test` version to `0.22.2`)
- #146413 (Improve suggestion in case a bare URL is surrounded by brackets)
- #146426 (Bump miow to 0.60.1)
- #146432 (Implement `Socket::take_error` for Hermit)
- #146433 (rwlock tests: fix miri macos test regression)
- #146435 (Change the default value of `gcc.download-ci-gcc` to `true`)
- #146439 (fix cfg for poison test macro)
- #146448 ([rustdoc] Correctly handle literal search on paths)
- #146449 (Fix `libgccjit` symlink when we build GCC locally)
- #146455 (test: remove an outdated normalization for rustc versions)

r? `@ghost`
`@rustbot` modify labels: rollup
Rollup merge of #144549 - folkertdev:va-arg-arm, r=saethlin

match clang's `va_arg` assembly on arm targets