@tenderlove
Member

merge_three_reg_mov tries to combine the output register of certain instructions (add, sub, etc) with following register copy instructions (mov).

For example:

add out, a, b
mov c, out

Becomes:

add c, a, b

out and c are combined, but only if out's live range ends at the mov instruction and the mov's output is a register. The tests on add's input are not necessary because we only care about the output.
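The rule described above can be modeled in a few lines. This is only a minimal Python sketch, not the backend's actual code: the `Reg`, `Mem`, and `Insn` types, the `last_use` helper, and the name `merge_three_reg_mov_sketch` are all hypothetical, and the sketch deliberately ignores any target-specific restrictions on operands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    name: str


@dataclass(frozen=True)
class Mem:
    name: str


@dataclass
class Insn:
    op: str       # e.g. "add", "sub", "mov"
    out: object   # destination operand (Reg or Mem)
    srcs: tuple   # source operands


def last_use(insns, opnd):
    """Index of the last instruction that reads `opnd`, or -1 if none."""
    last = -1
    for i, insn in enumerate(insns):
        if opnd in insn.srcs:
            last = i
    return last


def merge_three_reg_mov_sketch(insns, mergeable=("add", "sub")):
    """Fold `op out, a, b; mov c, out` into `op c, a, b` when it is safe."""
    result = []
    i = 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        if (
            nxt is not None
            and cur.op in mergeable
            and nxt.op == "mov"
            and nxt.srcs == (cur.out,)               # the mov copies cur's output
            and isinstance(nxt.out, Reg)             # the mov's output is a register
            and last_use(insns, cur.out) == i + 1    # out's live range ends at the mov
        ):
            # Only the output operand changes; the inputs are left untouched.
            result.append(Insn(cur.op, nxt.out, cur.srcs))
            i += 2
        else:
            result.append(cur)
            i += 1
    return result


if __name__ == "__main__":
    a, b, c, out = Reg("a"), Reg("b"), Reg("c"), Reg("out")
    print(merge_three_reg_mov_sketch([Insn("add", out, (a, b)), Insn("mov", c, (out,))]))
```

Note that the sketch never inspects the inputs of the `add`, which mirrors the claim in the description that only the output matters.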

@matzbot matzbot requested a review from a team November 14, 2025 21:12
Co-Authored-By: Aiden Fox Ivey <[email protected]>
Member

@XrXr XrXr left a comment

Makes sense. Please add a test that previously didn't optimize, though.

@tenderlove
Member Author

> Makes sense. Please add a test that previously didn't optimize, though.

Do we not have tests for that already?

@tenderlove
Member Author

Ah, never mind. I think I understand. We need a test with `add reg, mem, mem` or something?

@XrXr
Member

XrXr commented Nov 14, 2025

Yeah.. On second look I think these checks are load-bearing. Edit: 3 nah, it's fine.

Member

@XrXr XrXr left a comment

This possibly admits more cases. Previously when right was an immediate it did not attempt to merge. So would be nice to have a test for those situations.

@tenderlove
Member Author

> This possibly admits more cases. Previously when right was an immediate it did not attempt to merge. So would be nice to have a test for those situations.

I might need some help writing a test like that. It looks like our existing code knew that add reg1, reg2, imm could get the optimization and would sometimes only ever pass the left reg in.

Member

@XrXr XrXr left a comment

> This possibly admits more cases. Previously when right was an immediate it did not attempt to merge. So would be nice to have a test for those situations.

And the problem with admitting those cases is that what reg 31 refers to changes depending on whether it's `sub reg, reg, reg` or `sub reg, reg, immediate`.

With this patch when you have:

Sub ZeroReg(regno=31), 0xf => out
Mov c, out

it would turn it into Sub c, SP(regno=31), 0xf, changing the meaning of the code.

https://developer.arm.com/documentation/ddi0596/2020-12/Base-Instructions/SUB--immediate---Subtract--immediate--

We should add tests/comment for this. Sorry, I had a hunch these checks are actually important but didn't document that enough.
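To make the register 31 ambiguity concrete, here is a small lookup table, written as a hedged Python sketch. The table contents follow the operand syntax in the Arm A64 base instruction descriptions (such as the SUB (immediate) page linked above); the names `REG31_READING`, `reg_name`, and `operand_names` are made up for illustration and do not exist in the backend.

```python
# How a 5-bit register field holding 31 is read, per A64 instruction form.
# "SP" = the field selects the stack pointer, "ZR" = the zero register.
REG31_READING = {
    # form:                   (Rd,   Rn)
    "sub_shifted_register":  ("ZR", "ZR"),  # SUB  <Xd>,    <Xn>,    <Xm>
    "subs_shifted_register": ("ZR", "ZR"),  # SUBS <Xd>,    <Xn>,    <Xm>
    "sub_immediate":         ("SP", "SP"),  # SUB  <Xd|SP>, <Xn|SP>, #<imm>
    "subs_immediate":        ("ZR", "SP"),  # SUBS <Xd>,    <Xn|SP>, #<imm>
}


def reg_name(regno, reading):
    """Assembly name for a register number under the given reading of 31."""
    if regno != 31:
        return f"x{regno}"
    return "sp" if reading == "SP" else "xzr"


def operand_names(form, rd, rn):
    """Names of the destination and first source register for `form`."""
    rd_reading, rn_reading = REG31_READING[form]
    return reg_name(rd, rd_reading), reg_name(rn, rn_reading)


if __name__ == "__main__":
    # The same source register number means different things:
    print(operand_names("subs_shifted_register", 2, 31))
    print(operand_names("subs_immediate", 2, 31))
```

With register number 31 as the first source, the register form reads the zero register while the immediate form reads the stack pointer, which is the change of meaning described above.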

@tenderlove
Member Author

Sorry, I'm still not following 😢

> Sub ZeroReg(regno=31), 0xf => out
> Mov c, out

Is this not the same as:

Sub out, ZeroReg(regno=31), 0xf
Mov c, out

I don't understand how the meaning would be changed.

@tenderlove
Member Author

Never mind, I get it. Sub ZeroReg(regno=31), 0xf => out is LIR. sub with an immediate on arm64 can't take the zero register (it would be interpreted as SP rather than XZR).

I guess we do need a test for that, but isn't the existing code susceptible to the same problem? Is ZeroReg not an Opnd::Reg?

@XrXr
Member

XrXr commented Nov 15, 2025

Sorry, my explanation was bad.
LIR Add/Sub is actually adds/subs on A64; it sets flags so you can branch on overflow.
With this patch, we can now have:

Add out, native_sp, #0x400
Mov native_sp, out
Jo handle_overflow

turn into

Add native_sp, native_sp, #0x400
Jo handle_overflow

But adds is ADDS <Xd>, <Xn|SP>, #<imm>{, <shift>}, so simply putting Xd=regno=31 discards the sum. If we instead use ADD <Xd|SP>, <Xn|SP>, #<imm>{, <shift>} to have the destination go to native_sp, the flags aren't set. So the meaning ends up changed either way. There is no ADDS <Xd|SP>, <Xn|SP>, #<imm> form.
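The trade-off can be checked with a toy model. The following Python sketch is an assumption-laden illustration, not an emulator: the `Cpu` class and its method names are hypothetical, only the overflow (V) flag is tracked, and `mov sp, xN` is modeled as `add sp, xN, #0`, which is how A64 defines that alias.

```python
MASK64 = (1 << 64) - 1
SIGN64 = 1 << 63


def signed_overflow(a, b, result):
    """True when a + b overflows as a signed 64-bit addition."""
    return ((a ^ result) & (b ^ result) & SIGN64) != 0


class Cpu:
    def __init__(self, sp):
        self.sp = sp
        self.x = [0] * 31
        self.v = None  # overflow flag; None means "never written"

    def _read_rn(self, rn):
        # In both immediate forms, source register 31 is SP.
        return self.sp if rn == 31 else self.x[rn]

    def adds_imm(self, rd, rn, imm):
        """ADDS <Xd>, <Xn|SP>, #<imm>: sets flags; Rd=31 is XZR."""
        a = self._read_rn(rn)
        result = (a + imm) & MASK64
        self.v = signed_overflow(a, imm, result)
        if rd != 31:  # writing XZR discards the sum
            self.x[rd] = result

    def add_imm(self, rd, rn, imm):
        """ADD <Xd|SP>, <Xn|SP>, #<imm>: Rd=31 is SP; flags untouched."""
        result = (self._read_rn(rn) + imm) & MASK64
        if rd == 31:
            self.sp = result
        else:
            self.x[rd] = result


def run_original():
    cpu = Cpu(sp=0x1000)
    cpu.adds_imm(0, 31, 0x400)  # adds x0, sp, #0x400
    cpu.add_imm(31, 0, 0)       # mov sp, x0
    return cpu.sp, cpu.v


def run_merged_with_adds():
    cpu = Cpu(sp=0x1000)
    cpu.adds_imm(31, 31, 0x400)  # destination field 31 is XZR here
    return cpu.sp, cpu.v


def run_merged_with_add():
    cpu = Cpu(sp=0x1000)
    cpu.add_imm(31, 31, 0x400)   # destination is SP, but no flags
    return cpu.sp, cpu.v
```

In this model only the unmerged sequence both updates `sp` and writes the overflow flag; the `adds` variant leaves `sp` unchanged and the `add` variant never writes the flag, matching the argument that the meaning changes either way.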
