ZJIT: Split Send into Lookup+Call #13400


Open
wants to merge 7 commits into
base: master

Conversation

tekknolagi
Contributor

@tekknolagi tekknolagi commented May 21, 2025

Split SendWithoutBlock{,Direct} into LookupMethod and CallMethod. From there,
centralize the profile- and type-driven rewrites to one function:
optimize_lookup_method. This determines if we know the call target at compile
time.

Then, we can rewrite CallMethod to do something with a known-constant target.
If we know it at compile-time, we can specialize to CallCFunc or CallIseq
(or more, like a hypothetical GetInstanceVariable, in the future).
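
As a rough illustration of the two-step rewrite described above, here is a toy sketch. It is not ZJIT's actual HIR: the `Insn` and `MethodKind` enums, the `ConstMethod` variant, the `specialize_calls` pass, and the `known` map (a stand-in for profile and type information, keyed by method name only) are all assumptions made up for this example. Only the names `LookupMethod`, `CallMethod`, `CallIseq`, `CallCFunc`, and `optimize_lookup_method` come from the description.

```rust
use std::collections::HashMap;

// Toy model only: simplified stand-ins, not ZJIT's real HIR types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MethodKind {
    Iseq,  // method defined in Ruby
    CFunc, // method implemented in C
}

// Operands are indices of earlier instructions in the same list.
#[derive(Debug, Clone, PartialEq)]
enum Insn {
    Param,
    // Resolve `method` on the receiver at run time.
    LookupMethod { recv: usize, method: String },
    // Hypothetical: a lookup whose target is known at compile time.
    ConstMethod { kind: MethodKind },
    // Fully dynamic call through the result of instruction `target`.
    CallMethod { target: usize, recv: usize, args: Vec<usize> },
    CallIseq { recv: usize, args: Vec<usize> },
    CallCFunc { recv: usize, args: Vec<usize> },
}

// Step 1: replace lookups whose target is known with a constant.
// `known` is an assumed stand-in for profile- and type-driven facts.
fn optimize_lookup_method(insns: &mut [Insn], known: &HashMap<String, MethodKind>) {
    for insn in insns.iter_mut() {
        let resolved = match &*insn {
            Insn::LookupMethod { method, .. } => known.get(method).copied(),
            _ => None,
        };
        if let Some(kind) = resolved {
            *insn = Insn::ConstMethod { kind };
        }
    }
}

// Step 2: specialize calls whose target is now a known constant.
// Calls with an unresolved target are left as dynamic CallMethod.
fn specialize_calls(insns: &mut [Insn]) {
    for i in 0..insns.len() {
        let replacement = match &insns[i] {
            Insn::CallMethod { target, recv, args } => match insns.get(*target) {
                Some(Insn::ConstMethod { kind: MethodKind::Iseq }) => {
                    Some(Insn::CallIseq { recv: *recv, args: args.clone() })
                }
                Some(Insn::ConstMethod { kind: MethodKind::CFunc }) => {
                    Some(Insn::CallCFunc { recv: *recv, args: args.clone() })
                }
                _ => None,
            },
            _ => None,
        };
        if let Some(new_insn) = replacement {
            insns[i] = new_insn;
        }
    }
}
```

In this sketch, all knowledge about call targets lives in the first pass, so the second pass only has to pattern-match on a constant. A call whose lookup could not be resolved simply stays a `CallMethod`.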

Also,

  • Add CmePtr as type alias
  • Add CallableMethodEntry to type lattice
  • Make CallData an alias for *const rb_call_data

Fixes Shopify#577

Co-authored-by: Aaron Patterson [email protected]

@tekknolagi tekknolagi changed the title from "Split Send into Lookup+Call" to "ZJIT: Split Send into Lookup+Call" May 21, 2025
@matzbot matzbot requested a review from a team May 21, 2025 14:53
@tekknolagi
Contributor Author

I will absolutely be cleaning up the commits in this PR for rebasing

@tekknolagi
Contributor Author

Also, I need to figure out some stuff before landing like a codegen implementation of CallMethod (is there a nice C function for this?) and a couple of other TODOs sprinkled throughout the changes

@k0kubun
Member

k0kubun commented May 22, 2025

Is it intentional that the ZJIT CI is failing at the moment?

@tekknolagi
Contributor Author

Well it's not exactly intentional but I don't have codegen for CallMethod because I can't figure out a way to do it yet :( It's currently crashing/assert failing/...

@tekknolagi
Contributor Author

But if you have feedback on the rest of the (admittedly big, kind of disorganized) PR separate from that, I would be interested. The CallMethod codegen is hopefully separate/isolated enough


launchable-app bot commented Jun 6, 2025

Tests Failed

✖️ no tests failed ✔️ 61990 tests passed (1 flake)

This helps method lookups get resolved.
* Need to spill receiver and arguments to the VM stack as arguments
* Don't use cpush; cpush always pushes two words
* Save SP
* Align stack
@tekknolagi
Contributor Author

Should we just set up a frame and shell out to vm_exec? This PR is getting way too old and I need the HIR stuff to land about a month ago

@XrXr
Member

XrXr commented Jul 2, 2025

> Should we just set up a frame and shell out to vm_exec?

That sounds as tricky as what you're doing right now, since we still need to switch on the CME type in static code, set up parameters and all that. Basically vm_call0_cme().

@tekknolagi
Contributor Author

Augh. We really need these HIR changes to improve the optimizer. @XrXr and @k0kubun what do you think is the fastest way to get this thing merged?

@XrXr
Member

XrXr commented Jul 2, 2025

Short of just breaking codegen for now, just debugging all the failures seems like the way to go. We can set up pairing sessions and whatnot. (BTW, CI isn't running now because of merge conflict)

@k0kubun
Member

k0kubun commented Jul 2, 2025

I wonder if you could have just split the HIR instruction in one PR and then worked on changing the generated code in another PR. You seem to have introduced NATIVE_STACK_PTR and changed how the stack is used, but is it mandatory to achieve "Split Send into Lookup+Call"? When it's hard to make it work, I would start from small changes.

@tekknolagi
Contributor Author

I could split up the HIR in one PR, yes, but then I want to actually generate those HIR instructions. And if I generate the HIR, we have to compile it (in the current set of tests) :/ So in that sense they kind of have to be together

@tekknolagi
Contributor Author

Maybe I can have CallMethod behave as though it were a full Send, including (re-)doing the lookup, for now, and therefore re-use existing codegen for Send

@k0kubun
Member

k0kubun commented Jul 2, 2025

Oh, I thought you were splitting both SendWithoutBlock and SendWithoutBlockDirect, but looking at HIR tests, LookupMethod seems to be used only before the dynamic CallMethod. CallIseq and CallCFunc do not have LookupMethod. So the outcome is basically that we're splitting only the dynamic dispatch instruction.

It seems important to optimize CallIseq and CallCFunc further, but I'm not sure if I follow the motivation to split SendWithoutBlock. The method lookup was cached with cd->cc on master, but now you're always calling rb_callable_method_entry because of the split, which could be slower than master. You also seem to be doing some heavy lifting for passing arguments through the C stack instead of the interpreter stack, but the dynamic dispatch is so slow that it may not have a significant impact.
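
To make the caching concern concrete, here is a minimal sketch of a per-call-site cache, under loud assumptions: `MethodTable`, `CallSite`, `ClassId`, and `MethodId` are hypothetical names for this example and do not mirror the real `cd->cc` structures, and the sketch ignores cache invalidation on method redefinition. It only shows why a call site that remembers its last resolution avoids repeating the slow lookup.

```rust
use std::collections::HashMap;

// Hypothetical identifiers, not CRuby's real types.
type ClassId = u32;
type MethodId = u32;

// Stand-in for the global, slow method lookup.
struct MethodTable {
    entries: HashMap<(ClassId, String), MethodId>,
    slow_lookups: usize, // counts how often the slow path runs
}

impl MethodTable {
    fn lookup(&mut self, class: ClassId, name: &str) -> Option<MethodId> {
        self.slow_lookups += 1;
        self.entries.get(&(class, name.to_string())).copied()
    }
}

// One cache slot per call site, keyed on the receiver's class.
struct CallSite {
    name: String,
    cache: Option<(ClassId, MethodId)>,
}

impl CallSite {
    fn resolve(&mut self, table: &mut MethodTable, class: ClassId) -> Option<MethodId> {
        // Fast path: same receiver class as last time, reuse the result.
        if let Some((cached_class, method)) = self.cache {
            if cached_class == class {
                return Some(method);
            }
        }
        // Slow path: full lookup, then remember the result.
        let method = table.lookup(class, &self.name)?;
        self.cache = Some((class, method));
        Some(method)
    }
}
```

Repeated calls with the same receiver class hit the slow lookup once; an uncached design would pay for it on every call.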

Generally, if you want to optimize a method call, you should generate an optimized method call instruction like CallISeq and CallCFunc (or inline it) instead of optimizing the fallback implementation CallMethod. We should be able to make them as few as what YJIT did, at least.

> Maybe I can have CallMethod behave as though it were a full Send, including (re-)doing the lookup, for now, and therefore re-use existing codegen for Send

Yeah, I think we should do that.

@tekknolagi
Contributor Author

The motivation to split SendWithoutBlock is that there are two separate components that happen in a send: a lookup and a call. Since there are N different kinds of calls (though we only support positional and positional with block right now) but more or less only one kind of lookup, we should split those so they can be optimized separately. CallIseq replaces SendWithoutBlockDirect, CallCFunc is a more general (can side-effect/reenter) version of CCall, and then we can start attaching HIR fast-paths to those later in a separate PR.

@k0kubun
Member

k0kubun commented Jul 2, 2025

I think we should have a pairing session as Alan suggested 🙂 I have some more details I want to share/discuss with you, but it takes a lot of effort to do that in text, and it has not been as effective as I wanted for this issue so far.

Development

Successfully merging this pull request may close these issues.

ZJIT: Split SendWithoutBlock into LookupMethod+CallMethod
3 participants