Implement block_given? call as optimized instruction #8170

Merged: headius merged 6 commits into jruby:9.5-dev from block_given_poc2 on Mar 27, 2024

Conversation

headius
Member

@headius headius commented Mar 27, 2024

This PR compiles fcalls (receiver-less calls) to block_given? as a custom instruction, based on the BlockGivenInstr used by the lighter-weight defined?(yield) form.

Instead of forcing a caller frame (so that the caller's received block can be accessed downstack), the new logic checks whether the target method is the core version and, if so, uses the defined?(yield) logic. When the method is not from JRuby core, we dispatch normally but without the caller frame requirement, since only the core block_given? is allowed that access. All other call paths leading to block_given? continue to use the deoptimized logic, so no new incompatibility is introduced.

This makes the performance of methods that call block_given? equivalent to those that use defined?(yield), and avoids the framing cost wherever block_given? is invoked as an fcall.
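
For readers less familiar with the two forms, here is a minimal Ruby sketch of what is being compared. The method names `with_block_given` and `with_defined_yield` are hypothetical and not part of this PR; the sketch only illustrates that both forms answer the same question at the Ruby level.

```ruby
# Hypothetical helper: asks via a receiver-less call to block_given?.
def with_block_given
  block_given? ? :block : :no_block
end

# Hypothetical helper: asks via defined?(yield), which inspects the
# block passed to the current method without calling another method.
def with_defined_yield
  defined?(yield) ? :block : :no_block
end

puts with_block_given { }
puts with_block_given
puts with_defined_yield { }
puts with_defined_yield
```

Both helpers report the same answer for a given call; the difference discussed in this PR is only in how JRuby compiles them.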

Performance is shown through the added benchmark, run first on JRuby 9.4.5.0 and then on this branch:

[] jruby $ (chruby jruby-9.4.5.0 ; jruby bench/bench_block_given.rb)
Warming up --------------------------------------
               block   417.419k i/100ms
            no block   609.148k i/100ms
       defined block   714.207k i/100ms
    defined no block     1.711M i/100ms
Calculating -------------------------------------
               block      4.199M (± 0.5%) i/s -     21.288M in   5.069484s
            no block      6.514M (± 0.3%) i/s -     32.894M in   5.049702s
       defined block      7.291M (± 0.5%) i/s -     37.139M in   5.093778s
    defined no block     17.039M (± 0.4%) i/s -     85.566M in   5.021866s
[] jruby $ jruby bench/bench_block_given.rb
Warming up --------------------------------------
               block   713.008k i/100ms
            no block     1.818M i/100ms
       defined block   752.260k i/100ms
    defined no block     1.709M i/100ms
Calculating -------------------------------------
               block      7.670M (± 0.4%) i/s -     38.502M in   5.020222s
            no block     18.201M (± 0.4%) i/s -     92.738M in   5.095348s
       defined block      7.569M (± 0.6%) i/s -     38.365M in   5.068610s
    defined no block     17.051M (± 1.3%) i/s -     85.467M in   5.013348s
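
The contents of bench/bench_block_given.rb are not shown in this conversation. As a rough stand-in, the four labelled cases could be timed along these lines. This is a sketch under assumptions: the helper names, the iteration count, and the use of a plain monotonic clock (rather than whatever harness produced the i/s figures above) are all illustrative.

```ruby
# Hypothetical stand-in for the benchmark; not the file added by this PR.
def check_block_given
  block_given?
end

def check_defined_yield
  defined?(yield)
end

# Times `iterations` calls of `callable` with a monotonic clock and
# returns the elapsed seconds as a Float.
def time_calls(iterations, callable)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { callable.call }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

iterations = 200_000 # assumed count, chosen only to keep the run short

cases = {
  "block"            => -> { check_block_given { } },
  "no block"         => -> { check_block_given },
  "defined block"    => -> { check_defined_yield { } },
  "defined no block" => -> { check_defined_yield },
}

cases.each do |label, callable|
  puts format("%-18s %.4fs", label, time_calls(iterations, callable))
end
```

A single timed loop like this does not warm up the JIT the way the runs above do, so it is only useful for a coarse comparison.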

headius added 3 commits March 26, 2024 23:20
In order to optimize calls to block_given? that end up in the
expected core method, this patch alters compilation of fcall
block_given? to overload the BlockGivenInstr used by the frameless
defined?(yield) logic. When this instr is used for defined?,
nothing changes. When used for a bare fcall to block_given?, the
logic will first check if the target method is built-in, using the
fast defined?(yield) logic in that case or falling back on a
normal invocation otherwise.

Moving this into an instruction avoids having to add special logic
in CallInstr/CallBase and friends to capture block_given? calls
and give them special behavior. Specifically, the frame flags
specified in the block_given? definition do not have to be
ignored; rather, they only apply in cases where we are calling the
method in other ways, such as on a target object (e.g.
Kernel.block_given?) or via metaprogramming calls like send. This
simplifies the optimization, since BlockGivenInstr itself does not
need a caller frame in order to handle both built-in and custom
block_given?, and any non-direct calls to block_given? continue to
deoptimize in the same way as before.

Performance of a method containing block_given? is now equal to a
method using the less-common defined?(yield) without introducing
any incompatibility.
The latter form, defined?(yield), avoids any caller frame
requirement since it just takes the block in hand and checks if it
is given. The former form, block_given?, has typically required a
caller frame, since it performs a normal fcall to block_given?
which then needs to be able to see the caller's received block.
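
The commit messages above distinguish a bare call that reaches the built-in block_given? from one that reaches a custom definition. A minimal Ruby-level sketch of that distinction follows, assuming two hypothetical classes, `CoreBlockGiven` and `CustomBlockGiven`, that are not part of this PR. It shows only the observable dispatch behavior, not the JRuby IR.

```ruby
# Hypothetical class: a bare block_given? here resolves to the
# built-in Kernel#block_given?.
class CoreBlockGiven
  def probe
    block_given?
  end
end

# Hypothetical class: defines its own block_given?, so the same bare
# call dispatches to the user-defined method instead.
class CustomBlockGiven
  def block_given?
    :custom
  end

  def probe
    block_given?
  end
end

p CoreBlockGiven.new.probe { }
p CoreBlockGiven.new.probe
p CustomBlockGiven.new.probe { }
```

The first class corresponds to the case the PR describes as taking the fast defined?(yield) path, and the second to the case it describes as falling back to a normal invocation.
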
@headius headius added this to the JRuby 10.0.0.0 milestone Mar 27, 2024
headius added 3 commits March 27, 2024 01:21
The optimized block_given? still needs to be treated as a call,
which would pollute the flags for defined?(yield) unnecessarily.
This patch moves it to its own instr.
@headius headius marked this pull request as ready for review March 27, 2024 19:20
@headius headius merged commit d9b30bb into jruby:9.5-dev Mar 27, 2024
@headius headius deleted the block_given_poc2 branch March 27, 2024 19:58