Optimized __method__ and __callee__ #8172
Merged
The __method__ and __callee__ methods currently force a frame due to the default implementation needing to access the caller's frame to get the method name or compound method name passed on the stack. This optimization moves these calls to a specialized instruction that can access the method/callee name directly without needing a frame, improving performance of methods that use this behavior.
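For readers less familiar with the two methods, here is a minimal sketch of the Ruby-level behavior being optimized. The `Greeter` class and its method names are purely illustrative and do not come from the patch.

```ruby
# Illustrative class; the names are hypothetical and not taken from the PR.
class Greeter
  def hello
    # __method__ reports the name the method was defined with;
    # __callee__ reports the name the method was invoked through.
    [__method__, __callee__]
  end

  alias_method :hi, :hello
end

greeter = Greeter.new
direct  = greeter.hello  # => [:hello, :hello]
aliased = greeter.hi     # => [:hello, :hi]
```

The aliased call is the case where both names differ, which is why a compound name has to travel with the call.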
Prior to this patch the compound name passed on the stack from an AliasMethod was repeatedly parsed and split before acquiring the associated symbol, leading to wasteful allocation of additional String objects. The change here adds a new map to the symbol table that tracks the two symbols associated with a compound name, so that only a single lookup is needed and no allocation happens.
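The idea of replacing repeated parsing with a single lookup can be sketched in plain Ruby. This is not the JRuby implementation: the separator, the layout of the compound name, and the `CompoundNameCache` class are all assumptions made for illustration.

```ruby
# Hypothetical sketch: SEPARATOR and the compound-name layout are assumptions
# made for illustration, not JRuby's actual encoding.
SEPARATOR = "\0"

# Slow path: split the compound string on every call, allocating new Strings.
def split_compound(compound)
  callee, method = compound.split(SEPARATOR, 2)
  [callee.to_sym, method.to_sym]
end

# Fast path: remember the symbol pair per compound name, so a repeated lookup
# is a single hash access that allocates no further Strings.
class CompoundNameCache
  def initialize
    @pairs = {}
  end

  def fetch(compound)
    @pairs[compound] ||= split_compound(compound).freeze
  end
end

cache    = CompoundNameCache.new
compound = "hi#{SEPARATOR}hello"
first    = cache.fetch(compound)  # parsed once
second   = cache.fetch(compound)  # served from the cache
```

The second `fetch` returns the very same frozen pair, so the split happens only once per compound name.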
Records have optimization characteristics that may be helpful here.
* Cache the last "simple" name to pass through, which should usually be the same name every time. This avoids re-acquiring the symbol when we are looking for the same simple name.
* Inject the SymbolTable into the handle chain to avoid having to re-acquire it and the Ruby instance every time.
With these changes, simple name acquisition nearly doubles in performance without indy, and all forms of acquisition improve by around 5-6x. This is in addition to the doubled performance from the original optimization.
This mimics the JIT's logic to do the same when loading the "frameName" and is necessary to avoid regressing __method__ behavior inside peculiar contexts (like define_method or eval).
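The contexts mentioned here can be exercised from Ruby. This is a small sketch of the behavior that must not regress; the `Example` class and its method names are hypothetical.

```ruby
# Hypothetical class used only to exercise the contexts named above.
class Example
  # A method body supplied as a block via define_method.
  define_method(:via_block) { __method__ }

  # A call evaluated from a string inside a regular method.
  def via_eval
    eval("__method__")
  end
end

example    = Example.new
from_block = example.via_block  # => :via_block
from_eval  = example.via_eval   # => :via_eval
```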
Additional optimizations:
With all changes, final bench numbers look like this:
Note that the slow compound name path appears to have some volatility in the JVM JIT, and occasionally now drops about 25% of its performance:
This may be due to the additional optimizations confounding HotSpot sometimes, or it may have always been there but not during the benchmark runs I performed.
There's no super and __method__ should return nil inside module and class bodies, so we pass null to indicate this.
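The Ruby-visible side of this rule can be checked with a short sketch; the module, class, and constant names are illustrative only.

```ruby
# Illustrative names; these constants capture __method__ outside any method.
module Sample
  IN_MODULE_BODY = __method__

  class Inner
    IN_CLASS_BODY = __method__
  end
end

module_body_value = Sample::IN_MODULE_BODY        # => nil
class_body_value  = Sample::Inner::IN_CLASS_BODY  # => nil
```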
This PR optimizes the `__method__` and `__callee__` methods by doing the following:
* Compiling these calls to a specialized instruction that checks whether `__method__` or `__callee__` is the built-in version (similar to `block_given?` in "Implement block_given? call as optimized instruction", #8170).
* Tracking in the symbol table the two symbols associated with the compound name used for `__method__` and `__callee__`, avoiding extra allocation of strings while parsing that compound string.
As with #8170, if either of the target methods has been replaced by a user, we fall back to a normal invocation. If either method is called via metaprogramming paths, it will force a frame and use it as before.
A benchmark is included, and shows that both forms are now much faster (because they no longer need the caller's frame), and neither slows down when used inside an aliased call:
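The benchmark shipped with the PR and its output are not reproduced here. As a rough stand-in, the following sketch shows how the four cases (direct and aliased, for each method) could be timed. The `BenchTarget` class, the iteration count, and the timing approach are assumptions, and the numbers it prints say nothing about the results reported in the PR.

```ruby
# Hypothetical micro-benchmark; not the benchmark included in the PR.
class BenchTarget
  def plain_method
    __method__
  end

  def plain_callee
    __callee__
  end

  alias_method :aliased_method, :plain_method
  alias_method :aliased_callee, :plain_callee
end

ITERATIONS = 200_000 # arbitrary; raise it for steadier numbers on a JIT

target = BenchTarget.new
cases = {
  "__method__"         => -> { target.plain_method },
  "__method__ (alias)" => -> { target.aliased_method },
  "__callee__"         => -> { target.plain_callee },
  "__callee__ (alias)" => -> { target.aliased_callee }
}

# Time each case with the monotonic clock and collect elapsed seconds.
timings = cases.transform_values do |work|
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  ITERATIONS.times { work.call }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

timings.each do |label, seconds|
  puts format("%-20s %.4fs", label, seconds)
end
```

Each case is wrapped in a lambda so that `__method__` and `__callee__` are reached through ordinary calls rather than through `send`, which would exercise the metaprogramming fallback instead.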