Conversation

Collaborator

@nagisa nagisa commented Dec 18, 2025

This is a relatively crude hammer that brings the delta between JIT and
interpreter from 17x to 12x (as measured on top of #100) in the
`bench_jit_vs_interpreter_empty_for_loop` benchmark. On every iteration
we'd spend significant time calling into this small helper function that
returns a non-trivial structure back, requiring significant setup, etc.

This change should also help JIT compilation speed quite a bit, but I
haven't checked this.

The logic flow here feels quite inefficient still, however, and this
specific part of the code still warrants further attention. In
particular, even after this change the machine code eagerly decodes the
entire instruction into its parts rather than grabbing just the opcode
that it would need to pick the instruction to process. At that point
there might be another instruction word worth of arguments to decode…
It might (or might not!) be better to decode just the opcode and leave
it up to the compiler on how it wants to hoist decoding of the arguments
(if at all.)
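
To make the opcode-only idea concrete, here is a rough sketch. It is not the code from this PR: it assumes a classic eBPF-style 8-byte instruction slot (opcode byte, dst/src nibbles, 16-bit offset, 32-bit immediate, little-endian), and `Insn`, `decode_full`, `decode_opcode`, `run` and the two opcode constants are hypothetical names picked for illustration.

```rust
/// Hypothetical fully decoded instruction, assuming an eBPF-style 8-byte slot.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Insn {
    opc: u8,
    dst: u8,
    src: u8,
    off: i16,
    imm: i32,
}

/// Size of one instruction slot in bytes (assumption: classic eBPF layout).
const INSN_SIZE: usize = 8;

// Classic eBPF opcode values, used here purely for illustration.
const OP_MOV64_IMM: u8 = 0xb7;
const OP_EXIT: u8 = 0x95;

/// Eager variant: decodes every field of the instruction at `pc`.
fn decode_full(prog: &[u8], pc: usize) -> Insn {
    let b = &prog[pc * INSN_SIZE..(pc + 1) * INSN_SIZE];
    Insn {
        opc: b[0],
        dst: b[1] & 0x0f, // low nibble
        src: b[1] >> 4,   // high nibble
        off: i16::from_le_bytes([b[2], b[3]]),
        imm: i32::from_le_bytes([b[4], b[5], b[6], b[7]]),
    }
}

/// Lazy variant: reads only the opcode byte needed to pick a handler.
fn decode_opcode(prog: &[u8], pc: usize) -> u8 {
    prog[pc * INSN_SIZE]
}

/// Toy interpreter loop: dispatch on the opcode alone and decode the
/// remaining fields only inside the arms that actually use them.
/// Returns the value of r0 at `exit`, or `None` on an unknown opcode
/// or when the program runs off its end.
fn run(prog: &[u8]) -> Option<u64> {
    let mut r0: u64 = 0;
    let mut pc = 0;
    while (pc + 1) * INSN_SIZE <= prog.len() {
        match decode_opcode(prog, pc) {
            OP_EXIT => return Some(r0), // no operands to decode at all
            OP_MOV64_IMM => {
                let insn = decode_full(prog, pc);
                if insn.dst == 0 {
                    // Sign-extend the 32-bit immediate to 64 bits.
                    r0 = insn.imm as i64 as u64;
                }
            }
            _ => return None,
        }
        pc += 1;
    }
    None
}
```

Whether this beats eager decoding is exactly the open question above: once the decode helper is inlined, the compiler may already sink unused field extraction into the arms that need it, so this would have to be measured rather than assumed.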

@nagisa nagisa force-pushed the nagisa/interpreter-zoomies-2 branch from 5da6328 to e5b673a Compare December 18, 2025 16:45
Collaborator Author

nagisa commented Dec 18, 2025

r? @LucasSte @Lichtso cause I can't do that in gh interface.

@nagisa nagisa requested review from Lichtso and LucasSte December 19, 2025 11:23
@nagisa nagisa force-pushed the nagisa/interpreter-zoomies-2 branch from e5b673a to f379b0f Compare December 19, 2025 22:22
@nagisa nagisa merged commit 71d8df0 into anza-xyz:main Dec 19, 2025
10 checks passed
@nagisa nagisa deleted the nagisa/interpreter-zoomies-2 branch December 19, 2025 22:22