interpreter: inline ebpf instruction decoding #101

nagisa · 2025-12-18T16:43:24Z

This is a relatively crude hammer that narrows the gap between the JIT
and the interpreter from 17x to 12x (as measured on top of #100) in the
bench_jit_vs_interpreter_empty_for_loop benchmark. Previously, on every
iteration we'd spend significant time calling into the small
instruction-decoding helper, which returns a non-trivial structure and
so requires significant setup, etc.
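
For illustration, here is a minimal sketch of what such a helper can look like when it is marked for inlining. The names `Insn` and `decode_insn` are hypothetical and not taken from this PR's diff. The field layout follows the standard 8-byte eBPF encoding (opcode byte, dst/src nibbles, little-endian 16-bit offset, little-endian 32-bit immediate).

```rust
/// Hypothetical decoded form of one 8-byte eBPF instruction slot.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Insn {
    pub opc: u8,  // opcode
    pub dst: u8,  // destination register (low nibble of byte 1)
    pub src: u8,  // source register (high nibble of byte 1)
    pub off: i16, // signed offset, little-endian
    pub imm: i32, // signed immediate, little-endian
}

pub const INSN_SIZE: usize = 8;

/// Decode the instruction in slot `pc`. Panics if the slot is out of range.
///
/// `#[inline(always)]` asks the compiler to expand the body at each call
/// site, so the caller does not pay for a call plus the setup needed to
/// return the struct on every interpreted instruction.
#[inline(always)]
pub fn decode_insn(prog: &[u8], pc: usize) -> Insn {
    let b = &prog[pc * INSN_SIZE..(pc + 1) * INSN_SIZE];
    Insn {
        opc: b[0],
        dst: b[1] & 0x0f,
        src: b[1] >> 4,
        off: i16::from_le_bytes([b[2], b[3]]),
        imm: i32::from_le_bytes([b[4], b[5], b[6], b[7]]),
    }
}
```

Once the body is inlined, the optimizer can also drop fields that a given call site never reads, which is not possible across an opaque call boundary.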

This change should also help JIT compilation speed quite a bit, but I
haven't checked this.

The logic flow here feels quite inefficient still, however, and this
specific part of the code still warrants further attention. In
particular, even after this change the machine code eagerly decodes the
entire instruction into its parts rather than grabbing just the opcode
that it would need to pick the instruction to process. At that point
there might be another instruction word worth of arguments to decode…
It might (or might not!) be better to decode just the opcode and leave
it up to the compiler on how it wants to hoist decoding of the arguments
(if at all.)
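
A rough sketch of the opcode-first alternative described above, for a toy subset of instructions. This is not the interpreter in this repository. The function names and the four-opcode subset are assumptions made for the example, while the opcode values and byte layout are those of standard eBPF. Each match arm decodes only the fields it uses, and `LD_DW_IMM` shows the case where a second instruction word carries more arguments.

```rust
const INSN_SIZE: usize = 8;

// Standard eBPF opcode values for the small subset handled here.
const MOV64_IMM: u8 = 0xb7;
const ADD64_IMM: u8 = 0x07;
const LD_DW_IMM: u8 = 0x18; // occupies two instruction slots
const EXIT: u8 = 0x95;

/// Read only the opcode byte of slot `pc`.
#[inline(always)]
fn opc(prog: &[u8], pc: usize) -> u8 {
    prog[pc * INSN_SIZE]
}

/// Read only the destination register index of slot `pc`.
#[inline(always)]
fn dst(prog: &[u8], pc: usize) -> usize {
    (prog[pc * INSN_SIZE + 1] & 0x0f) as usize
}

/// Read only the 32-bit little-endian immediate of slot `pc`.
#[inline(always)]
fn imm(prog: &[u8], pc: usize) -> i32 {
    let i = pc * INSN_SIZE + 4;
    i32::from_le_bytes([prog[i], prog[i + 1], prog[i + 2], prog[i + 3]])
}

/// Run the toy subset. Returns r0 at `exit`, or `None` on an unsupported
/// opcode, a truncated `lddw`, or when execution runs off the end.
pub fn run(prog: &[u8]) -> Option<u64> {
    // Sized 16 so that any 4-bit register index stays in bounds
    // (eBPF itself defines r0..r10).
    let mut reg = [0u64; 16];
    let mut pc = 0usize;
    while (pc + 1) * INSN_SIZE <= prog.len() {
        // Dispatch on the opcode alone; arguments are decoded per arm.
        match opc(prog, pc) {
            MOV64_IMM => {
                // The immediate is sign-extended to 64 bits.
                reg[dst(prog, pc)] = imm(prog, pc) as i64 as u64;
                pc += 1;
            }
            ADD64_IMM => {
                let d = dst(prog, pc);
                reg[d] = reg[d].wrapping_add(imm(prog, pc) as i64 as u64);
                pc += 1;
            }
            LD_DW_IMM => {
                // The upper 32 bits live in the immediate of the next slot.
                if (pc + 2) * INSN_SIZE > prog.len() {
                    return None;
                }
                let lo = imm(prog, pc) as u32 as u64;
                let hi = imm(prog, pc + 1) as u32 as u64;
                reg[dst(prog, pc)] = (hi << 32) | lo;
                pc += 2;
            }
            EXIT => return Some(reg[0]),
            _ => return None,
        }
    }
    None
}
```

Whether this beats eager decoding would need measuring: the compiler may already sink or hoist the field loads once the decode helper is inlined.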

nagisa · 2025-12-18T16:56:18Z

r? @LucasSte @Lichtso cause I can't do that in gh interface.


nagisa force-pushed the nagisa/interpreter-zoomies-2 branch from 5da6328 to e5b673a Compare December 18, 2025 16:45

nagisa requested review from Lichtso and LucasSte December 19, 2025 11:23

Lichtso approved these changes Dec 19, 2025

nagisa force-pushed the nagisa/interpreter-zoomies-2 branch from e5b673a to f379b0f Compare December 19, 2025 22:22

nagisa merged commit 71d8df0 into anza-xyz:main Dec 19, 2025
10 checks passed

nagisa deleted the nagisa/interpreter-zoomies-2 branch December 19, 2025 22:22
