YJIT: Invalidate i-cache for the other cb on next_page #6631
Thanks for diagnosing this! This is fine, but what do you think about invalidating inside of set_page, where we write out the jump? Around here:
    // Generate jmp_ptr from src_pos to dst_pos
    self.without_page_end_reserve(|cb| {
        cb.add_comment("jump to next page");
        jmp_ptr(cb, dst_ptr);
        assert!(!cb.has_dropped_bytes());
    });
I think in the case where a page is cut short, the current change invalidates a lot more than necessary because it invalidates until the end of the page that was cut short. That can possibly step on PROT_NONE addresses because we didn't actually write that much. We could also pass a slightly different function that does the invalidation in the ARM backend. It'd invalidate the icache twice for the jump in the assembler's code block, but it's only 20 bytes.
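To make the size difference concrete, here is a minimal sketch rather than the actual YJIT code. `ToyCodeBlock`, `emit_jump`, `rest_of_page`, and the 4096-byte page size are all hypothetical names and values chosen for illustration; only the 20-byte jump length comes from the comment above. It compares the range actually written by the jump with the range a flush-to-end-of-page approach would cover.

```rust
use std::ops::Range;

/// Jump length in bytes, taken from the review comment ("it's only 20 bytes").
const JUMP_LEN: usize = 20;

/// Toy stand-in for a code block: a byte buffer plus a write cursor.
/// This is NOT YJIT's CodeBlock; it only models "where did we write".
struct ToyCodeBlock {
    bytes: Vec<u8>,
    write_pos: usize,
}

impl ToyCodeBlock {
    fn new(size: usize) -> Self {
        ToyCodeBlock { bytes: vec![0; size], write_pos: 0 }
    }

    /// Copy `data` at the cursor and advance the cursor.
    fn write_bytes(&mut self, data: &[u8]) {
        let end = self.write_pos + data.len();
        self.bytes[self.write_pos..end].copy_from_slice(data);
        self.write_pos = end;
    }

    /// Emit a placeholder jump and return exactly the byte range it occupies.
    /// A caller would flush the instruction cache for this range only.
    fn emit_jump(&mut self) -> Range<usize> {
        let start = self.write_pos;
        self.write_bytes(&[0xAA; JUMP_LEN]);
        start..self.write_pos
    }
}

/// Range a page-granular flush would cover: from the jump to the end of its page.
/// Part of this range may never have been written at all.
fn rest_of_page(jump_start: usize, page_size: usize) -> Range<usize> {
    let page_end = (jump_start / page_size + 1) * page_size;
    jump_start..page_end
}
```

With a jump written at offset 100 of a 4096-byte page, the written range is 20 bytes long, while the rest-of-page range spans 3996 bytes, most of which was never written.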
Ah sorry, I finally understand the actual situation. When the other cb jumps from an early position, there is a large gap before the destination. That makes sense. Never mind my previous reply. I'll write something to improve it.
bad4faf to deb763c
I'm not sure if this is what you meant, but I addressed the problem you described with deb763c.
well, okay
I guess this is what you mean. Maybe we don't want to put aarch64-specific code there, so I can do that instead.
67627cb to da92c5c
da92c5c to 6a0b643
I think this is what you meant: 6a0b643.
Thanks! Sorry about the churn.
Thanks for finding the root cause here. I know this isn't the first time we've had cache invalidation issues lead to these sorts of failures. I would love to think about how we could set up our codeblock/page/codegen API to make those sorts of errors a bit harder to make.
So glad we're catching these problems early. Thanks Kokubun!
* YJIT: Invalidate i-cache for the other cb on next_page
* YJIT: Invalidate only what's written by jmp_ptr
* YJIT: Move the code to the arm64 backend
Follows up #6460.
cb.next_page() could also write some code on cb.other_cb(), and we currently don't invalidate its i-cache on Arm. This PR fixes that problem to avoid Illegal instruction crashes. I confirmed that running yjit-bench's optcarrot under lldb no longer reproduces the problem with this patch.
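The problem described above can be illustrated with a toy model; this is a sketch under stated assumptions, not the patch itself. `ToyBlock`, `jump_to`, `next_page`, and `take_dirty` here are hypothetical and only share names loosely with YJIT. The point it shows is that advancing one block may also write a jump into the other block, so the ranges to invalidate have to be collected from both.

```rust
use std::ops::Range;

/// Assumed jump length in bytes (the review mentions roughly 20 bytes).
const JUMP_LEN: usize = 20;

/// Toy code block: a write cursor plus the ranges written since the last flush.
/// Hypothetical type for illustration only.
struct ToyBlock {
    write_pos: usize,
    dirty: Vec<Range<usize>>,
}

impl ToyBlock {
    /// Pretend to write a jump at the cursor, record the written range,
    /// then move the cursor to `dst`.
    fn jump_to(&mut self, dst: usize) {
        let start = self.write_pos;
        self.dirty.push(start..start + JUMP_LEN);
        self.write_pos = dst;
    }
}

/// Advance `cb` to `dst`. If `other` is still behind `dst`, it is moved too,
/// which writes a jump into `other` as a side effect.
fn next_page(cb: &mut ToyBlock, other: &mut ToyBlock, dst: usize) {
    cb.jump_to(dst);
    if other.write_pos < dst {
        other.jump_to(dst);
    }
}

/// Drain every range that still needs an i-cache flush, across BOTH blocks.
/// Forgetting `other` here corresponds to the bug class described above.
fn take_dirty(cb: &mut ToyBlock, other: &mut ToyBlock) -> Vec<Range<usize>> {
    let mut ranges = std::mem::take(&mut cb.dirty);
    ranges.append(&mut other.dirty);
    ranges
}
```

In this model, calling `next_page` with `cb` at offset 4000 and `other` at offset 100 leaves two dirty ranges, one per block, and a flush that only looked at `cb` would miss the jump written at offset 100.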