YJIT: Invalidate i-cache for the other cb on next_page #6631

Merged: 3 commits, Oct 26, 2022

Conversation

k0kubun
Member

@k0kubun k0kubun commented Oct 25, 2022

Follows up #6460.

cb.next_page() can also write code to cb.other_cb(), and we currently don't invalidate the i-cache for that code on Arm. This PR fixes that to avoid Illegal instruction crashes. I confirmed that running yjit-bench's optcarrot under lldb no longer reproduces the problem with this patch.
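
Editor's sketch of the class of bug described above, assuming a toy model rather than YJIT's real code: `ToyCodeBlock`, `write`, `take_dirty`, and this `next_page` are all hypothetical names. The point it illustrates is that one operation writes into two buffers, so both must report ranges for i-cache invalidation.

```rust
use std::ops::Range;

/// Toy code buffer that remembers which byte ranges were written since the
/// last flush. Hypothetical model; not YJIT's actual CodeBlock.
#[derive(Default)]
struct ToyCodeBlock {
    bytes: Vec<u8>,
    dirty: Vec<Range<usize>>,
}

impl ToyCodeBlock {
    /// Append bytes and record the written range as needing invalidation.
    fn write(&mut self, code: &[u8]) {
        let start = self.bytes.len();
        self.bytes.extend_from_slice(code);
        self.dirty.push(start..self.bytes.len());
    }

    /// Hand back the ranges to invalidate and clear the list.
    fn take_dirty(&mut self) -> Vec<Range<usize>> {
        std::mem::take(&mut self.dirty)
    }
}

/// Moving `cb` to a new page also emits code into `other` (assumed here to be
/// a 4-byte stand-in for a jump). Both blocks end up with dirty ranges.
fn next_page(cb: &mut ToyCodeBlock, other: &mut ToyCodeBlock) {
    cb.write(&[0xAA; 4]);    // stand-in for the jump written in `cb`
    other.write(&[0xBB; 4]); // stand-in for the code written in the other block
}
```

If a caller only drained `cb.take_dirty()` after `next_page`, the range recorded on `other` would never be invalidated, which is the situation the PR description reports.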

@matzbot matzbot requested a review from a team October 25, 2022 21:24
Member

@XrXr XrXr left a comment

Thanks for diagnosing this! This is fine, but what do you think about invalidating inside of set_page, where we write out the jump? Around here:

            // Generate jmp_ptr from src_pos to dst_pos
            self.without_page_end_reserve(|cb| {
                cb.add_comment("jump to next page");
                jmp_ptr(cb, dst_ptr);
                assert!(!cb.has_dropped_bytes());
            });

I think in the case where a page is cut short, the current change invalidates a lot more than necessary because it invalidates until the end of the page that was cut short. That can possibly step on PROT_NONE addresses because we didn't actually write that much. We could also pass a slightly different function that does the invalidation in the ARM backend. It'd invalidate the icache twice for the jump in the assembler's code block, but it's only 20 bytes.
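
Editor's sketch of the range arithmetic behind this suggestion, under stated assumptions: `Page`, `JUMP_SIZE`, `range_to_page_end`, and `range_of_jump` are hypothetical, and the 20-byte jump size is taken from the comment above. It compares invalidating up to the end of a page against invalidating only the bytes the jump occupies.

```rust
use std::ops::Range;

/// Toy model of a code page; offsets are byte positions within one code block.
/// All names here are hypothetical and only illustrate the range arithmetic.
struct Page {
    start: usize, // first byte of the page
    end: usize,   // one past the last byte of the page
}

/// Size assumed for the emitted jump sequence (the review mentions 20 bytes).
const JUMP_SIZE: usize = 20;

/// Coarse strategy: invalidate from where the jump starts up to the page end.
fn range_to_page_end(page: &Page, jump_pos: usize) -> Range<usize> {
    assert!(page.start <= jump_pos && jump_pos <= page.end);
    jump_pos..page.end
}

/// Narrow strategy: invalidate only the bytes the jump actually occupies.
fn range_of_jump(page: &Page, jump_pos: usize) -> Range<usize> {
    assert!(page.start <= jump_pos && jump_pos + JUMP_SIZE <= page.end);
    jump_pos..jump_pos + JUMP_SIZE
}
```

With an assumed 16 KiB page and a jump written at offset 100, the coarse range covers 16284 bytes, most of which were never written, while the narrow range covers 100..120.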

@k0kubun
Member Author

k0kubun commented Oct 25, 2022

I'm okay with moving it to set_page to simplify the logic, but

> I think in the case where a page is cut short, the current change invalidates a lot more than necessary because it invalidates until the end of the page that was cut short. That can possibly step on PROT_NONE addresses because we didn't actually write that much.

I'm not sure this happens. Sure, without_page_end_reserve might add a few unnecessary bytes to the previous page, but that's not "a lot more than necessary". The block skipped by the jump is filtered out by writable_addrs, and the range on the next page will have start == end, which may not do anything and is at least not "a lot more than necessary".

@k0kubun
Member Author

k0kubun commented Oct 25, 2022

> because it invalidates until the end of the page that was cut short

Ah sorry, I finally understand the actual situation. When the other cb jumps from an early position, there is a large gap before the destination. That makes sense. Never mind my previous reply. I'll write something to improve it.

@k0kubun k0kubun force-pushed the yjit-inval-other-cb branch from bad4faf to deb763c Compare October 25, 2022 22:16
@k0kubun
Member Author

k0kubun commented Oct 25, 2022

> We could also pass a slightly different function that does the invalidation in the ARM backend.

I'm not sure if this is what you meant, but I addressed the problem you described with deb763c.

@k0kubun
Member Author

k0kubun commented Oct 25, 2022

well, okay

> It'd invalidate the icache twice for the jump in the assembler's code block, but it's only 20 bytes.

I guess this is what you meant. Maybe we don't want to put aarch64-specific code there, so I can do that instead.

@k0kubun k0kubun force-pushed the yjit-inval-other-cb branch from 67627cb to da92c5c Compare October 25, 2022 22:27
@k0kubun k0kubun force-pushed the yjit-inval-other-cb branch from da92c5c to 6a0b643 Compare October 25, 2022 22:31
@k0kubun
Member Author

k0kubun commented Oct 25, 2022

I think this is what you meant: 6a0b643.

@k0kubun k0kubun requested a review from a team October 25, 2022 22:31
Member

@XrXr XrXr left a comment

Thanks! Sorry about the churn.

@jimmyhmiller
Contributor

Thanks for finding the root cause here. I know this isn't the first time cache invalidation issues have led to these sorts of failures. I would love to think about how we could set up our codeblock/page/codegen API to make those sorts of errors a bit harder to make.

Contributor

@maximecb maximecb left a comment

So glad we're catching these problems early. Thanks Kokubun!

@maximecb maximecb merged commit fa0adba into ruby:master Oct 26, 2022
@maximecb maximecb deleted the yjit-inval-other-cb branch October 26, 2022 15:29
tenderlove pushed a commit to Shopify/ruby that referenced this pull request Oct 27, 2022
* YJIT: Invalidate i-cache for the other cb on next_page

* YJIT: Invalidate only what's written by jmp_ptr

* YJIT: Move the code to the arm64 backend