-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Get String#crypt working with multi-ractor in cases where !HAVE_CRYPT_R #13567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get String#crypt working with multi-ractor in cases where !HAVE_CRYPT_R #13567
Conversation
8d25d6f
to
eb97984
Compare
string.c
Outdated
size_t res_size = strlen(res)+1; | ||
char *dup = malloc(res_size); // need to hold onto lock while duplicating a potentially static buffer | ||
memcpy(dup, res, res_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for readers: We have to do this because of our #define strdup ruby_strdup
. We want to malloc here, not xmalloc. And I don't think there's a portable way to determine the size of crypt's output buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
res = (strdup)(res);
can avoid the macro.
string.c
Outdated
// Don't allocate a ruby object while holding this lock, we could hit a VM barrier, which | ||
// causes a deadlock if other ractors are waiting on this lock. | ||
result = rb_str_new_cstr(dup); | ||
free(dup); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If rb_str_new_cstr
erorrs, this pointer might leak.
I think that may not be worth dealing with as crypt is semi-deprecated https://bugs.ruby-lang.org/issues/14915
I prefer that (very unlikely) leak to the deadlock that will occur in the test suite under ractors.
string.c
Outdated
result = rb_str_new_cstr(res); | ||
|
||
size_t res_size = strlen(res)+1; | ||
char *dup = malloc(res_size); // need to hold onto lock while duplicating a potentially static buffer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The encrypted result would be small enough in alloca
.
And this copy should not be needed if crypt_r
is available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review. I've updated the PR to use alloca
.
Edit: Some CI failures I'll look into it Friday.
eb97984
to
e65f343
Compare
This comment has been minimized.
This comment has been minimized.
e65f343
to
1d25406
Compare
In commit 12f7ba5, ractor safety was added to String#crypt, however in certain cases it can cause a deadlock. When we lock a native mutex, we cannot allocate ruby objects because they might trigger GC which starts a VM barrier. If the barrier is triggered and other native threads are waiting on this mutex, they will not be able to be woken up in order to join the barrier. To fix this, we don't allocate ruby objects when we hold the lock. The following could reproduce the problem: ```ruby strings = [] 10_000.times do |i| strings << "my string #{i}" end STRINGS = Ractor.make_shareable(strings) rs = [] 100.times do rs << Ractor.new do STRINGS.each do |s| s.dup.crypt(s.dup) end end end while rs.any? r, obj = Ractor.select(*rs) rs.delete(r) end ``` I will not be adding tests because I am almost finished a PR to enable running test-all test cases inside many ractors at once, which is how I found the issue. Co-authored-by: jhawthorn <[email protected]>
1d25406
to
f97389e
Compare
@nobu, @jhawthorn: ready for re-review |
In commit 12f7ba5, ractor safety was added to String#crypt, however in certain cases it can cause a deadlock. When we lock a native mutex, we cannot allocate ruby objects because they might trigger GC which starts a VM barrier. If the barrier is triggered and other native threads are waiting on this mutex, they will not be able to be woken up in order to join the barrier. To fix this, we don't allocate ruby objects when we hold the lock.
The following could reproduce the problem:
I will not be adding tests because I am almost finished a PR to enable running test-all test cases inside many ractors at once, which is how I found the issue.