gh-100726: optimize construction of range object for medium sized integers (version 6) #100810
The idea looks good to me; we're missing some error handling in the call to compute_range_length_long, and I have a few style nitpicks.
Objects/rangeobject.c
Outdated
int overflow = 0;

long long_start = PyLong_AsLongAndOverflow(start, &overflow);
if (overflow || (long_start==-1 && PyErr_Occurred()) ) {
Hmm, there's a slightly ugly problem here. If PyErr_Occurred() is true then we never clear the exception, so we leave this function with an exception set. That's okay, but then we should be checking PyErr_Occurred() in the calling function, too. And yes, I think it's true that given that start is a PyLong_Object, the current implementation of PyLong_AsLongAndOverflow can't possibly raise, so this seems like a non-issue. Except that it's not safe to rely on some future version of PyLong_AsLongAndOverflow not raising.
So we either need to check for PyErr_Occurred() in the caller, or go back to the idea of a new PyLong API function that's guaranteed not to raise. I think the extra check should be fairly cheap.
I think I get the point. The current check (long_start==-1 && PyErr_Occurred()) is not good: it never triggers today, and if it did trigger because the implementation of PyLong_AsLongAndOverflow changes, we would not be handling it correctly. Right?
I refactored so the check is like
...
long long_start = PyLong_AsLongAndOverflow(start, &overflow);
if (overflow)
return -1;
if (long_start==-1 && PyErr_Occurred()) {
PyErr_Clear();
return -2;
}
...
For an overflow the documentation of PyLong_AsLongAndOverflow states there is no exception. For the other errors we check the value of long_start (which is fast) in combination with PyErr_Occurred(). If required, we clear the error.
We could also perform the check and clear outside compute_range_length_long, but in this way all the logic for the fast path is contained inside a single function.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Updated benchmark against main:
The commits addressing the review comments do not seem to have changed the performance.
That looks buggy, 20 should take longer than 10.
@pochmann You are right. In the test script I only changed the name for the tests, not the statement tested. Here are the updated results:
With this PR the Performance could be improved for cases like |
I have made the requested changes; please review again
Thanks for making the requested changes! @mdickinson: please review the changes made to this pull request.
Objects/rangeobject.c
Outdated
if (overflow)
    return -1;
if (long_start==-1 && PyErr_Occurred()) {
    PyErr_Clear();
With apologies for fussing about a case that currently can't even happen: this still isn't quite right. If we get an unexpected exception from PyLong_AsLongAndOverflow (and right now, any exception from PyLong_AsLongAndOverflow would count as unexpected), then we'll want to propagate that to the caller rather than clearing it. Otherwise we're doing the C-API equivalent of a Python "except: pass".
So I think all that's needed is to drop the PyErr_Clear() here, and check for a return of -2 in the calling function.
To avoid too much confusion, we could also consider swapping the return values around (so -2 means overflow and -1 means unexpected exception), since returning -1 for an exception is fairly consistent in the rest of the codebase.
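A minimal sketch of the swapped convention suggested above, written in plain C so that it compiles without the CPython headers. Everything here is a hypothetical stand-in rather than code from Objects/rangeobject.c: `fake_int` plays the role of a Python int, `error_pending` the role of the interpreter's exception state, and `fake_as_long_and_overflow` mimics the documented contract of PyLong_AsLongAndOverflow (return -1 with the overflow flag set and no exception on overflow; return -1 with an exception set on any other error).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a Python int: it fits in a C long, is too
   large for one, or makes the conversion fail unexpectedly. */
typedef enum { FITS, TOO_BIG, RAISES } fake_kind;
typedef struct { fake_kind kind; long value; } fake_int;

/* Hypothetical stand-in for "an exception is currently set". */
static bool error_pending = false;

/* Mimics the contract of PyLong_AsLongAndOverflow (assumption: only the
   positive-overflow case is modelled). */
static long
fake_as_long_and_overflow(fake_int v, int *overflow)
{
    assert(overflow != NULL);
    *overflow = 0;
    if (v.kind == TOO_BIG) {
        *overflow = 1;          /* overflow: no exception is set */
        return -1;
    }
    if (v.kind == RAISES) {
        error_pending = true;   /* any other failure: exception is set */
        return -1;
    }
    return v.value;
}

/* Fast path for the length of range(0, stop).
   Returns a non-negative length, -2 if the fast path does not apply
   (overflow), or -1 with the error left pending for the caller. */
static long
fast_length(fake_int stop)
{
    int overflow = 0;
    long long_stop = fake_as_long_and_overflow(stop, &overflow);
    if (overflow) {
        return -2;
    }
    if (long_stop == -1 && error_pending) {
        return -1;              /* do not clear: let the caller see it */
    }
    return long_stop > 0 ? long_stop : 0;
}

typedef enum { FAST_OK, NEEDS_SLOW_PATH, FAILED } outcome;

/* Caller: distinguishes "fall back to the slow path" from "propagate". */
static outcome
range_length(fake_int stop, long *length)
{
    long len = fast_length(stop);
    if (len == -1) {
        return FAILED;          /* error_pending stays set */
    }
    if (len == -2) {
        return NEEDS_SLOW_PATH; /* no error is pending here */
    }
    *length = len;
    return FAST_OK;
}
```

The point of the design is that the helper never swallows an error: -2 is a purely informational "try the general path" signal, while -1 always means an exception is pending, which matches how -1 is used elsewhere.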
You have a point. My line of reasoning was to bail out in case of any error, clear the exception and then continue with the regular path (which could encounter and then handle the same error). I updated the PR to not clear the error, but check for the return value in the calling function. In the hypothetical case (since we both agree it cannot happen) such an error is propagated, we can either clear the error in the caller or propagate again. I have chosen the latter.
The choice for -1 was because that is used for overflow in PyLong_AsLongAndOverflow. But PyLong_AsLongAndOverflow also uses -1 for error checking, so I agree that using -1 for the exception case is more in line with the rest of the code.
I also added assert statements to check that in compute_range_length the arguments have PyLong_Check equal to 1.
@mdickinson The reasoning about the case that cannot happen is actually quite useful, as there was a bug in the code. The problem was that get_len_of_range guarantees that the result fits into an unsigned long, but we cast to long without any checks. I added a check and a regression test. (Not sure what the correct way to check this in CPython is; I just checked whether the result of the cast is negative.)
(not sure what the correct way is to check this in CPython, I just checked the result of the cast is negative)
Thanks. That's not totally safe in standard C, since out-of-range conversions from unsigned to signed are implementation-defined (C99 §6.3.1.3p3). I think we want to emulate what the other callers of get_len_of_range do and compare the return value to LONG_MAX.
I took the liberty of pushing a commit that does this, along with a couple of drive-by style and consistency fixes. If you're okay with the latest commit, I'll merge this PR once CI completes. And if not, I'm happy to revert and/or discuss further.
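A hedged sketch of the LONG_MAX comparison described above, in standalone C. `len_of_range_sketch` is a hypothetical helper that only imitates the idea of get_len_of_range (computing the length in unsigned arithmetic); it is not the CPython source, and it assumes unsigned long has one more value bit than long, as on mainstream platforms. `checked_length` and its -2 return value are likewise illustrative names.

```c
#include <assert.h>
#include <limits.h>

/* Number of items in range(lo, hi, step), computed in unsigned long so
   that even the widest range, (LONG_MIN, LONG_MAX, 1), does not
   overflow. Unsigned arithmetic wraps modulo ULONG_MAX + 1, which is
   well defined in C. */
static unsigned long
len_of_range_sketch(long lo, long hi, long step)
{
    assert(step != 0);
    if (step > 0 && lo < hi) {
        return 1UL + ((unsigned long)hi - 1UL - (unsigned long)lo)
                     / (unsigned long)step;
    }
    if (step < 0 && lo > hi) {
        return 1UL + ((unsigned long)lo - 1UL - (unsigned long)hi)
                     / (0UL - (unsigned long)step);
    }
    return 0UL;   /* empty range */
}

/* Returns the length as a long, or -2 when it does not fit.
   The range check happens BEFORE the conversion, so the cast is only
   performed on values that a long can represent. */
static long
checked_length(long lo, long hi, long step)
{
    unsigned long ulen = len_of_range_sketch(lo, hi, step);
    if (ulen > (unsigned long)LONG_MAX) {
        return -2;
    }
    return (long)ulen;
}
```

Testing the sign of `(long)ulen` after the cast would rely on implementation-defined behavior; comparing against LONG_MAX first keeps the conversion inside the range where the standard guarantees the value is preserved.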
@mdickinson Thanks for the improvement with the cast. Changes are fine with me.
…to range_fast_path_v6
🤖 New build scheduled with the buildbot fleet by @mdickinson for commit e037563 🤖 If you want to schedule another build, you need to add the |
🤖 New build scheduled with the buildbot fleet by @mdickinson for commit 605256d 🤖 If you want to schedule another build, you need to add the |
LGTM. Waiting for buildbots.
Three buildbot failures: a refleak in test_typing, a failure in test_asyncio, and a stack overflow. None of them appear to be related to this issue.
Thanks for reviewing!
This is a simplified version of #100726.
Benchmark against main: