BUG: fix choose refcount leak #24188
np.choose(np.zeros(10000, dtype=int), [a], out=a)
np.choose(np.zeros(10000, dtype=int), [a], out=a)
refc_end = sys.getrefcount(1)
assert refc_end - refc_start < 10
could probably go less than 10 here, though I think each incantation can at least do +1
In case you iterate: Randomly realized that this actually stopped doing anything useful on Python 3.12, because IIRC the literal 1 will be an immortal object (and thus its reference count doesn't change reliably).
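The immortality point can be illustrated with a small standard-library sketch. It is only an illustration under stated assumptions: `refcount_delta` is a hypothetical helper, not part of the NumPy test suite, and the behaviour for the literal `1` depends on the CPython version.

```python
import sys


def refcount_delta(obj, n=1000):
    """Return how much sys.getrefcount(obj) grows while n extra references exist.

    Hypothetical helper for illustration only.
    """
    before = sys.getrefcount(obj)
    holder = [obj] * n  # the list holds n new references to obj
    after = sys.getrefcount(obj)
    del holder  # release the extra references again
    return after - before


# A freshly created object is an ordinary (mortal) object, so every new
# reference is visible in its reference count.
delta_fresh = refcount_delta(object())

# The literal 1 is a cached small int. On CPython 3.12+ (PEP 683) it is
# expected to be immortal, so its count should not track the new
# references; on 3.11 and earlier it does. No specific value is assumed here.
delta_literal = refcount_delta(1)

print("fresh object:", delta_fresh)
print("literal 1:  ", delta_literal)
```

A leak check built on `sys.getrefcount(1)` can therefore pass on 3.12 even when references are leaked, which is why measuring a non-cached object is the safer choice.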
0f91f7d to a1e5395
Thanks, we are trying a bit to avoid copyswap; unfortunately, I admit, it would be nice if the replacement was slightly shorter, e.g. see gh-24176. Happy to do this first, though! Is this much slower for e.g.
@@ -1074,12 +1077,11 @@ PyArray_Choose(PyArrayObject *ip, PyObject *op, PyArrayObject *out,
break;
}
}
memmove(ret_data, PyArray_MultiIter_DATA(multi, mi), elsize);
copyswap(ret_data, PyArray_MultiIter_DATA(multi, mi), swap, NULL);
I'd prefer not adding new copyswap usages if we can avoid it, since it will cause seg faults with new dtypes. Happy to push a fixup here if you don't have time to look at the other PRs replacing copyswap with a single-element cast.
a1e5395 to 7ba92a4
This PR has been revised based on reviewer feedback to replace
Agreed this is a lot of boilerplate. It probably makes sense to have a dtype API slot that implements a single-element copy; then you wouldn't need to deal with the flags or the strides (I think?). You'd still need two stages to get the copy function from the dtype and possibly set up the auxdata, but we could certainly simplify this sort of thing a bunch.
Anyway, thanks for rewriting this in a more forward-compatible way while we wait on polishing out the rough edges in the new dtype API.
@seberg Sounds like this can be improved, but is it good enough for now?
Just noticed it's also missing the xfree. I am happy with just moving it out of the inner-loop, but I do think that needs to be done. I can do that though if @tylerjereddy prefers.
I did try hoisting it up last week but got segfaults, will check again today.
Latest set of changes:
Thanks, I added three commits to:
Also thought it would be clearer to use
EDIT: Also rebased, due to the noblas enforcement changes hitting CI
* Fixes numpy#22683
* use `copyswap` to avoid the reference count leaking reported above when `np.choose` is used with `out`
* my impression from the ticket is that Sebastian doesn't think `copyswap` is a perfect solution, but may suffice short-term?
* remove copyswap, carefully controlling the reference counting to pass the testsuite
* hoist the special `out` handling code out of the inner loop (to the degree the testsuite allowed me to)
* add a missing `NPY_cast_info_xfree`
* adjust the regression test such that it fails before/passes after on both Python 3.11 and 3.12 beta 4, to deal with PEP 683
Also hoist the `dtype` definition up and use it.
Not sure this makes a difference, but we check for memory overlap, so `memmove` isn't necessary, and if the compiler keeps the order intact, we want the `memcpy` path to be the hot one.
cf32ef1 to c176234
Looks good to me, going to merge. Thanks for pushing this through @tylerjereddy and getting it across the line @seberg!
* Fixes BUG: Choose (and probably more item selection) leaks references with `out=` #22683
* use `copyswap` to avoid the reference count leaking reported above when `np.choose` is used with `out`
* my impression from the ticket is that Sebastian doesn't think `copyswap` is a perfect solution, but may suffice short-term?