ENH: where for ufunc reductions #12644
Looks pretty good!
numpy/core/src/umath/ufunc_object.c
Outdated
npy_intp count = *countptr;
char *maskptr = dataptrs[2];
char mask = *maskptr;
if (strides[2] != 0) {
Can you add a mask_stride alias for this?
Done
numpy/core/src/umath/reduction.c
Outdated
    goto fail;
}
op_flags[2] = NPY_ITER_READONLY |
              NPY_ITER_ALIGNED;
What's the aligned for here?
Yes, I guess for bools that is pretty senseless. Removed.
numpy/core/src/umath/reduction.c
Outdated
@@ -493,6 +484,13 @@ PyUFunc_ReduceWrapper(PyArrayObject *operand, PyArrayObject *out,
    Py_INCREF(op_view);
}
else {
    /* Cannot use where when we initialize from the operand */
This prevents passing where=np.True_ into identity-less ufuncs, despite allowing where=True. I think we might want to detect empty masks on a per iterator-loop basis, to allow both.
Fine to leave to another PR.
Indeed, I prefer to leave this for another PR, as it needs some thought about what the ideal behaviour is (or even how to check whether an empty mask occurs).
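To make the distinction discussed above concrete, here is a minimal sketch against a released NumPy that has the where keyword for reductions. The array and mask names are illustrative only. How np.True_ is treated without initial is not asserted here, since it may differ between NumPy versions; the code only reports what happens.

```python
import numpy as np

a = np.array([3.0, 1.0, 2.0])

# The Python singleton True is the default and behaves like "no mask",
# so an identity-less ufunc such as np.minimum needs no initial value.
print(np.minimum.reduce(a, where=True))

# With an all-False (empty) mask nothing is reduced, so the result is
# simply the value passed as `initial`.
empty = np.zeros(a.shape, dtype=bool)
print(np.minimum.reduce(a, where=empty, initial=np.inf))

# np.True_ is an array scalar, not the singleton. Report the outcome
# rather than assume it (assumption: a rejection surfaces as ValueError).
try:
    print(np.minimum.reduce(a, where=np.True_))
except ValueError as exc:
    print("ValueError:", exc)
```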
numpy/core/src/umath/reduction.c
Outdated
if (wheremask != NULL) {
    PyErr_SetString(PyExc_RuntimeError,
            "Reduce operations with no identity only support "
            "a where mask if 'initial' is passed in.");
Might be nice to include the ufunc name in this message.
Yes. Actually, what error do you think this should be? RuntimeError is more for something that is unexpected, whereas this is not. Just ValueError? I guess I should catch it earlier, in the keyword getting stage (I can keep it here just in case, though ReduceWrapper is not part of the C API, so it is just for internal checking).
I made it a ValueError just above, where we check multiple axes and reorderable as well.
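As a rough illustration of the behaviour settled on in this thread, the sketch below uses a released NumPy that includes the where keyword. It shows that a ufunc with an identity accepts a mask directly, while an identity-less ufunc raises ValueError unless initial is given. The variable names are illustrative, not taken from the PR.

```python
import numpy as np

a = np.array([5, 2, 7, 1])
mask = np.array([True, False, True, False])

# np.add has an identity (0), so a mask alone is enough: 5 + 7.
total = np.add.reduce(a, where=mask)

# np.minimum has no identity: a mask without `initial` is rejected.
try:
    np.minimum.reduce(a, where=mask)
    raised = False
except ValueError:
    raised = True

# Supplying `initial` makes the masked reduction well defined.
largest_int = np.iinfo(a.dtype).max
smallest = np.minimum.reduce(a, where=mask, initial=largest_int)

print(total, raised, smallest)
```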
Neat!
Can you explain why? |
numpy/core/src/umath/ufunc_object.c
Outdated
         goto fail;
     }
 }
 else {
-    if (!PyArg_ParseTupleAndKeywords(args, kwds, "O|OO&O&iO:reduce",
+    if (!PyArg_ParseTupleAndKeywords(args, kwds, "O|OO&O&iOO:reduce",
Can you add a $ here to make where keyword-only?
Yes, makes sense.
Actually, the way sum is done in _methods.py, this needs to be kept as a positional argument.
Well, only if we want to keep the 10% performance boost on tiny arrays. I suppose most users will pass by keyword-argument anyway, so there's little to gain by requiring it to be keyword-only.
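For reference, a small sketch of the two calling styles being weighed here. It assumes the positional order (array, axis, dtype, out, keepdims, initial, where), which matches the format string in the diff above with where appended last. The data is made up for illustration.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
mask = a % 2 == 0                # keep even entries only

# Positional call, in the assumed order:
# (array, axis, dtype, out, keepdims, initial, where)
by_position = np.add.reduce(a, 1, None, None, False, 0, mask)

# Equivalent keyword call, which is what most users would write.
by_keyword = np.add.reduce(a, axis=1, initial=0, where=mask)

print(by_position, by_keyword)
```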
I think I may rather want to skip the |
@shoyer - @njsmith - the reason for having to pass in |
@mhvk sorry, I think I interpreted it the wrong way around. I thought |
4d8990a to b9b03cd
OK, now with initial comments addressed and more tests and docs.
So if you have a reduction that has no identity, and you try to use |
Right now, without an identity and initial, using |
Without doing it, I don't think we can use it to implement things like summary masked arrays of dtype=object |
numpy/core/_add_newdocs.py
Outdated
@@ -4876,7 +4876,7 @@

 add_newdoc('numpy.core', 'ufunc', ('reduce',
     """
-    reduce(a, axis=0, dtype=None, out=None, keepdims=False, initial)
+    reduce(a, axis=0, dtype=None, out=None, keepdims=False, initial, *, where=True)
This * is not true - I assume it's a remnant of the rejected suggestion to implement it this way.
Oops, yes, corrected.
numpy/core/src/umath/override.c
Outdated
-/* Remove initial=np._NoValue */
-if (i == 5 && obj == NoValue) {
+/* Remove {initial,where}=np._NoValue */
+if (i >= 5 && obj == NoValue) {
Why is where=np._NoValue even allowed?
It gets used by sum etc. in fromnumeric. In principle, though, those also strip it away again (unless overridden with __array_function__). Unlike initial, it does not get used in _methods, so I think it can indeed be removed. Will do so.
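From the user's side, the forwarding described above means the high-level function and the ufunc method should agree. A minimal check, assuming a released NumPy in which np.sum accepts where (the data is illustrative):

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
mask = np.array([[True, False], [True, True]])

# np.sum forwards `where` to the underlying add reduction.
via_sum = np.sum(a, axis=0, where=mask)
via_reduce = np.add.reduce(a, axis=0, where=mask)

print(via_sum, via_reduce)
```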
I don't quite follow. Currently, the p.s. For object arrays one could pass in a special |
@eric-wieser - on |
p.s. I guess the first use of this may be in |
That's a really neat idea!
There's no need to pass I think you're still going to have a bad time using |
numpy/core/src/umath/ufunc_object.c
Outdated
/*
 * Optimization: where=True is the same as no where argument.
 * This lets us document it as a default argument.
 */
This optimization needs to come before calling PyArray_DescrFromType, else you leak dtype.
Actually, I think you might leak it no matter what.
PyArray_FromAny steals the reference to the dtype, so indeed the dtype should be created inside the if clause.
Actually, turns out I wrote a _wheremask_converter when redoing the keyword parsing for the normal ufunc. Might as well use it...
Other than the reference leak above, this looks good to go for me.
2dcb83c to bb9b9bb
OK, fixed the reference issue by re-using |
bb9b9bb to 489483d
Re: where this might be used: I have a ducktype'd MaskedArray available here (link) To use this PR there I would just need to modify the |
Also, one case I thought about a lot has to do with non-associative (?) reductions. For instance:
>>> np.equal.reduce([False, False])
True
>>> np.equal.reduce([False, False, False])
False
If you work it out, you'll see that the end result depends on how successive values alternate. Therefore, there is a difference between replacing a value by an identity element before reducing (my strategy), and not using the element in the reduction in the first place (this PR, correct?).
Right, but |
@ahaldane - yes, I think this should work well for the general masking case. The one problem I noted earlier is that whatever is being masked has to be able to deal with p.s. Will look at your |
In this implementation, if the ufunc does not have an identity, it needs an initial value to be supplied.
489483d to 5afe650
Nice and simple enhancement. Would be nice to add the comparison benchmark at some point.
Yeah, looked good to me as well. Only thing to maybe quickly follow up on: Do we want to allow non-boolean masks? I think I would prefer to not allow it for now, but the code seems to me like it does, and there should probably be a test for it.
Sorry, never mind, the loop uses "safe" casting; I thought I skimmed an unsafe cast there somewhere.
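The effect of "safe" casting on the mask can be sketched as follows. The expectation that an integer array is rejected with a TypeError is an assumption based on the safe-casting rule mentioned above, so the code only reports the outcome instead of relying on it. Names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])

# A boolean mask is used as is: 1 + 3.
bool_mask = np.array([True, False, True])
print(np.add.reduce(a, where=bool_mask))

# An integer array cannot be cast to bool under the "safe" rule, so this
# is expected to be rejected (assumption: the failure is a TypeError).
int_mask = np.array([1, 0, 1])
try:
    print(np.add.reduce(a, where=int_mask))
except TypeError as exc:
    print("TypeError:", exc)

# Converting explicitly states the intent and avoids the question.
print(np.add.reduce(a, where=int_mask.astype(bool)))
```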
I added a new issue #12662 to use this in |
@seberg - mask array creation and casting is now identical to what is used for the regular ufuncs, so OK to the extent those are OK!
I added a commit using it in the MaskedArray ducktype, and it passes my test suite and gives a fair speedup.
Setup:
>>> np.random.seed(12345)
>>> x = np.random.rand(515, 512)
>>> a = MaskedArray(x, x < 0.5)
Before this PR:
>>> %timeit np.add.reduce(a, axis=1)
1.63 ms ± 8.96 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
After this PR:
>>> %timeit np.add.reduce(a, axis=1)
1.39 ms ± 2.37 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
The case I tested might be a worst-case for this PR, too, since it has lots of alternating masked elements.
@ahaldane - happy to see that it helps! And, yes, that'll definitely be close to worst-case performance.
Apologies, I mixed myself up on the timings. Here is the correct comparison:
Before:
>>> a = MaskedArray(x, x < 0.9)
>>> %timeit np.add.reduce(a, axis=1)
751 µs ± 3.55 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> a = MaskedArray(x, x < 0.5)
>>> %timeit np.add.reduce(a, axis=1)
1.4 ms ± 3.07 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> a = MaskedArray(x, x < 0.1)
>>> %timeit np.add.reduce(a, axis=1)
748 µs ± 16 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
After:
>>> a = MaskedArray(x, x < 0.9)
>>> %timeit np.add.reduce(a, axis=1)
600 µs ± 1.97 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> a = MaskedArray(x, x < 0.5)
>>> %timeit np.add.reduce(a, axis=1)
1.63 ms ± 3.13 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> a = MaskedArray(x, x < 0.1)
>>> %timeit np.add.reduce(a, axis=1)
684 µs ± 15.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
So if I did the timings right, it seems to be faster in the two biased cases, but slower in the worst case.
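For anyone wanting to try a comparison of this kind without the MaskedArray ducktype, here is a rough stand-in using plain arrays. It contrasts filling masked entries with the additive identity before reducing against skipping them with where. This is only a sketch under that assumption, not the benchmark used above, and the function names are made up; timings will vary by machine and NumPy version.

```python
import timeit

import numpy as np

rng = np.random.default_rng(12345)
x = rng.random((512, 512))
masked = x < 0.5  # True marks elements to ignore (roughly alternating)


def fill_then_reduce():
    # Replace masked entries by the identity of addition, then reduce.
    return np.add.reduce(np.where(masked, 0.0, x), axis=1)


def reduce_with_where():
    # Skip masked entries inside the reduction itself.
    return np.add.reduce(x, axis=1, where=~masked)


for func in (fill_then_reduce, reduce_with_where):
    seconds = timeit.timeit(func, number=20) / 20
    print(f"{func.__name__}: {seconds * 1e3:.3f} ms per call")
```

Changing the threshold in `x < 0.5` to 0.1 or 0.9 gives the more biased masks used in the comparison above.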
@ahaldane - OK, pity that it isn't always faster! I think in part it may be the final change of doing this in a nested |
Yeah, the only way to get better is probably to have specialized inner loop functions that handle the mask internally... Which makes it a ternary ufunc...
@seberg - I've made some progress with "chaining ufuncs", which execute inner loops in sequence (see https://github.com/mhvk/chain_ufunc). I think those could solve problems like this too.
@mhvk yeah, chaining is another cool addition that is probably reasonably hard to achieve even. It could achieve something similar, although not for optimization purposes (that would basically need a dedicated specialization). I wonder how far we should go down the line doing things similar to numexpr or libraries that do lazy evaluation with optimization.
Agreed that it is not clear how far one should go. The chaining I was looking at was mostly for use in quantities, where often you need to multiply one input with a constant to get the right units for the operation. It helps less than I hoped...
Note: This is a simpler and more robust, yet faster, version of #12635 and #12640 (way too much time wasted on being clever, but the comments helped steer to this better solution).
This introduces a where keyword for reductions. It works well, but for ufuncs with no identity, one has to explicitly pass in an initial. If people agree this is the way to go, I will add documentation and more tests.
Unlike for my other attempts, this performs quite well and I think we should consider using it in nanfunctions and in MaskedArray:
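As an illustration of the nanfunctions idea, the sketch below reproduces np.nansum and np.nanmax results with masked reductions in a released NumPy. It is not how the nanfunctions are implemented, and the choice of -inf as initial is an assumption that only suits rows with at least one valid entry (an all-NaN row would give -inf here).

```python
import numpy as np

a = np.array([[1.0, np.nan, 3.0],
              [np.nan, 5.0, 6.0]])
valid = ~np.isnan(a)

# add has an identity, so the mask alone suffices.
nan_sum = np.add.reduce(a, axis=1, where=valid)

# maximum has no identity, so an explicit initial value is required.
nan_max = np.maximum.reduce(a, axis=1, where=valid, initial=-np.inf)

print(nan_sum, nan_max)
```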