ENH: Generalized ufunc signature expansion for frozen and flexible dimensions #11175
Should have @njsmith here also.
What is the way forward? Does this replace #11132 or build upon it?
After reading the comment in #11132 it seems you want to first implement fixed core dimensions and then add flexible ones? Edit: it seems to make sense to add them both together since the changes are not really orthogonal. The fixed dimension needs tests for the
I'm in the process of rebasing - I'll remove the matmul specific commit.
Force-pushed from ceffc80 to 517fd34.
Force-pushed from 517fd34 to 4826b24.
OK, I rebased this and think it is now ready for review. In rebasing, I used @jaimefrio's original commit for frozen dimensions, so that it is still attributed properly. I similarly used @mattip's commits, but squashed to 2. Note that I rebased on top of #11173 and #11176, to avoid having another difficult rebase after those have been merged. One note: the doc changes are just @mattip's - I've made more elaborate changes in the broadcasting follow-up, but hoped not to have to split the relevant parts out.
Force-pushed from b71ef62 to d4c6396.
Hah, the failing test from 32 bit exposed an interesting bug: previously, @mattip had implemented the flexible dimensions such that the elementary function had to check whether a dimension was zero, and then swap strides. But with matrix-multiply, this changed the behaviour of one of the test cases, in which it is called with So, this was clearly fragile. But fortunately also not needed: by just passing on the strides in the correct order, this is solved and, as one would like, the elementary function doesn't have to know about whether flexible dimensions are being used. Much nicer! A nice side benefit is that
numpy/core/src/umath/ufunc_object.c
Outdated
@@ -2199,10 +2362,10 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,

/* Use remapped axes for generalized ufunc */
int broadcast_ndim, iter_ndim;
int core_num_dims_array[NPY_MAXARGS];
int *core_num_dims;
int core_num_dims[NPY_MAXARGS];
As per discussion in PR #11176, this should be renamed
Need to think a bit about the logic of the names, since we're copying and adjusting two arrays: `core_num_dims` and `core_dim_sizes`. In both cases, the copies represent the actual number of dimensions that the operands have and the sizes that they imply. So, long versions could be `actual_core_num_dims` and `actual_core_dim_sizes` - or remove `_core`.
Alternatively, I have been wondering whether it would make sense to have a mini-struct `actual` that had whatever parts of the ufunc would need changing (could be expanded to include `core_dim_ixs` for "calculated" output indices, e.g.). So that would mean `actual->core_num_dims`, `actual->core_dim_sizes`. But this could be done in a separate PR as well.
p.s. I'm also not so happy that as is, `core_num_dims` gets copied even if in standard usage it never gets adjusted. But I guess it is only a few numbers, so very little overhead...
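For illustration only, the mini-struct idea in this thread might look roughly like the sketch below. Everything prefixed `sketch_` is hypothetical (including the `SKETCH_MAXARGS` and `SKETCH_MAXDIMS` limits and the `n_distinct` field); none of it is taken from the PR or from NumPy's headers. The point is just that the per-call copies live together and can be adjusted without touching the values parsed from the signature.

```c
#include <string.h>

#define SKETCH_MAXARGS 32   /* hypothetical stand-in for NPY_MAXARGS */
#define SKETCH_MAXDIMS 32   /* hypothetical cap on distinct core dimensions */

/* Hypothetical stand-in for the few ufunc fields discussed in the thread. */
typedef struct {
    int nargs;                    /* number of operands */
    int n_distinct;               /* number of distinct core dimensions */
    const int *core_num_dims;     /* per-operand count, from the signature */
    const long *core_dim_sizes;   /* per-dimension size, from the signature */
} sketch_ufunc;

/* Per-call copies that may be adjusted without changing the ufunc itself. */
typedef struct {
    int core_num_dims[SKETCH_MAXARGS];
    long core_dim_sizes[SKETCH_MAXDIMS];
} sketch_actual;

/* Copy the signature-derived values into the per-call struct. */
static void
sketch_actual_init(sketch_actual *actual, const sketch_ufunc *ufunc)
{
    memcpy(actual->core_num_dims, ufunc->core_num_dims,
           sizeof(int) * (size_t)ufunc->nargs);
    memcpy(actual->core_dim_sizes, ufunc->core_dim_sizes,
           sizeof(long) * (size_t)ufunc->n_distinct);
}
```

As the p.s. notes, the copy is only a handful of integers per call, so the overhead of always copying should be small.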
numpy/core/src/umath/ufunc_object.c
Outdated
@@ -2538,7 +2715,7 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,
*/
core_dim_ixs_size = 0;
for (i = 0; i < nop; ++i) {
core_dim_ixs_size += core_num_dims[i];
core_dim_ixs_size += ufunc->core_num_dims[i];
Should this be precalculated and stored as part of the ufunc struct?
I wondered about that, but it seems a bit excessive. Another thing one might do is to store it at the end of `core_offsets`, so that it really equals `cumsum(core_num_dims)`.
p.s. If we want to go this route, I do think this should be a different PR -- as long as we don't have a release, we are free to change the struct even without a version number change. And my overall sense is that it is rarely really needed: even now, one can just do
core_dim_ixs_size = ufunc->core_offsets[ufunc->nargs - 1] + ufunc->core_num_dims[ufunc->nargs - 1]
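To make the equivalence in this thread concrete, here is a small sketch. It assumes (consistent with the one-liner above, but not confirmed by the source) that `core_offsets[i]` holds the sum of `core_num_dims[0..i-1]`. The `sketch_` helpers are hypothetical and operate on plain arrays rather than on a real `PyUFuncObject`.

```c
/* Fill core_offsets so that core_offsets[i] = sum of core_num_dims[0..i-1]. */
static void
sketch_fill_offsets(const int *core_num_dims, int nargs, int *core_offsets)
{
    int total = 0;
    for (int i = 0; i < nargs; ++i) {
        core_offsets[i] = total;
        total += core_num_dims[i];
    }
}

/* The loop from the diff hunk: add up the core dimensions of all operands. */
static int
sketch_size_by_loop(const int *core_num_dims, int nargs)
{
    int size = 0;
    for (int i = 0; i < nargs; ++i) {
        size += core_num_dims[i];
    }
    return size;
}

/* The one-liner from the comment: last offset plus last count. */
static int
sketch_size_by_offsets(const int *core_num_dims, const int *core_offsets,
                       int nargs)
{
    return core_offsets[nargs - 1] + core_num_dims[nargs - 1];
}
```

For a matmul-like signature with two core dimensions per operand, both helpers give 6, which is why storing the total separately is rarely needed.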
tests for fixed dim signature still missing? The
Force-pushed from d4c6396 to a4d86ae.
Rebased to get rid of conflicts, and now including tests of
@charris, @njsmith, Do you have a sense of how controversial allowing flexible/fixed gufunc core signatures is? Can we go ahead and merge this new feature soonish or does it need another round on the mailing list? Note this is the enabler for changing If it is good to go, it should get an entry in
reformatted, reworked flag, removed version from struct and docs.
Piping in from what has become the sidelines: I actually chose the flag with some care, in that if a flag is set, it implies the code needs to do extra work. And a constant size is the obviously simpler case ;-) If the double negative is a problem, it could be Though I will add that the logic became truly obvious only when I added the broadcasting, when I also had
I have no particular preference. Note the flag is never read in the code itself, only in tests. Perhaps removing it is another way to resolve the discussion, until the broadcast proposal arises again. Edit: the check can be rewritten to use
@mattip - I think I'd prefer to keep the flag since there is the other one for
That's perhaps a reasonable interpretation. I don't think it matters too much - the only reason I mention it is this is becoming our public API, so once we choose we're stuck with it.
How about
Force-pushed from d60316d to 7ef2b3a.
Updated to UFUNC_CORE_DIM_SIZE_INFERRED, and enhanced documentation to cross-reference
.. c:member:: npy_intp *PyUFuncObject.core_dim_sizes

For each distinct core dimension, the possible
:ref:`frozen <frozen>` size (``-1`` if not frozen)
Why are we using both `-1` and `UFUNC_CORE_DIM_SIZE_INFERRED` to indicate this?
@mhvk ?
I think that's a leftover from the initial work by Jaime. I think it is fine to remove the guarantee that it is `-1` if not frozen, as indeed the flag already indicates that (and perhaps we want to use negative numbers to indicate something else in the future).
reworded
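A rough sketch of what "the flag already indicates that" could mean for calling code: test the per-dimension flag instead of comparing the size against `-1`. The flag value, the `core_dim_flags` array name, and the `sketch_` helpers are all assumptions made for illustration; consult the NumPy headers for the real definitions.

```c
#include <stdint.h>

/* Assumed value, for illustration only; not taken from the source. */
#define SKETCH_CORE_DIM_SIZE_INFERRED 0x0002u

/* Returns 1 if core dimension i has a size fixed by the signature. */
static int
sketch_dim_is_frozen(const uint32_t *core_dim_flags, int i)
{
    return (core_dim_flags[i] & SKETCH_CORE_DIM_SIZE_INFERRED) == 0;
}

/*
 * Returns the frozen size of dimension i, or `fallback` when the size is
 * inferred from the operands. The stored size is never compared with -1.
 */
static long
sketch_frozen_size(const uint32_t *core_dim_flags, const long *core_dim_sizes,
                   int i, long fallback)
{
    if (sketch_dim_is_frozen(core_dim_flags, i)) {
        return core_dim_sizes[i];
    }
    return fallback;
}
```

Relying on the flag alone leaves negative sizes free to mean something else later, which is the motivation given in the reply above.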
@@ -43,3 +43,5 @@
# PyArray_SetWritebackIfCopyBase and deprecated PyArray_SetUpdateIfCopyBase.
0x0000000c = a1bc756c5782853ec2e3616cf66869d8

# Version 13 (Numpy 1.16) Size of PyUFuncObject changed
Again, this should mention the new struct member names
fixed

- Never declare a non-pointer instance of the struct
- Never perform pointer arithmatic
- Never use ``sizof(PyUFuncObject)``
typo
fixed
of NumPy. To ensure compatibility:

- Never declare a non-pointer instance of the struct
- Never perform pointer arithmatic
typo: arithmetic
fixed
Force-pushed from 6ac20e2 to 1205e19.
Fixed documentation review issues.
Any more comments or should I squash this down?
Only trivia.
@@ -698,7 +715,7 @@ PyUFunc_Type
PyUFuncGenericFunction *functions;
void **data;
int ntypes;
int reserved1;
int version;
I thought we were leaving this as `reserved1` (at least that is what is described below).
reverted
 * Convert a string into a number
 */
static npy_longlong
_get_size(const char* str)
Seems like the signature is still `npy_int` rather than `npy_intp`?
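As a sketch of the kind of helper under review here (not the PR's implementation), the function below parses a frozen dimension size from a signature string and returns it as a pointer-sized integer, with `intptr_t` standing in for `npy_intp`. The name `sketch_get_size`, the use of the standard `strtoll`, and the specific rejection rules are assumptions for illustration.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a decimal size at the start of str.
 * Returns -1 if no digits are found, if the number runs straight into a
 * letter or underscore (e.g. "3n"), if it is negative, or if it does not
 * fit in intptr_t.
 */
static intptr_t
sketch_get_size(const char *str)
{
    char *stop;
    long long size;

    errno = 0;
    size = strtoll(str, &stop, 10);
    if (stop == str || isalpha((unsigned char)*stop) || *stop == '_') {
        /* not a well-formed number */
        return -1;
    }
    if (errno == ERANGE || size < 0 || size > INTPTR_MAX) {
        /* out of range for a dimension size */
        return -1;
    }
    return (intptr_t)size;
}
```

Returning a pointer-sized type keeps the result consistent with how array dimension sizes are stored, which is the concern raised in the comment above.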
@@ -2429,72 +2609,41 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,
}
}
/*
* If keepdims is set and true, signal all dimensions will be the same.
* If keepdims is set and true, which means all input dimensions are
Either "which" -> "this", or better, replace the period with a comma (and lower-case "signal")
changed
fixed comments from @mhvk
@mhvk, @eric-wieser anything I can do to move this forward?
@mattip - you fixed my trivial comments, so I'm happy; thanks for carrying this on, after me hijacking it.
Thanks @mhvk, @eric-wieser for the patience. I merged this after consulting with core developers in the weekly status meeting, even though by now I am probably considered more a contributor than a reviewer.
@mattip - thank you for first starting, then commenting, and finally shepherding this on! And thanks, @eric-wieser, for the as always very useful comments/critique! I do hope to get back to working on speeding up the ufuncs... Though perhaps playing with
EDIT: now the replacement of #11132
An alternative to #11132, very much following its logic, but (I think) clearer. Mostly for @mattip to look at. I like the use of flags, although it may seem a bit overkill if we have just one. But I do hope to have a broadcast option (tentatively `n|1`). For now, I also kept the frozen dimensions, as those are quite easy.