ENH: Generalized ufunc signature expansion for frozen and flexible dimensions #11175

Merged
merged 32 commits into from
Oct 19, 2018

Conversation

mhvk
Contributor

@mhvk mhvk commented May 28, 2018

EDIT: now the replacement of #11132

An alternative to #11132, much more following its logic, but (I think) clearer. Mostly for @mattip to look at. I like the use of flags, although it may seem a bit overkill if we have just one. But I do hope to have a broadcast option (tentatively n|1). For now, I also kept the frozen dimensions, as those are quite easy.

@charris
Member

charris commented May 29, 2018

Notifying @shoyer and @mrocklin for these discussions, as they are looking to expand on the array_ufunc idea and this is in the same general area.

@mhvk
Contributor Author

mhvk commented May 29, 2018

To get a full sense of what the ideas are, have a look at #11179 (just to be sure, all based on @mattip's work). The test cases may right now provide the best documentation...

@charris
Member

charris commented May 29, 2018

Should have @njsmith here also.

@mattip
Member

mattip commented May 29, 2018

What is the way forward? Does this replace #11132 or build upon it?

@mattip
Member

mattip commented May 29, 2018

After reading the comment in #11132 it seems you want to first implement fixed core dimensions and then add flexible ones?

Edit: it seems to make sense to add them both together, since the changes are not really orthogonal. The fixed dimension needs tests for the cross1d function added to _umath_tests; the ones from the comment (#5015 (comment)) in the original PR are a start.

@mhvk
Contributor Author

mhvk commented May 29, 2018

Ah, partly an answer to my question in #11132 - yes, starting with ? and fixed sizes is fine too - #11132 was just to show how easy it was to add broadcasting.

@mattip
Member

mattip commented May 29, 2018

@mhvk could you rebase this off master, or do you want me to do it? Also notice there are changes to umath/scalar.c.src which add matrix_multiply. They belong in the final matmul pr #11133 and should be removed from this PR

@mhvk
Contributor Author

mhvk commented May 29, 2018

I'm in the process of rebasing - I'll remove the matmul specific commit.

@mhvk mhvk force-pushed the gufunc-signature-modification2 branch from ceffc80 to 517fd34 Compare May 30, 2018 00:20
@mhvk mhvk changed the title from "WIP: Another alternative ufunc signature expansion for flexible dimensions" to "ENH: Generalized ufunc signature expansion for flexible dimensions" May 30, 2018
@mhvk mhvk force-pushed the gufunc-signature-modification2 branch from 517fd34 to 4826b24 Compare May 30, 2018 00:56
@mhvk
Contributor Author

mhvk commented May 30, 2018

OK, I rebased this and think it is now ready for review. In rebasing, I used @jaimefrio's original commit for frozen dimensions, so that it is still attributed properly. I similarly used @mattip's commits, but squashed to 2. Note that I rebased on top of #11173 and #11176, to avoid having another difficult rebase after those have been merged.

One note: the doc changes are just @mattip's - I've made more elaborate changes in the broadcasting follow-up, but hoped not to have to split the relevant parts out.

@mhvk mhvk changed the title from "ENH: Generalized ufunc signature expansion for flexible dimensions" to "ENH: Generalized ufunc signature expansion for frozen and flexible dimensions" May 30, 2018
@mhvk mhvk force-pushed the gufunc-signature-modification2 branch 2 times, most recently from b71ef62 to d4c6396 Compare May 30, 2018 15:42
@mhvk
Contributor Author

mhvk commented May 30, 2018

Hah, the failing test from 32 bit exposed an interesting bug: previously, @mattip had implemented the flexible dimensions such that the elementary function had to check whether a dimension was zero, and then swap strides. But with matrix-multiply, this changed the behaviour of one of the test cases, in which it is called with matrix_multiply(np.ones((0, 10)), np.ones((10, 0))), i.e., with real zeros for the dimensions.

So, this was clearly fragile. But fortunately also not needed: by just passing on the strides in the correct order, this is solved and, as one would like, the elementary function doesn't have to know about whether flexible dimensions are being used. Much nicer!

A nice side benefit is that matmul now becomes even more trivial to implement, as its code does not have to change at all: it just needs wrapping as a gufunc.
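
To illustrate the distinction between a missing flexible dimension and a dimension that really has size zero, here is a minimal pure-Python sketch. It is only a model, not the C code in this PR: the function name `resolve_matmul_core_shapes` is hypothetical, it ignores loop (broadcast) dimensions, and it assumes the matmul-style signature `(n?,k),(k,m?)->(n?,m?)`.

```python
def resolve_matmul_core_shapes(shape_a, shape_b):
    """Model of "(n?,k),(k,m?)->(n?,m?)" for operands without loop dimensions.

    A flexible dimension counts as missing only when the operand has too few
    axes; its size (even zero) plays no role in that decision.
    """
    if len(shape_a) not in (1, 2) or len(shape_b) not in (1, 2):
        raise ValueError("this sketch handles 1-d and 2-d operands only")
    has_n = len(shape_a) == 2   # is the flexible dimension n present?
    has_m = len(shape_b) == 2   # is the flexible dimension m present?
    k_a = shape_a[-1]           # k is always the last axis of the first operand
    k_b = shape_b[0]            # ... and the first axis of the second operand
    if k_a != k_b:
        raise ValueError(f"core dimension k mismatch: {k_a} != {k_b}")
    out = ()
    if has_n:
        out += (shape_a[0],)
    if has_m:
        out += (shape_b[1],)
    return out


print(resolve_matmul_core_shapes((0, 10), (10, 0)))  # (0, 0): real zeros, both present
print(resolve_matmul_core_shapes((10,), (10, 0)))    # (0,): n is missing, m is a real zero
```

Because presence is decided from the number of axes rather than from a zero size, the inner loop in such a design never needs to special-case zero.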

@@ -2199,10 +2362,10 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,

/* Use remapped axes for generalized ufunc */
int broadcast_ndim, iter_ndim;
- int core_num_dims_array[NPY_MAXARGS];
- int *core_num_dims;
+ int core_num_dims[NPY_MAXARGS];
Member

As per discussion in PR #11176, this should be renamed

Contributor Author

Need to think a bit about the logic of the names, since we're copying and adjusting two arrays: core_num_dims and core_dim_sizes. In both cases, the copies represent the actual number of dimensions that the operands have and the sizes that they imply. So, long versions could be
actual_core_num_dims and actual_core_dim_sizes - or remove _core.

Alternatively, I have been wondering whether it would make sense to have a mini-struct actual that had whatever parts of the ufunc would need changing (could be expanded to include core_dim_ixs for "calculated" output indices, e.g.). So that would mean actual->core_num_dims, actual->core_dim_sizes. But this could be done in a separate PR as well.

Contributor Author

p.s. I'm also not so happy that as is, core_num_dims gets copied even if in standard usage it never gets adjusted. But I guess it is only a few numbers, so very little overhead...

@@ -2538,7 +2715,7 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,
*/
core_dim_ixs_size = 0;
for (i = 0; i < nop; ++i) {
- core_dim_ixs_size += core_num_dims[i];
+ core_dim_ixs_size += ufunc->core_num_dims[i];
Member

Should this be precalculated and stored as part of the ufunc struct?

Contributor Author

I wondered about that, but it seems a bit excessive. Another thing one might do is to store it at the end of core_offsets, so that it really equals cumsum(core_num_dims).

Contributor Author

p.s. If we want to go this route, I do think this should be a different PR -- as long as we don't have a release, we are free to change the struct even without a version number change. And my overall sense is that it is rarely really needed: even now, one can just do

core_dim_ixs_size = ufunc->core_offsets[ufunc->nargs - 1] + ufunc->core_num_dims[ufunc->nargs - 1]
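
The relation between the two arrays can be checked with a small Python sketch. It assumes, as the expression above implies, that `core_offsets[i]` is the position at which operand `i`'s core dimension indices start; the helper names are hypothetical and the sketch requires at least one operand.

```python
from itertools import accumulate


def core_offsets(core_num_dims):
    """Start index of each operand's core dimensions in the flat index list."""
    # Exclusive cumulative sum: [0, n0, n0 + n1, ...]
    return [0] + list(accumulate(core_num_dims))[:-1]


def core_dim_ixs_size(core_num_dims):
    """Total number of core dimension indices over all operands."""
    offsets = core_offsets(core_num_dims)
    # Same arithmetic as the C one-liner: last offset plus last count.
    return offsets[-1] + core_num_dims[-1]


# "(n,k),(k,m)->(n,m)" has two core dimensions per operand.
num_dims = [2, 2, 2]
assert core_offsets(num_dims) == [0, 2, 4]
assert core_dim_ixs_size(num_dims) == sum(num_dims)
```

Appending the total to `core_offsets`, as suggested above, would simply add `sum(core_num_dims)` as one more entry at the end of that list.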

@mattip
Member

mattip commented May 30, 2018

tests for fixed dim signature still missing? The cross1d function is never used in tests AFAICT.

@mhvk mhvk force-pushed the gufunc-signature-modification2 branch from d4c6396 to a4d86ae Compare May 30, 2018 19:41
@mhvk
Contributor Author

mhvk commented May 30, 2018

Rebased to get rid of conflicts, and now including tests of cross1d.
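
For readers unfamiliar with frozen dimensions, the behaviour such tests exercise can be modelled in pure Python. This is a sketch only, assuming the signature `(3),(3)->(3)` for cross1d; it handles a single pair of vectors, does no broadcasting, and is not the C implementation in _umath_tests.

```python
def cross1d(a, b):
    """Pure-Python model of a gufunc with signature "(3),(3)->(3)"."""
    for name, vec in (("a", a), ("b", b)):
        if len(vec) != 3:
            # A frozen dimension must match the size written in the signature.
            raise ValueError(
                f"operand {name!r} has core dimension size {len(vec)}, "
                "but the signature freezes it to 3")
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


print(cross1d([1, 0, 0], [0, 1, 0]))  # [0, 0, 1]
```

A test for the fixed-size case would then cover both a valid call and a call with a core dimension of the wrong size, which should raise.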

@mattip
Member

mattip commented May 30, 2018

@charris, @njsmith, Do you have a sense of how controversial allowing flexible/fixed gufunc core signatures is? Can we go ahead and merge this new feature soonish or does it need another round on the mailing list? Note this is the enabler for changing matmul PR #11133, which needs more work

If it is good to go, it should get an entry in Improvements in doc/release/1.15.0-notes.rst

@mattip
Member

mattip commented Oct 10, 2018

reformatted, reworked flag, removed version from struct and docs.

@mhvk
Contributor Author

mhvk commented Oct 10, 2018

Piping in from what has become the sidelines: I actually chose the flag with some care, in that if a flag is set, it implies the code needs to do extra work. And a constant size is the obviously simpler case ;-) If the double negative is a problem, it could be UFUNC_CORE_DIM_SIZE_FREE.

Though I will add that the logic became truly obvious only when I added the broadcasting, when I also had UFUNC_CORE_DIM_CAN_BROADCAST.

@mattip
Member

mattip commented Oct 11, 2018

I have no particular preference. Note the flag is never read in the code itself, only in tests. Perhaps removing it is another way to resolve the discussion, until the broadcast proposal arises again.

Edit: the check can be rewritten to use ufunc->core_dim_sizes[ix] >= 0 instead of the flag

@mhvk
Contributor Author

mhvk commented Oct 11, 2018

@mattip - I think I'd prefer to keep the flag since there is the other one for matmul as well. I also prefer to keep the sense that flag set means work - how about UFUNC_CORE_DIM_FLEXIBLE_SIZE or ...SIZE_FREE - it suggests that the underlying code can handle a non-fixed size.

@eric-wieser
Member

I also prefer to keep the sense that flag set means work

That's perhaps a reasonable interpretation.

I don't think it matters too much - the only reason I mention it is this is becoming our public API, so once we choose we're stuck with it.

@eric-wieser
Member

How about UFUNC_CORE_DIM_SIZE_INFERRED, which does not form a double negative, and indicates work in the way @mhvk mentions?

@mattip mattip force-pushed the gufunc-signature-modification2 branch from d60316d to 7ef2b3a Compare October 11, 2018 14:19
@mattip
Member

mattip commented Oct 11, 2018

Updated to UFUNC_CORE_DIM_SIZE_INFERRED, and enhanced documentation to cross-reference frozen, learning about arbitrary anchors in rst in the process

.. c:member:: npy_intp *PyUFuncObject.core_dim_sizes

For each distinct core dimension, the possible
:ref:`frozen <frozen>` size (``-1`` if not frozen)
Member

Why are we using both -1 and UFUNC_CORE_DIM_SIZE_INFERRED to indicate this?

Member

@mhvk ?

Contributor Author

I think that's a leftover from the initial work by Jaime. I think it is fine to remove the guarantee that it is -1 if not frozen, as indeed the flag already indicates that (and perhaps we want to use negative numbers to indicate something else in the future).

Member

reworded
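
The relationship between the per-dimension flag and the stored size in this thread can be sketched in Python. This is a toy parser, not NumPy's C signature parser: the function name is hypothetical, the bit values are assumed for illustration, `CORE_DIM_CAN_IGNORE` is a made-up name for the `?` flag, and `-1` is used only as a placeholder for an unfrozen size.

```python
import re

# Illustrative bit values; the numbers are assumptions of this sketch.
CORE_DIM_CAN_IGNORE = 0x2      # hypothetical name for the "?" (flexible) flag
CORE_DIM_SIZE_INFERRED = 0x4   # stands in for UFUNC_CORE_DIM_SIZE_INFERRED


def parse_core_dims(signature):
    """Return {dim_name: (size, flags)} for each distinct core dimension."""
    dims = {}
    for group in re.findall(r"\(([^)]*)\)", signature):
        for token in filter(None, (t.strip() for t in group.split(","))):
            name = token.rstrip("?")
            flags = CORE_DIM_CAN_IGNORE if token.endswith("?") else 0
            if name.isdigit():
                size = int(name)   # frozen: size fixed by the signature
            else:
                size = -1          # placeholder; the flag is authoritative
                flags |= CORE_DIM_SIZE_INFERRED
            dims.setdefault(name, (size, flags))
    return dims


# In this model the flag test and the "size >= 0" test always agree.
for name, (size, flags) in parse_core_dims("(n?,k),(3)->(n?)").items():
    frozen_by_flag = not flags & CORE_DIM_SIZE_INFERRED
    frozen_by_size = size >= 0
    assert frozen_by_flag == frozen_by_size
```

Relying on the flag alone, as agreed above, leaves negative sizes free to mean something else later.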

@@ -43,3 +43,5 @@
# PyArray_SetWritebackIfCopyBase and deprecated PyArray_SetUpdateIfCopyBase.
0x0000000c = a1bc756c5782853ec2e3616cf66869d8

# Version 13 (Numpy 1.16) Size of PyUFuncObject changed
Member

Again, this should mention the new struct member names

Member

fixed


- Never declare a non-pointer instance of the struct
- Never perform pointer arithmatic
- Never use ``sizof(PyUFuncObject)``
Member

typo

Member

fixed

of NumPy. To ensure compatibility:

- Never declare a non-pointer instance of the struct
- Never perform pointer arithmatic
Member

typo: arithmetic

Member

fixed

@mattip mattip force-pushed the gufunc-signature-modification2 branch from 6ac20e2 to 1205e19 Compare October 12, 2018 07:17
@mattip
Member

mattip commented Oct 13, 2018

Fixed documentation review issues.

@mattip
Member

mattip commented Oct 15, 2018

any more comments or should I squash this down?

Contributor Author

@mhvk mhvk left a comment

Only trivia.

@@ -698,7 +715,7 @@ PyUFunc_Type
PyUFuncGenericFunction *functions;
void **data;
int ntypes;
- int reserved1;
+ int version;
Contributor Author

I thought we were leaving this as reserved1 (at least that is what is described below).

Member

reverted

* Convert a string into a number
*/
static npy_longlong
_get_size(const char* str)
Contributor Author

Seems like the signature is still npy_int rather than npy_intp?

@@ -2429,72 +2609,41 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,
}
}
/*
- * If keepdims is set and true, signal all dimensions will be the same.
+ * If keepdims is set and true, which means all input dimensions are
Contributor Author

Either "which" -> "this", or better, replace the period with a comma (and lower-case "signal")

Member

changed

@mattip
Member

mattip commented Oct 16, 2018

fixed comments from @mhvk

@mattip
Member

mattip commented Oct 16, 2018

@mhvk, @eric-wieser anything I can do to move this forward?

@mhvk
Contributor Author

mhvk commented Oct 17, 2018

@mattip - you fixed my trivial comments, so I'm happy; thanks for carrying this on, after me hijacking it.

@mattip mattip merged commit a2fb23a into numpy:master Oct 19, 2018
@mattip
Member

mattip commented Oct 19, 2018

Thanks @mhvk, @eric-wieser for the patience. After consulting with core developers in the weekly status meeting, I merged this even though by now I am probably considered more a contributor than a reviewer.

@mhvk
Contributor Author

mhvk commented Oct 19, 2018

@mattip - thank you for first starting, then commenting, and finally shepherding this on! And thanks, @eric-wieser, for the as always very useful comments/critique!

I do hope to get back to working on speeding up the ufuncs... Though perhaps playing with __array_function__ first.

@mhvk mhvk deleted the gufunc-signature-modification2 branch December 21, 2018 02:02
@mhvk mhvk restored the gufunc-signature-modification2 branch December 21, 2018 02:06