ENH: Further expansion of gufunc signature to allow broadcasting #11179
To be clear, this only enables broadcasting when explicitly requested? I think doing it always would be harmful, and would let bugs happen silently.
@eric-wieser - yes, by adding |
@@ -138,22 +141,35 @@ Notes:
Each dimension name typically corresponds to one level of looping in the
elementary function's implementation.
#. White spaces are ignored.
#. An integer as a dimension name freezes that dimension to the value,
#. The name can be suffixed with a question mark, this make the dimension a
core dimension only if it exists on the input or output, otherwise 1 is used.
Should be something like: otherwise a dimension of size 1 is inserted into the inputs and squeezed from the outputs.
Also, we should explain exactly when optional dimensions are used. The rule from matmul is that optional dimensions are used whenever there are enough dimensions for them to be well defined, but matmul doesn't have optional core dimensions that appear on multiple arguments.
For example, how should a signature like (n,m?),(n,m?)->() be interpreted if the two inputs are 1d and 2d? There are two reasonable interpretations I can think of:
- Using the signature (n),(n)->(), because not all arguments have enough dimensions.
- Raising an error, because the behavior is potentially ambiguous.
O dear, what did I do - this discussion really belongs in @mattip's PR... Though in his PR I'm not exactly sure what the behaviour is...
Here, I do know: the moment any argument has too few dimensions, flexible dimensions are effectively removed altogether from the ufunc, i.e., it follows your option 1. I think this is most consistent with matmul, where, if one were to give an output, one would similarly expect that the dimension would just "not exist".
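One possible reading of "option 1" can be sketched in plain Python. This is only an illustration under assumptions, not the PR's C implementation: the helper name `resolve_core_dims` is hypothetical, it handles inputs only, and it assumes that a flexible dimension is dropped from every operand as soon as one operand lacks the dimensions to hold it.

```python
def resolve_core_dims(core_dims, ndims):
    """Return the effective core dimensions for each input operand.

    core_dims: one tuple of dimension names per operand, e.g. ("n", "m?").
    ndims:     number of dimensions each actual input array has.
    Hypothetical helper; names ending in "?" are treated as flexible.
    """
    missing = set()
    for dims, nd in zip(core_dims, ndims):
        # Core dimensions this operand still needs, ignoring dropped ones.
        needed = sum(1 for d in dims if d not in missing)
        for d in dims:
            if needed <= nd:
                break
            if d.endswith("?") and d not in missing:
                missing.add(d)  # drop this flexible dimension everywhere
                needed -= 1
        if needed > nd:
            raise ValueError("operand has too few dimensions")
    return [
        tuple(d.rstrip("?") for d in dims if d not in missing)
        for dims in core_dims
    ]


# 1d and 2d inputs with (n,m?),(n,m?): m is dropped from both operands.
print(resolve_core_dims([("n", "m?"), ("n", "m?")], [1, 2]))
```

With the matmul-style signature (m?,n),(n,p?) and a 1d first operand, the same sketch drops only m and keeps p, which matches how matmul treats a vector on the left.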
p.s. And apologies for not adding documentation on this beyond what @mattip had - do look at the tests - I have those for both ? (flexible) and |1 (broadcasting) - but still need to add some for numerical axes; so far, for those there are just tests that the signature gets parsed right.
(I had a separate PR where it was per argument, but it became clear to me that was just going to be a mess...)
I would like to see the enhanced core signature functionality broken into a number of PRs. It seems this should be the last one, and needs some discussion on the mailing list.
Just pushed an update to the documentation explaining what the enhancements actually provide (given the rebase conflicts, the tests are currently meaningless, but off-line things pass).
@mattip - completely agree. I pushed this here mostly so that you could see how it worked with flags, and that it was easy to add further features. Do you have specific suggestions on order? Frozen sizes is by far the easiest, but I'm happy to start with |
The nice thing about |
Actually if I was going to suggest adding one thing to gufuncs besides |
@njsmith - there is actually a very clear use case for broadcasting, the What is your other suggestion for an extension? |
p.s. For the frozen dimensions, I'd really like them - right now, I'm hacking the type-resolver to insist on dimensions of 3 in standards-of-astronomy code I'm wrapping in ufuncs; see https://github.com/mhvk/astropy/blob/88312819cdbf78605dfe5b9eb4c8245cae8bb887/astropy/_erfa/ufunc.c.templ#L587 (and see the code above for other hacks I have to do to make the machinery work...)
So that we can keep track of expansions of the PyUFuncObject.
Goal is to allow signatures like (m?,n),(n,p?)->(m?,p?) for matmul.
This allows future expansion and is arguably easier to follow. Also adds tests of the signature parsing, of the ignoring of dimensions, and of the use of frozen dimensions via cross1d.
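As a rough illustration of what such a signature is meant to capture, the sketch below records the result shapes that np.matmul produces for 1-d and 2-d operands. The array sizes (m=2, n=3, p=4) are arbitrary choices for this example.

```python
import numpy as np

a2 = np.ones((2, 3))  # (m, n)
b2 = np.ones((3, 4))  # (n, p)
a1 = np.ones(3)       # (n,): the optional m is absent
b1 = np.ones(3)       # (n,): the optional p is absent

shapes = {
    "2d @ 2d": np.matmul(a2, b2).shape,  # both m and p present
    "1d @ 2d": np.matmul(a1, b2).shape,  # m missing, so it is absent from the output
    "2d @ 1d": np.matmul(a2, b1).shape,  # p missing, so it is absent from the output
    "1d @ 1d": np.matmul(a1, b1).shape,  # both missing: scalar result
}

for case, shape in shapes.items():
    print(case, shape)
```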
It got very long and now matches logically the later call to get core sizes.
…e equal. The comparison was by string representation, so that "03" and "3" would be considered different dimensions. That would be surprising for implementing an inner loop...
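The difference between the two comparisons can be shown with a small Python sketch. The real parser is written in C; the helper name `same_dimension` here is hypothetical and only illustrates the idea of comparing frozen sizes by value.

```python
def same_dimension(a, b):
    """Compare two dimension names from a signature (hypothetical helper).

    Purely numeric names are frozen sizes and are compared by integer
    value; all other names are compared as strings.
    """
    if a.isdigit() and b.isdigit():
        return int(a) == int(b)
    return a == b


print("03" == "3")                 # plain string comparison
print(same_dimension("03", "3"))   # comparison by integer value
```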
Previously, when a core dimension at the end was missing, the strides were not re-ordered accordingly, and the inner loop code needed to check whether the corresponding dimension was zero. But if the strides are passed on correctly, this is not needed at all, and the dimension can just be unity (not indicating broadcasting this time).
That is, parts like core_num_dims which can get changed during execution, e.g., because keepdims is set or a dimension is flexible.
Now op_core_num_dims to make clear it is the number of core dimensions an operand has, rather than that assumed by the ufunc.
But only test the non-broadcast version since that's all that will work here.
Closing since the discussion of the NEP concluded that we would defer on adding broadcasting.
Builds on #11175 to include allowing broadcasting in gufuncs (which now is surprisingly easy). With this, @mattharrigan's all_equal (#8528) would become truly useful, as it also allows things like np.all_equal(ndarray, const, axis=?), i.e., one of the arguments can be broadcast over its core.

EDIT: has decent tests but does still need docs.
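To make the motivation concrete, here is a minimal sketch of the limitation. It assumes a stand-in for all_equal built with np.vectorize and the signature (n),(n)->(); all_equal is not part of NumPy, and the stand-in is only meant to mimic how a non-broadcasting core dimension behaves.

```python
import numpy as np

# Hypothetical stand-in for the proposed all_equal gufunc (not part of NumPy).
all_equal = np.vectorize(
    lambda x, y: bool(np.all(x == y)), signature="(n),(n)->()"
)

a = np.array([[1, 1, 1], [1, 2, 1]])
const = 1

# A plain (n),(n)->() signature does not broadcast over the core dimension,
# so a scalar second argument is rejected.
try:
    all_equal(a, const)
    scalar_rejected = False
except ValueError:
    scalar_rejected = True

# Workaround today: broadcast the constant over the core dimension by hand.
result = all_equal(a, np.broadcast_to(const, a.shape[-1:]))
print(scalar_rejected, result)
```

Broadcasting over the core, as proposed here, would make the manual np.broadcast_to step unnecessary.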