ENH: Further expansion of gufunc signature to allow broadcasting #11179

Closed
wants to merge 18 commits into from


mhvk
Contributor

@mhvk mhvk commented May 28, 2018

Builds on #11175 to allow broadcasting in gufuncs (which is now surprisingly easy).

With this, @mattharrigan's all_equal (#8528) would become truly useful, as it also allows things like np.all_equal(ndarray, const, axis=?), i.e., one of the arguments can be broadcast over its core.

EDIT: has decent tests but does still need docs.

@eric-wieser
Member

To be clear, this only enables broadcasting when explicitly requested? I think doing it always would be harmful, and would let bugs happen silently.

@mhvk
Contributor Author

mhvk commented May 29, 2018

@eric-wieser - yes, by adding |1 to the dimension name or number -- clearly I should aim to add some docs! Example: for all_equal, the signature would become (n|1),(n|1)->() (as shown by the new example in _umath.tests.src - though the implementation there is a dumb one, unlike #8528).
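
To illustrate the `|1` suffix described above, here is a minimal Python sketch of how a shared core dimension might be resolved when each operand may supply either the full size or 1. The helper name `resolve_broadcast_dim` is hypothetical and is not part of NumPy or of this PR; it only applies the ordinary broadcasting rule to a single core dimension.

```python
def resolve_broadcast_dim(sizes):
    """Resolve the size of a core dimension written as ``n|1``.

    ``sizes`` holds the size each operand supplies for that dimension.
    Hypothetical helper for illustration only.
    """
    # Sizes other than 1 must all agree; a size of 1 is broadcast.
    non_unit = {size for size in sizes if size != 1}
    if len(non_unit) > 1:
        raise ValueError(f"core dimension sizes {sorted(non_unit)} do not match")
    # If every operand supplied 1, the dimension simply has size 1.
    return non_unit.pop() if non_unit else 1


print(resolve_broadcast_dim([5, 1]))  # an array of length 5 against a constant
```

Without the `|1` suffix, a mismatch such as 5 against 1 would be rejected, which is the explicit opt-in asked about above.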

@@ -138,22 +141,35 @@ Notes:
Each dimension name typically corresponds to one level of looping in the
elementary function's implementation.
#. White spaces are ignored.
#. An integer as a dimension name freezes that dimension to the value,
#. The name can be suffixed with a question mark, this make the dimension a
core dimension only if it exists on the input or output, otherwise 1 is used.
Member

Should be something like: otherwise a dimension of size 1 is inserted into the inputs and squeezed from the outputs.

Also, we should explain exactly when optional dimensions are used. The rule from matmul is that optional dimensions are used whenever there are enough dimensions for them to be well defined, but matmul doesn't have optional core dimensions that appear on multiple arguments.

For example, how should a signature like (n,m?),(n,m?)->() be interpreted if the two inputs are 1d and 2d? There are two reasonable interpretations I can think of:

  1. Using the signature (n),(n)->(), because not all arguments have enough dimensions.
  2. Raising an error, because the behavior is potentially ambiguous.

Contributor Author

O dear, what did I do - this discussion really belongs in @mattip's PR... Though in his PR I'm not exactly sure what the behaviour is...

Here, I do know: the moment any argument has too few dimensions, flexible dimensions are effectively removed altogether from the ufunc, i.e., it follows your option 1. I think this is most consistent with matmul, where, if one were to give an output, one would similarly expect that the dimension would just "not exist".

p.s. And apologies for not adding documentation on this beyond what @mattip had - do look at the tests - I have those for both ? (flexible) and |1 (broadcasting) - but still need to add some for numerical axes; so far, for those there are just tests that the signature gets parsed right.

(I had a separate PR where it was per argument, but it became clear to me that was just going to be a mess...)
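
A rough Python sketch of the rule described in this reply (option 1): once any supplied operand has too few dimensions, the flexible dimension is dropped from every operand in the signature. The function name `effective_signature` and the tuple-based representation are assumptions made for illustration. The real logic lives in the C ufunc machinery and may differ in detail, notably in which flexible dimension is dropped first when one operand carries several.

```python
def effective_signature(core_dims, op_ndims):
    """Return the core dimensions actually used for a call.

    core_dims: one tuple of dimension names per operand; a trailing
               "?" marks a flexible dimension, e.g. ("n", "m?").
    op_ndims:  number of dimensions of each operand, or None for an
               operand that was not supplied (such as an output).
    Hypothetical helper for illustration only.
    """
    missing = set()
    for dims, ndim in zip(core_dims, op_ndims):
        if ndim is None:
            continue
        for name in dims:
            # Stop as soon as the operand can hold its remaining core dims.
            if sum(d not in missing for d in dims) <= ndim:
                break
            if name.endswith("?"):
                missing.add(name)
        if sum(d not in missing for d in dims) > ndim:
            raise ValueError("operand has too few dimensions")
    # A dimension marked missing is removed from every operand.
    return [tuple(d.rstrip("?") for d in dims if d not in missing)
            for dims in core_dims]


# (n,m?),(n,m?)->() with a 1-d and a 2-d input
print(effective_signature([("n", "m?"), ("n", "m?"), ()], [1, 2, None]))
```

Under this reading, the 1-d and 2-d example from the review comment resolves to `(n),(n)->()`, and the matmul signature `(m?,n),(n,p?)->(m?,p?)` with a 1-d first operand resolves to `(n),(n,p)->(p)`.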

@mattip
Member

mattip commented May 29, 2018

I would like to see the enhanced core signature functionality broken into a number of PRs. It seems this should be the last one, and needs some discussion on the mailing list.

@mhvk
Contributor Author

mhvk commented May 29, 2018

Just pushed an update to the documentation explaining what the enhancements actually provide (given the rebase conflicts, the tests are currently meaningless, but off-line things pass).

@mhvk
Contributor Author

mhvk commented May 29, 2018

@mattip - completely agree. I pushed this here mostly so that you could see how it worked with flags, and that it was easy to add further features.

Do you have specific suggestions on order? Frozen sizes is by far the easiest, but I'm happy to start with ? as well, since we can then proceed with matmul. But perhaps for the ufunc struct, we can agree that we'll immediately add both core_dim_sizes and core_dim_flags?

@njsmith
Member

njsmith commented May 29, 2018

The nice thing about ? is that it has a clear use case (matmul) and has been discussed a fair amount on the list and elsewhere. For frozen dimensions and broadcasting core dimensions I'm much less comfortable that we understand the benefits and trade-offs of adding them.

@njsmith
Member

njsmith commented May 29, 2018

Actually if I was going to suggest adding one thing to gufuncs besides ?, it would be features to let us convert more of the popular numpy functions into gufuncs, like sort.

@mhvk
Contributor Author

mhvk commented May 30, 2018

@njsmith - there is actually a very clear use case for broadcasting, the all_equal gufunc written by @mattharrigan (#8528), which makes much more sense with allowing broadcasting of one of the arguments (especially in combination with the axis keyword of #11018), as it can then truly be the short-circuit, memory-saving equivalent of (a == b).all(axis=...)

What is your other suggestion for an extension?
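
For readers unfamiliar with the use case, a small sketch of the semantics being discussed. `all_equal_reference` is just the `(a == b).all(axis=...)` expression from the comment, written with existing NumPy calls. `all_equal_1d` is a hypothetical pure-Python stand-in for what a short-circuiting inner loop with a broadcastable core dimension could do. Neither is the implementation from #8528, and `np.all_equal` itself does not exist in NumPy.

```python
import numpy as np


def all_equal_reference(a, b, axis=-1):
    # Allocates the full boolean comparison array, then reduces it.
    return (np.asarray(a) == np.asarray(b)).all(axis=axis)


def all_equal_1d(a, b):
    """Short-circuiting comparison over one core dimension.

    Either argument may have length 1 and is then broadcast.
    Hypothetical helper for illustration only.
    """
    a = np.atleast_1d(a)
    b = np.atleast_1d(b)
    if a.ndim != 1 or b.ndim != 1:
        raise ValueError("this sketch handles a single core dimension only")
    n = max(a.shape[0], b.shape[0])
    for size in (a.shape[0], b.shape[0]):
        if size not in (1, n):
            raise ValueError("core dimensions do not match")
    # A step of 0 re-reads the single element, like a zero stride.
    step_a = 0 if a.shape[0] == 1 else 1
    step_b = 0 if b.shape[0] == 1 else 1
    for i in range(n):
        if a[i * step_a] != b[i * step_b]:
            return False  # stop at the first mismatch
    return True


data = np.array([[1, 1], [1, 2]])
print(all_equal_reference(data, 1, axis=-1))
print(all_equal_1d([1, 1, 1], 1))
```

The reference version always touches every element and needs a temporary array; the loop version needs neither, which is the memory-saving, short-circuit behaviour argued for above.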

@mhvk
Contributor Author

mhvk commented May 30, 2018

ps. For the frozen dimensions, I'd really like them - right now, I'm hacking the type-resolver to insist on dimensions of 3 in standards-of-astronomy code I'm wrapping in ufuncs; see https://github.com/mhvk/astropy/blob/88312819cdbf78605dfe5b9eb4c8245cae8bb887/astropy/_erfa/ufunc.c.templ#L587 (and see the code above for other hacks I have to do to make the machinery work...)
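
A minimal sketch of what a frozen dimension would enforce, assuming a signature along the lines of `(3),(3)->(3)`. The wrapper `cross1d_checked` is hypothetical and performs the size check by hand in Python before delegating to `np.cross`; with frozen dimensions in the signature, the ufunc machinery would make this check itself.

```python
import numpy as np


def cross1d_checked(a, b):
    """Cross product that insists on a core dimension of exactly 3.

    Hypothetical wrapper for illustration only.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    for name, arr in (("a", a), ("b", b)):
        # The last axis plays the role of the frozen core dimension.
        if arr.ndim < 1 or arr.shape[-1] != 3:
            raise ValueError(
                f"{name} must have a last dimension of size 3, "
                f"got shape {arr.shape}")
    # Leading (loop) dimensions broadcast as usual.
    return np.cross(a, b)


print(cross1d_checked([1, 0, 0], [0, 1, 0]))
```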

@mhvk mhvk force-pushed the gufunc-allow-broadcasting branch from a108558 to 3569304 Compare May 30, 2018 17:53
@mhvk mhvk changed the title WIP, ENH: Further expansion of gufunc signature to allow broadcasting ENH: Further expansion of gufunc signature to allow broadcasting May 30, 2018
@mhvk mhvk force-pushed the gufunc-allow-broadcasting branch 5 times, most recently from 131f71c to 1ad1a66 Compare June 2, 2018 19:19
@mhvk mhvk force-pushed the gufunc-allow-broadcasting branch from 1ad1a66 to 244682a Compare June 10, 2018 17:21
jaimefrio and others added 7 commits June 11, 2018 15:09
So that we can keep track of expansions of the PyUFuncObject.
The goal is to allow signatures like (m?,n),(n,p?)->(m?,p?) for matmul.
This allows future expansion and is arguably easier to follow.

Also adds tests, of the signature parsing, of the ignoring of
dimensions, and of use of the frozen dimensions via cross1d.
It got very long and now matches logically the later call to get
core sizes.
…e equal.

Was comparison by the string representation so that "03" and "3" would
be considered different dimensions.  That would be surprising for implementing
an inner loop...
mhvk and others added 11 commits June 11, 2018 15:10
Previously, when a core dimension at the end was missing, the strides
were not re-ordered accordingly, and the inner loop code needed to
check whether the corresponding dimension was zero. But if the strides
are passed on correctly, this is not needed at all, and the dimension
can just be unity (not indicating broadcasting this time).
That is, parts like core_num_dims which can get changed during
execution, e.g., because keepdims is set or a dimension is flexible.
Now op_core_num_dims to make clear it is the number of core dimensions
an operand has, rather than that assumed by the ufunc.
But only test the non-broadcast version since that's all that will
work here.
@mhvk mhvk force-pushed the gufunc-allow-broadcasting branch from 244682a to e7f3139 Compare June 11, 2018 19:11
@mhvk
Contributor Author

mhvk commented Jul 17, 2018

Closing since the discussion of the NEP concluded that we would defer on adding broadcasting.

@mhvk mhvk closed this Jul 17, 2018
@mhvk mhvk deleted the gufunc-allow-broadcasting branch December 21, 2018 02:06
6 participants