MAINT: Revert boolean casting back to elementwise comparisons in trim_zeros #17058

Merged: mattip merged 4 commits into numpy:master on Aug 19, 2020

Member

@BvB93 BvB93 commented Aug 11, 2020

Followup on #16911:

Replaces the boolean casting added in the aforementioned pull request with an (elementwise) comparison to 0.
This has two main consequences:

  1. It further speeds up calls to trim_zeros(). Note that the performance gains seem to vary from system to system (see three comments on ENH: Speed up trim_zeros #16911).
  2. It ensures that the function still works on ndarray subclasses which do not support direct boolean casting such as astropy.Quantity (Test failure with Numpy dev astropy/astropy#10638 & ENH: Speed up trim_zeros #16911 (comment)). Note that the original spirit of the function is retained: finding and trimming leading/trailing 0s.
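The approach described in the two points above can be sketched roughly as follows. This is a minimal illustration, not the merged NumPy code: the name `trim_zeros_sketch` and the use of `np.flatnonzero` to locate the first and last nonzero entries are assumptions made for this example.

```python
import numpy as np


def trim_zeros_sketch(filt, trim="fb"):
    """Trim leading/trailing zeros from a 1-D sequence (illustrative only)."""
    arr_any = np.asanyarray(filt)
    # Compare elementwise with 0 instead of casting to bool;
    # a bool array is used as-is, since it already is the mask.
    nonzero = arr_any != 0 if arr_any.dtype != bool else arr_any
    idx = np.flatnonzero(nonzero)
    if idx.size == 0:
        # Every element is zero: return an empty slice of the input.
        return filt[:0]
    trim = trim.lower()
    start = idx[0] if "f" in trim else 0
    stop = idx[-1] + 1 if "b" in trim else len(filt)
    return filt[start:stop]


print(trim_zeros_sketch([0, 0, 1, 2, 0]))  # [1, 2]
```

Because the mask comes from `!=` rather than `astype(bool)`, the sketch only requires that the input support elementwise comparison with 0.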

@BvB93 BvB93 added this to the 1.20.0 release milestone Aug 11, 2020
@BvB93 BvB93 requested a review from mhvk August 11, 2020 14:30

if arr.ndim != 1:
arr_any = np.asanyarray(filt)
arr = arr_any != 0 if arr_any.dtype != bool else arr_any
Member Author

Prevents converting a bool array into another identical bool array.

arr_any = np.asanyarray(filt)
arr = arr_any != 0 if arr_any.dtype != bool else arr_any

if arr is False:
Member Author

@BvB93 BvB93 Aug 11, 2020

Certain data types do not support elementwise comparisons with 0.
Note that the intersection between the aforementioned dtypes and those that cannot be cast into boolean arrays is pretty large. In this sense little changes with respect to the previous pull request: str, bytes & structured arrays still aren't supported.

@BvB93 BvB93 mentioned this pull request Aug 11, 2020
Contributor

mhvk commented Aug 11, 2020

Looks good to me, but probably best for the original reviewers to make the call.

@mhvk mhvk requested a review from mattip August 11, 2020 14:49
@BvB93 BvB93 requested a review from eric-wieser August 11, 2020 14:50
Member

eric-wieser commented Aug 11, 2020

I'm not sure if we should make this change, as it makes the definition of zero in count_nonzero (of which there are already two conflicting definitions, xref gh-9873) different from that in trim_zeros. For reference, the count_nonzero implementation is:

    # TODO: this works around .astype(bool) not working properly (gh-9847)
    if np.issubdtype(a.dtype, np.character):
        a_bool = a != a.dtype.type()
    else:
        a_bool = a.astype(np.bool_, copy=False)

This manifests as actual differences for object arrays with elements where __bool__() and __eq__(0) are different.
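To make the difference concrete, here is a small plain-Python sketch of the two per-element definitions of "zero" that an object array would apply. The helper names and the sample values are chosen purely for illustration.

```python
# Values one might find in an object array (chosen for illustration).
candidates = [0, 0.0, None, "", [], 1]


def is_zero_by_cast(value):
    """Zero in the astype(bool) sense: the value is falsy."""
    return not bool(value)


def is_zero_by_comparison(value):
    """Zero in the elementwise-comparison sense: the value equals 0."""
    return bool(value == 0)


# Elements on which the two definitions disagree.
disagreements = [
    v for v in candidates
    if is_zero_by_cast(v) != is_zero_by_comparison(v)
]
print(disagreements)  # [None, '', []]
```

Falsy values that do not compare equal to 0 are "zero" under boolean casting but "nonzero" under comparison.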

I'm pretty sure an arg_trim function was proposed somewhere or other, which would let this be written y = y[arg_trim(y == 0)], and let the user choose exactly what they want to trim by.
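A rough idea of what such a helper could look like is sketched below. `arg_trim` is hypothetical: only the name and the call pattern `y[arg_trim(y == 0)]` come from the comment above, while the signature, the `trim` parameter and the returned `slice` are assumptions for this example.

```python
def arg_trim(mask, trim="fb"):
    """Return a slice that skips leading/trailing True entries of `mask`.

    Hypothetical helper: the name comes from the comment above, the
    signature and behaviour are assumptions made for this sketch.
    """
    mask = list(mask)
    start, stop = 0, len(mask)
    trim = trim.lower()
    if "f" in trim:
        # Advance past the leading run of True entries.
        while start < stop and mask[start]:
            start += 1
    if "b" in trim:
        # Step back past the trailing run of True entries.
        while stop > start and mask[stop - 1]:
            stop -= 1
    return slice(start, stop)


y = [0, 0, 3, 0, 4, 0]
print(y[arg_trim([v == 0 for v in y])])  # [3, 0, 4]
```

With this shape the caller decides what counts as trimmable by building the mask, so either definition of "zero" can be used without changing the helper.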

@@ -1631,7 +1631,7 @@ def trim_zeros(filt, trim='fb'):
 # Numpy 1.20.0, 2020-07-31
 warning = DeprecationWarning(
     "in the future trim_zeros will require a 1-D array as input "
-    "that is compatible with ndarray.astype(bool)"
+    "that supports elementwise comparisons with zero"
Member

We should move this deprecation to after the old implementation. Let's not worry too much about a possibly changed exception type (although if you like, that would be fine). But None == 0 always failed, and should fail roughly identically. So we should not give a DeprecationWarning when a correct error is given!

I do think we should add at least one test for the == 0 behaviour, and compare it to the old behaviour.

Member Author

The warning has been moved as of f977f57.

Can you clarify a bit more about the None == 0? Are you talking about filt=None or filt=[None, ...] here?
The former case would always raise an exception (and still does) while the latter would pass through trim_zeros() without any trimming (as None != 0).
If we go ahead with this pull request then the behavior of the latter will also remain unchanged (though a warning will be issued due to the creation of an object array without explicitly specifying the object dtype).

Member

Yes, I was talking about the behaviour of the latter. It creates an object array now, I do not think it gives a warning? But even if it does, I would be happy with that.

Member Author

It creates an object array now, I do not think it gives a warning?

You're right, I was confused with the deprecation of ragged arrays.

I think the current behavior of not issuing a warning should be fine,
as object arrays will pass through trim_zeros() just as they always have:
leading / trailing elements will be trimmed if they are equivalent to 0 and they'll be ignored otherwise.

In any case, I just added a few tests related to object arrays in 26734ef.
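The object-array behavior described in this comment can be illustrated by building the comparison mask by hand. This is only a sketch of the comparison semantics, not a copy of the tests added in 26734ef; the sample values are made up.

```python
import numpy as np

# An object array mixing zeros with values that are merely falsy.
arr = np.array([0, None, 1, "", 0], dtype=object)

# Elementwise comparison: only entries equal to 0 count as zeros.
nonzero = arr != 0
idx = np.flatnonzero(nonzero)

# Keep everything between the first and last nonzero entry.
trimmed = arr[idx[0]:idx[-1] + 1]
print(trimmed.tolist())  # [None, 1, '']
```

The leading and trailing 0 are trimmed, while None and the empty string are kept because neither compares equal to 0.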

Member

@mattip mattip left a comment

We should either merge this or back out #16911.

@BvB93 BvB93 changed the title from "MAINT: Replace boolean casting with elementwise comparisons in trim_zeros" to "MAINT: Revert boolean casting back to elementwise comparisons in trim_zeros" Aug 14, 2020
@mattip mattip merged commit 05a88ad into numpy:master Aug 19, 2020
Member

mattip commented Aug 19, 2020

Thanks @BvB93
