Ignore np.nan values in Normalize.autoscale() #28406

Open. Wants to merge 1 commit into base: main.

Conversation

epistoteles

PR summary

When using colormaps to map scalars to colors, the values must first be normalized into the range [0, 1]. The Normalize class offers this functionality and includes the method autoscale, which automatically determines sensible values for vmin and vmax. However, in its current state, autoscale sets vmin and vmax to nan when presented with an input such as np.array([1, 2, np.nan]). As a result, calling a Normalize instance on that input transforms the entire array to nan.

Example pre-PR:

>>> import numpy as np
>>> from matplotlib.colors import Normalize
>>> x = np.array([1, 2, 3, np.nan])
>>> n = Normalize()
>>> n.autoscale(x)
>>> n(x)
masked_array(data=[nan, nan, nan, nan],
             mask=False,
       fill_value=1e+20)

Example post-PR:

>>> x = np.array([1, 2, 3, np.nan])
>>> n = Normalize()
>>> n.autoscale(x)
>>> n(x)
masked_array(data=[ 0.,  0.5, 1., nan],
             mask=False,
       fill_value=1e+20)
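
For illustration, the difference between the two outputs comes down to how the limits are computed. The sketch below is not the PR's actual diff: `nan_aware_limits` is a hypothetical helper, and the scaling step assumes the linear mapping (x - vmin) / (vmax - vmin) that Normalize applies.

```python
import numpy as np


def nan_aware_limits(values):
    """Hypothetical helper: return (vmin, vmax) while skipping NaN entries."""
    arr = np.asarray(values, dtype=float)
    return float(np.nanmin(arr)), float(np.nanmax(arr))


x = np.array([1, 2, 3, np.nan])

# Plain min/max propagate NaN, which is what poisons vmin and vmax.
plain_min, plain_max = x.min(), x.max()

# The NaN-aware variants skip the missing value instead.
vmin, vmax = nan_aware_limits(x)

# Linear rescaling with the recovered limits; the NaN entry stays NaN.
scaled = (x - vmin) / (vmax - vmin)
```

With limits computed this way, only the NaN entry remains NaN after scaling, which matches the post-PR output above.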

This PR closes #28405.

Additional comment: the values np.inf and -np.inf currently also break the autoscale function. It is not obvious how the autoscaler should behave when presented with such values. I'd be open to modifying this PR so that they are ignored as well.
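
One possible reading of "ignored", sketched with NumPy only: restrict the limit computation to finite entries. `finite_limits` is a hypothetical name, and returning None for input without any finite value is an assumption rather than settled behavior.

```python
import numpy as np


def finite_limits(values):
    """Hypothetical helper: (vmin, vmax) over finite entries, or None if there are none."""
    arr = np.asarray(values, dtype=float)
    finite = arr[np.isfinite(arr)]  # drops NaN, +inf and -inf
    if finite.size == 0:
        return None
    return float(finite.min()), float(finite.max())


x = np.array([1, 2, 3, np.nan, np.inf, -np.inf])

# Skipping only NaN is not enough: the infinities become the limits.
nan_only = (float(np.nanmin(x)), float(np.nanmax(x)))

# Restricting to finite values recovers usable limits.
limits = finite_limits(x)
```

Infinite limits are just as unusable as NaN limits, because vmax - vmin is then infinite and the rescaling no longer yields meaningful values.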

PR checklist

@github-actions github-actions bot left a comment

Thank you for opening your first PR into Matplotlib!

If you have not heard from us in a week or so, please leave a new comment below and that should bring it to our attention. Most of our reviewers are volunteers and sometimes things fall through the cracks.

You can also join us on gitter for real-time discussion.

For details on testing, writing docs, and our review process, please see the developer guide

We strive to be a welcoming and open project. Please follow our Code of Conduct.

@greglucas
Contributor

greglucas commented Jun 23, 2024

I agree this is a bit of an annoyance, but I also wonder if this would be a bit of a whack-a-mole to try and fix everywhere. For example, there are a few other autoscale_None calls on other norms.

I wonder if it would be easier to use A = cbook.safe_masked_invalid(A) above this so we are entering without any of the invalid possibilities.
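
A rough sketch of the masking idea, using NumPy only. `np.ma.masked_invalid` serves here as a stand-in for `cbook.safe_masked_invalid`; that the two mask the same entries for a plain float array is an assumption of this sketch.

```python
import numpy as np

x = np.array([1, 2, 3, np.nan, np.inf])

# Mask NaN and infinite entries up front (stand-in for cbook.safe_masked_invalid).
masked = np.ma.masked_invalid(x)

# Reductions on a masked array skip the masked entries.
vmin, vmax = float(masked.min()), float(masked.max())

# The same linear rescaling now leaves the invalid entries masked.
scaled = (masked - vmin) / (vmax - vmin)
valid = scaled.compressed()  # only the unmasked results
```

Masking once at the entry point handles NaN and infinite values together, at the cost of an extra pass over the array, which is the speed concern raised below.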

FYI: It looks like there have been some micro-optimizations in this area, so maybe people do not want the extra speed penalty to support this? #26335

@timhoffm
Member

timhoffm commented Oct 7, 2024

I believe we need a clear understanding, and documentation, of what A can be.

Typically in colormapping scenarios, A is the array ScalarMappable._A, and set_array() already runs through cbook.safe_masked_invalid(A). So if I'm not mistaken, at the bare minimum it should be sufficient to document that A is not expected to have invalid values. One can only be trapped here if one passes values manually to autoscale_None().
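
A small check of that typical path, assuming a Matplotlib version in which `ScalarMappable` is importable from `matplotlib.cm` and `set_array()` masks invalid values as described above:

```python
import numpy as np
from matplotlib.cm import ScalarMappable

x = np.array([1, 2, 3, np.nan])

sm = ScalarMappable()  # default norm and colormap
sm.set_array(x)        # stores the array with invalid entries masked
sm.autoscale()         # autoscales the norm from the stored array

stored = sm.get_array()
limits = (float(sm.norm.vmin), float(sm.norm.vmax))
```

Here the NaN never reaches the norm unmasked, so the limits come out finite; the problem in this PR only shows up when an unmasked array is handed to the norm directly.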

It's t.b.d. whether we want to make that (relatively rare) case more comfortable as well. That decision includes a look at the performance tradeoff. If it's negligible, we can still do it.

Projects
Status: Waiting for author
Development

Successfully merging this pull request may close these issues.

[Bug]: Normalize.autoscale gets broken by np.nan values
3 participants