Masked fields should not be used in comparison #15978

cosama opened this issue Apr 14, 2020 · 1 comment
Labels
component: numpy.ma masked arrays

cosama commented Apr 14, 2020

If masked arrays are compared (through the >, >=, < and <= operators) to constants, arrays or other masked arrays, the masked-out elements are still accessed for the comparison. In some extreme cases this fails; often it causes a warning.

Reproducing code example:

This example

import numpy as np
d = np.ma.array([1, None], mask=[False, True])
d > 0

fails with a TypeError: '>' not supported between instances of 'NoneType' and 'int'.

This example

import numpy as np
d = np.ma.array([1, np.nan], mask=[False, True])
d > 0

emits the warning __main__:1: RuntimeWarning: invalid value encountered in greater.

I think masked out elements should be set to False by default and not compared, so that operations such as:

d[d > 0] = 0

have the expected behavior. However that might cause issues in other cases such as:

np.all(d > 0)

This issue has been brought up before in #4959, but the initial issue there seems to be more about using default functions such as np.log with masked arrays instead of their np.ma equivalents.

Even adding a fill_value does not seem to fix this. At least in that case I would assume that, for elements that are masked out, the comparison happens with respect to the fill value.

import numpy as np
d = np.ma.array([1, None], mask=[False, True], fill_value=-1)
d > 0

creates the same error as mentioned above.
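
For illustration, the behavior suggested above (masked-out elements are never compared and simply evaluate to False) could be sketched as follows. The helper name greater_ignoring_masked is hypothetical and not part of numpy.ma; this is a minimal sketch under that assumption, not a proposed implementation.

```python
import numpy as np


def greater_ignoring_masked(arr, value):
    """Hypothetical helper (not part of numpy.ma): evaluate `arr > value`
    on unmasked elements only and report False for masked ones."""
    data = np.ma.getdata(arr)
    valid = ~np.ma.getmaskarray(arr)
    result = np.zeros(data.shape, dtype=bool)
    # Only unmasked entries are ever passed to the comparison.
    result[valid] = np.asarray(data[valid] > value, dtype=bool)
    return result


d = np.ma.array([1, None], mask=[False, True])
hits = greater_ignoring_masked(d, 0)  # the masked None is never compared
```

Because the result is a plain boolean array, it can be used directly for indexing, as in d[hits] = 0. The drawback mentioned above remains: np.all(hits) treats masked positions as False rather than ignoring them.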

Numpy/Python version information:

  • Numpy: '1.18.2'
  • System: '3.7.6 (default, Jan 30 2020, 09:44:41) [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)]'

cosama commented May 2, 2020

Thinking about this a bit more, I think comparison operators should all return masked boolean arrays. Elements masked out in any of the compared arrays should stay masked in the returned array, and elements unmasked in all compared arrays should be returned unmasked, holding the result of the comparison. Furthermore, it should be possible to index arrays with masked arrays (boolean or integer indexing), where only unmasked elements are accessed. The returned arrays probably have to be masked again, with elements that are masked in the indexing array also masked in the result. That would mean that standard numpy.ndarray objects would also need to know how to be indexed by masked arrays.

This seems to be a major change, so I'm not sure whether it is within the scope of numpy.ma; it could be implemented in an external library as proposed in #16022.
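
The comparison semantics proposed in this comment (the result is a masked boolean array whose mask is the union of the operand masks) could be sketched roughly as below. The function masked_compare is a hypothetical name used only for illustration; it relies solely on np.ma.getdata, np.ma.getmaskarray and np.broadcast_arrays, and it does not cover the indexing part of the proposal.

```python
import operator

import numpy as np


def masked_compare(a, b, op=operator.gt):
    """Hypothetical sketch: compare only where both operands are unmasked
    and return a masked boolean array (mask = union of operand masks)."""
    a_data, b_data, a_mask, b_mask = np.broadcast_arrays(
        np.ma.getdata(a), np.ma.getdata(b),
        np.ma.getmaskarray(a), np.ma.getmaskarray(b),
    )
    mask = a_mask | b_mask
    valid = ~mask
    result = np.zeros(mask.shape, dtype=bool)
    # Masked positions keep a placeholder False and are never compared.
    result[valid] = op(a_data[valid], b_data[valid])
    return np.ma.array(result, mask=mask)


a = np.ma.array([1.0, np.nan, 3.0], mask=[False, True, False])
b = np.ma.array([0.0, 0.0, 5.0], mask=[False, False, False])
r = masked_compare(a, b)
```

Calling r.filled(False) gives a plain boolean array in which masked positions count as False, which is one possible way to use the result for boolean indexing until masked indexing itself is defined.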

rossbar added the component: numpy.ma masked arrays label Jul 12, 2020