BUG: Inconsistent behaviour of masked arrays for equivalent operations #16359

Closed
tirthasheshpatel opened this issue May 23, 2020 · 2 comments
Labels: 33 - Question (Question about NumPy usage or development)

Comments

@tirthasheshpatel
Contributor

When an np.ndarray (say A) is added to an np.ma.masked_array (say ma) in place using A += ma, the result is an np.ndarray, while A = A + ma gives an np.ma.masked_array.

Reproducing code example:

>>> import numpy as np
>>> A = np.arange(10)
>>> ma = np.ma.masked_array(A, A>4)
>>> ma
masked_array(data=[0, 1, 2, 3, 4, --, --, --, --, --],
             mask=[False, False, False, False, False,  True,  True,  True,
                    True,  True],
       fill_value=999999)
>>> A += ma
>>> A
array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])
>>> A = A + ma
>>> A
masked_array(data=[0, 4, 8, 12, 16, --, --, --, --, --],
             mask=[False, False, False, False, False,  True,  True,  True,
                    True,  True],
       fill_value=999999)

Numpy/Python version information:

>>> np.__version__
'1.18.4'
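
A note on reading the example above: the second result shows 0, 4, 8, ... rather than 0, 3, 6, ... because np.ma.masked_array(A, A>4) does not copy A by default, so ma shares A's buffer and is doubled by A += ma as well. A minimal sketch to check this, assuming the default copy=False (np.shares_memory is used here only for illustration and is not part of the original report):

```python
import numpy as np

A = np.arange(10)
ma = np.ma.masked_array(A, A > 4)   # default copy=False: ma wraps A's buffer

# True if ma's underlying data and A overlap in memory.
shares_buffer = np.shares_memory(A, ma.data)

A += ma                             # doubles A in place ...
ma_data_after = ma.data.tolist()    # ... and therefore ma's data as well

print(shares_buffer)
print(ma_data_after)
```

Passing copy=True to np.ma.masked_array would give ma its own buffer, if independent data is wanted.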
@tirthasheshpatel
Contributor Author

I expected a masked array as the output of A += ma, but I guess the implementation always returns the same object when operating in place. Technically, there is no issue here, as += is an in-place operator and so an np.ndarray remains an np.ndarray. Could this still be added as a new feature, though? I am not sure, so feel free to close this :)
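
The in-place point can be checked directly. A small sketch, using fresh arrays so that no buffers are shared (a, b, c and m are illustrative names, not taken from the report):

```python
import numpy as np

a = np.arange(10)
b = np.arange(10)
m = np.ma.masked_array(np.arange(10), mask=np.arange(10) > 4)

a_id = id(a)
a += m                      # in place: writes into a's existing buffer
inplace_is_same_object = id(a) == a_id
inplace_type = type(a)

c = b + m                   # out of place: a new result object is created
out_of_place_type = type(c)

print(inplace_type, inplace_is_same_object)
print(out_of_place_type)
```

The in-place form has no opportunity to change the type of a, since the result is written into the ndarray that already exists.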

@rossbar rossbar added the 33 - Question Question about NumPy usage or development label Jul 11, 2020
@rossbar
Contributor

rossbar commented Jul 11, 2020

It seems like you've very nicely answered your own question. Generally, __array_priority__ is used to determine the output type of operations (the operand with the higher __array_priority__ wins). E.g. for masked arrays:

>>> a = np.arange(10)
>>> a.__array_priority__
0.0
>>> m = np.ma.arange(10)
>>> m.__array_priority__
15.0

This means that operations between arrays and masked arrays should return masked arrays:

>>> type(a + m)
numpy.ma.core.MaskedArray
>>> type(m + a)
numpy.ma.core.MaskedArray

In-place operations get a bit hairy though, as the behavior is generally ill-defined. For example:

>>> m = np.ma.arange(10)
>>> m.mask = [True]*5 + [False]*5
>>> a += m  # What do you expect?

You might (reasonably) expect a = array([ 0, 1, 2, 3, 4, 10, 12, 14, 16, 18]), i.e. the output type is ndarray, but the mask from m was respected in the operation. This is currently not the case. The behavior could be changed, but this would require a larger discussion. See e.g. this discussion if you are interested.
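
If the mask-respecting result described above is what you want today, one possible workaround is to perform the addition explicitly with the ufunc's out and where arguments. This is only a sketch under that assumption, not a NumPy-recommended idiom; the variable name unmasked is hypothetical:

```python
import numpy as np

a = np.arange(10)
m = np.ma.arange(10)
m.mask = [True] * 5 + [False] * 5

# Boolean array that is True where m is NOT masked.
unmasked = ~np.ma.getmaskarray(m)

# Add m's raw data into a only at unmasked positions; the other
# positions of a are left untouched because out=a already holds them.
np.add(a, np.ma.getdata(m), out=a, where=unmasked)

print(a)
```

Here a stays a plain ndarray, and only the entries where m is unmasked are updated.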

I think the original question was sufficiently answered, so I will close this - feel free to re-open or start/join an existing discussion if you are interested in some of these details.

@rossbar rossbar closed this as completed Jul 11, 2020