BUG: error in printing masked arrays #7621

fonnesbeck · 2016-05-11T16:31:49Z

NumPy masked arrays do not fill correctly when constructed. In the example below, the filled values do not correspond to the mask (notice the big chunk of values in the middle that get filled when they should not).

The same happens with other masked array constructors, such as masked_equal. Strangely, if I use nan as the mask value, it works as expected.

Running NumPy 1.11.0 on Python 3.5.1 (OS X 10.11.4)


abalkin · 2016-05-11T18:47:10Z

Please don't post screen shots. Copy and paste your session instead. Please try to reproduce your problem with a smaller array.

fonnesbeck · 2016-05-11T18:59:42Z

import numpy as np
foo = np.array([  30.  ,   61.  ,   31.  ,   37.  ,    6.  ,    2.  ,  132.  ,
         27.  ,   38.  ,   48.7 ,    3.  ,   72.  ,   37.5 ,    5.1 ,
         48.  ,   20.2 ,   26.  ,    1.8 ,   15.3 ,   30.4 ,    4.5 ,
          8.  ,   13.  ,   31.  ,   51.  ,   36.  ,   42.  ,   42.  ,
         34.  ,   21.  ,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,   38.  ,    9.  ,   42.  ,   27.  ,   17.  ,   39.  ,
         29.  ,   58.  ,  137.  ,   13.  ,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11])
np.ma.masked_equal(foo, value=1.11)

yields:

masked_array(data = [30.0 61.0 31.0 37.0 6.0 2.0 132.0 27.0 38.0 48.7 3.0 72.0 37.5 5.1 48.0
 20.2 26.0 1.8 15.3 30.4 4.5 8.0 13.0 31.0 51.0 36.0 42.0 42.0 34.0 21.0 --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --],
             mask = [False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True False False False False False False False False False False
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True],
       fill_value = 1.11)

abalkin · 2016-05-11T19:03:10Z

Hmm, I get

In [72]: np.ma.masked_equal(foo, value=1.11)
Out[72]:
masked_array(data = [30.0 61.0 31.0 37.0 6.0 2.0 132.0 27.0 38.0 48.7 3.0 72.0 37.5 5.1 48.0
 20.2 26.0 1.8 15.3 30.4 4.5 8.0 13.0 31.0 51.0 36.0 42.0 42.0 34.0 21.0 --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 38.0 9.0 42.0
 27.0 17.0 39.0 29.0 58.0 137.0 13.0 -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- --],
             mask = [False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True False False False False False False False False False False
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True],
       fill_value = 1.11)

abalkin · 2016-05-11T19:07:57Z

I noticed that the mask shows up correctly in your output.

fonnesbeck · 2016-05-11T19:08:05Z

This may be a repr issue, as when I ask for a filled array, it seems to do the right thing:

time_masked.filled()
Out[64]:
array([  30.  ,   61.  ,   31.  ,   37.  ,    6.  ,    2.  ,  132.  ,
         27.  ,   38.  ,   48.7 ,    3.  ,   72.  ,   37.5 ,    5.1 ,
         48.  ,   20.2 ,   26.  ,    1.8 ,   15.3 ,   30.4 ,    4.5 ,
          8.  ,   13.  ,   31.  ,   51.  ,   36.  ,   42.  ,   42.  ,
         34.  ,   21.  ,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,   38.  ,    9.  ,   42.  ,   27.  ,   17.  ,   39.  ,
         29.  ,   58.  ,  137.  ,   13.  ,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11,    1.11,    1.11,    1.11,    1.11,    1.11,
          1.11,    1.11])
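
A quick way to separate a display problem from a data problem is to inspect the mask and the filled values directly rather than the printed form. The snippet below is only a sketch: it uses a smaller integer stand-in (the array `a` and the sentinel `-1` are illustrative choices, not the original data) and relies only on public `np.ma` functions.

```python
import numpy as np

# Smaller stand-in for the array in this thread: 120 integers with two
# runs of the sentinel value -1.
a = np.arange(120)
a[30:50] = -1
a[60:] = -1
m = np.ma.masked_equal(a, value=-1)

# Inspect the mask and data directly instead of reading the printed form.
n_masked = int(np.ma.getmaskarray(m).sum())
n_valid = int(m.count())
roundtrip_ok = bool(np.array_equal(m.filled(-1), a))

print(n_masked, n_valid, roundtrip_ok)
```

If these numbers match the input, the mask itself is intact and only the string conversion is suspect.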

abalkin · 2016-05-11T19:12:22Z

I've upgraded to numpy 1.11.0 and I now see your problem:

In [3]: np.ma.masked_equal(foo, value=1.11)
Out[3]:
masked_array(data = [30.0 61.0 31.0 37.0 6.0 2.0 132.0 27.0 38.0 48.7 3.0 72.0 37.5 5.1 48.0
 20.2 26.0 1.8 15.3 30.4 4.5 8.0 13.0 31.0 51.0 36.0 42.0 42.0 34.0 21.0 --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --],
             mask = [False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True False False False False False False False False False False
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True],
       fill_value = 1.11)

abalkin · 2016-05-11T19:13:11Z

Same issue with __str__:

In [4]: print(np.ma.masked_equal(foo, value=1.11))
[30.0 61.0 31.0 37.0 6.0 2.0 132.0 27.0 38.0 48.7 3.0 72.0 37.5 5.1 48.0
 20.2 26.0 1.8 15.3 30.4 4.5 8.0 13.0 31.0 51.0 36.0 42.0 42.0 34.0 21.0 --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --]

abalkin · 2016-05-11T19:25:04Z

Here are some really odd displays:

In [39]: print(np.ma.masked_equal(foo[:109], value=1.11))
[30.0 61.0 31.0 37.0 6.0 2.0 132.0 27.0 38.0 48.7 3.0 72.0 37.5 5.1 48.0
 20.2 26.0 1.8 15.3 30.4 4.5 8.0 13.0 31.0 51.0 36.0 42.0 42.0 34.0 21.0 --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 13.0 -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --]

In [40]: print(np.ma.masked_equal(foo[:108], value=1.11))
[30.0 61.0 31.0 37.0 6.0 2.0 132.0 27.0 38.0 48.7 3.0 72.0 37.5 5.1 48.0
 20.2 26.0 1.8 15.3 30.4 4.5 8.0 13.0 31.0 51.0 36.0 42.0 42.0 34.0 21.0 --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 137.0 13.0 -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --]

abalkin · 2016-05-11T19:30:58Z

Here is a clearer case demonstrating the problem:

In [46]: a = np.arange(120)

In [47]: a[30:50] = a[60:] = -1

In [48]: print(np.ma.masked_equal(a, value=-1))
[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
 28 29 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --]
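
The mismatch can also be checked programmatically by comparing the number of printed tokens with the array size. This is a rough sketch that assumes the default print options and the default `--` placeholder for masked entries; on an affected version the two counts should differ, on a fixed one they should agree.

```python
import numpy as np

a = np.arange(120)
a[30:50] = -1
a[60:] = -1
m = np.ma.masked_equal(a, value=-1)

# Split the printed form into whitespace-separated tokens.
tokens = str(m).strip("[]").split()
printed_masked = tokens.count("--")
actual_masked = int(np.ma.getmaskarray(m).sum())

print("printed elements:", len(tokens), "array size:", m.size)
print("printed masked:", printed_masked, "actual masked:", actual_masked)
```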

charris · 2016-05-11T19:52:43Z

I'm going to guess this came in with #6748.

ahaldane · 2016-05-11T19:54:12Z

I don't have time to check, but #6094 also comes to mind.

abalkin · 2016-05-11T20:22:44Z

The problem is related to _print_width introduced in b5c456e.

Here is a simple demonstration:

In [84]: np.ma.MaskedArray._print_width = 10

In [85]: print(np.ma.masked_values([0]*120, 0))
[-- -- -- -- -- -- -- -- -- --]

The default value for _print_width is 100.

abalkin · 2016-05-11T20:27:11Z

@saimn - it looks like you've done some work in this area recently.

abalkin · 2016-05-11T20:42:12Z

I think the problem was introduced in #3544 / #6748.

saimn · 2016-05-11T20:42:13Z

@abalkin - Yes indeed. I did this while working on and checking 2D/3D arrays, I guess, and it seems that the rule for the number of values printed on screen for 1D arrays is different. I don't know exactly what the rule or limit is, or when the "..." filler is used, but this can probably be fixed by increasing the _print_width value? Truncating the array is mostly relevant for big arrays anyway, so having a higher value for _print_width is fine. The question is which value to choose.

abalkin · 2016-05-11T20:50:48Z

@saimn - I cannot understand your logic in 593345a. It looks like you drop the middle values from the array leaving no indication for the subsequent code as to where to place the dots. Also, you only apply the new logic when mask is not nomask leading to spurious display differences between arrays with mask=nomask and mask=ones(n).
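
The displays earlier in this thread are consistent with a truncation that keeps the first and last `_print_width // 2` elements and silently drops the middle. The plain-Python model below illustrates that behaviour. It is not the NumPy source: `keep_corners` and `render` are hypothetical helpers, and the "both ends" rule is an assumption inferred from the outputs above.

```python
def keep_corners(seq, width=100):
    """Hypothetical model of the truncation: keep both ends, drop the middle."""
    if len(seq) <= width:
        return list(seq)
    half = width // 2
    return list(seq[:half]) + list(seq[-half:])

def render(values, mask, width=100):
    """Format the truncated values, showing '--' for masked entries."""
    pairs = zip(keep_corners(values, width), keep_corners(mask, width))
    return ["--" if hidden else str(v) for v, hidden in pairs]

# Plain-list version of the 120-element example from this thread.
values = list(range(120))
mask = [30 <= i < 50 or i >= 60 for i in values]

tokens = render(values, mask)
print(len(tokens), tokens.count("--"))

# Slicing to 109 elements keeps index 59, which reappears mid-output.
odd = render(values[:109], mask[:109])
print(odd[48:52])
```

Under this model the 120-element array prints 100 tokens with no marker for the 20 that were dropped, and the 109-element slice shows a single unmasked value stranded between two masked runs, much like the odd displays reported above.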

abalkin · 2016-05-11T20:55:27Z

The logic for 1-D arrays is odd. A length 1001 array is contracted:

In [98]: print(np.arange(1001))
[   0    1    2 ...,  998  999 1000]

but a length 1000 one is printed in full. (I will not paste a screenful here.)
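
The cutoff can be confirmed without printing a screenful by checking for the ellipsis in the string form. This sketch reads the limit from `np.get_printoptions()` instead of hard-coding 1000, so it should also hold if the print options have been changed.

```python
import numpy as np

# Read the current summarization threshold rather than assuming 1000.
threshold = np.get_printoptions()["threshold"]

at_limit = str(np.arange(threshold))
over_limit = str(np.arange(threshold + 1))

print("..." in at_limit)    # array at the threshold is printed in full
print("..." in over_limit)  # one more element triggers summarization
```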

saimn · 2016-05-11T20:59:02Z

@abalkin - The logic is just to reduce the size of the array, but still have enough values to use the same printing logic as before. So if there are enough values, the output should be the same (with the conversion to the object dtype, filling with -- for masked values, and then truncating and adding ... which is done by ndarray). If mask is nomask, there is no need to do all this stuff because the array is printed directly.

abalkin · 2016-05-11T21:02:43Z

Got it, but the constant _print_width logic is probably too simplistic. You need a different value for 1D case.

saimn · 2016-05-11T21:02:50Z

Hmm, with _print_width = 1000 and a 1001-length array, I get data that is fully printed, but the mask is truncated ...

edit:

In [13]: np.ma.MaskedArray._print_width =1000
In [14]: a = np.ma.arange(1001)
In [15]: a[:50] = np.ma.masked

In [16]: a.data
Out[16]: array([   0,    1,    2, ...,  998,  999, 1000])

In [17]: a.mask
Out[17]: array([ True,  True,  True, ..., False, False, False], dtype=bool)

But then printing a shows the full data.

abalkin · 2016-05-11T21:10:44Z

Truncation logic may also be dtype specific.

charris · 2016-05-14T16:56:54Z

Is there a fix for this appropriate for 1.11.1?

saimn · 2016-05-17T19:59:22Z

It seems that NumPy starts to truncate the array when it has more than 1000 elements, so for the 1D case, setting _print_width to something greater than 1000 should do the job.
But for 2D and more, the current value is fine. What would be the best way to distinguish the two cases? Adding another _print_width_1d variable?

charris · 2016-05-22T02:04:58Z

@saimn That sounds good as a quick fix. Long term, I think we should figure out a better way of printing masked arrays.

Ref numpy#7621. numpy#6748 added `np.ma.MaskedArray._print_width` which is used to cut a masked array before printing it (to save memory and cpu time during the conversion to the object dtype). But this doesn't work correctly for 1D arrays, for which up to 1000 values can be printed before cutting the array. So this commit adds a new class variable `_print_width_1d` to handle the 1D case separately.
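
As a rough illustration of handling the 1D case separately, the sketch below picks a different limit depending on `ndim` before cutting each axis. `truncate_for_print` is a hypothetical helper, not the code from the commit, and the default of 1500 is just an arbitrary value above 1000 chosen for the example.

```python
import numpy as np

def truncate_for_print(arr, width=100, width_1d=1500):
    """Hypothetical helper: keep the leading and trailing entries per axis."""
    limit = width_1d if arr.ndim == 1 else width
    for axis in range(arr.ndim):
        n = arr.shape[axis]
        if n > limit:
            half = limit // 2
            # Indices of the first and last `half` entries along this axis.
            keep = np.r_[0:half, n - half:n]
            arr = np.take(arr, keep, axis=axis)
    return arr

print(truncate_for_print(np.arange(120)).shape)      # 1D, under the 1D limit
print(truncate_for_print(np.zeros((120, 3))).shape)  # 2D, first axis over 100
```

With a 1D limit above the point where ndarray printing starts to summarize, a short 1D array is passed through untouched, while large N-D arrays are still cut down before the costly conversion to the object dtype.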

charris · 2016-05-23T18:19:36Z

Should be fixed by #7658. Closing, but it would be good if folks would give the fix a shot and see if they can cause trouble.

fonnesbeck · 2016-05-25T16:39:15Z

This fixed the issue for me, thanks.

charris added 06 - Regression component: numpy.ma masked arrays labels May 11, 2016

charris added this to the 1.11.1 release milestone May 11, 2016

charris changed the title ~~error in filling mask in masked arrays~~ BUG: error in printing masked arrays May 14, 2016

saimn mentioned this issue May 22, 2016

BUG: fix incorrect printing of 1D masked arrays #7658

Merged

charris mentioned this issue May 23, 2016

Backport 7658, BUG: fix incorrect printing of 1D masked arrays #7665

Merged

charris closed this as completed May 23, 2016
