Add functionality to create a lazy masked array where equal to a given value #181

djkirkham · 2016-07-06T13:28:28Z

This addition is designed to mimic the NumPy function numpy.ma.masked_equal().

…n value

marqh · 2016-07-22T09:33:59Z

this looks like a useful and well coded change

does anyone have objections to me merging this?
@bjlittle @pelson @pp-mo

pp-mo · 2016-07-22T10:40:35Z

thanks @marqh

I don't know enough to be sure about this, but with an eye to future additions I'm wondering if a two-step solution would be more flexible: that is, provide a two-argument constructor that combines a data array with a mask array, treating them both as independent biggus calculations.
Then we could define the required function as something like:

def masked_equal(array, value):
    return masked_array(array, equal(array, value))

Where this masked_array is a biggus equivalent of numpy.ma.array(data, mask=mask).
It is probably a new array class, something like 'DataAndMask(_Elementwise)', similar to the '_ufunc_wrapper' derivations.

The actual implementation would be much the same as in @djkirkham's existing proposal.
However, I'm suggesting the calculation is purely elementwise, so to do what we want here (compare with a constant) it uses broadcasting -- but I think that comes for free with the existing implementation.

The benefit is, it's then dead simple to provide similar new operators such as masked_less.
Assuming we get around to needing such additions, this way could create a lot less clutter.
In direct usage, it can also handle much more complex cases,
like masked_array(a, mask=(b < c * 0.5)).
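For illustration, the same two-step shape can be written eagerly with plain NumPy. This is only a sketch of the idea, not biggus code: the `masked_array` helper below is a hypothetical stand-in built on `numpy.ma.array`, and nothing here is lazy.

```python
import numpy as np


def masked_array(data, mask):
    # Hypothetical eager stand-in for the proposed two-argument constructor.
    return np.ma.array(data, mask=mask)


def masked_equal(array, value):
    # The mask is an ordinary elementwise comparison; the scalar broadcasts.
    return masked_array(array, np.equal(array, value))


def masked_less(array, value):
    # A sibling operator costs one extra line once the constructor exists.
    return masked_array(array, np.less(array, value))


a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([0.0, 5.0, 1.0, 9.0])
c = np.array([4.0, 4.0, 4.0, 4.0])

eq = masked_equal(a, 2.0)
lt = masked_less(a, 3.0)
# Direct use with an arbitrary mask expression built from other arrays.
custom = masked_array(a, mask=(b < c * 0.5))
```

Each new comparison operator reduces to a one-line wrapper, which is the reduction in clutter being argued for.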

Does anyone think that is a good idea?

djkirkham · 2016-07-22T11:34:48Z

It seems to me that one problem with a general masked array class is that if the mask is based on the underlying array, then when the data is loaded it must be loaded twice - once for the data and once for the mask. Is there a way around it other than implementing special cases like this one? Is it even a serious problem?
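To make the concern concrete, here is a toy sketch in plain NumPy. `CountingSource` is a hypothetical stand-in for an expensive data source and is not part of biggus; it only counts how often the full data is read under each approach.

```python
import numpy as np


class CountingSource:
    """Hypothetical expensive data source that counts full reads."""

    def __init__(self, values):
        self._values = np.asarray(values)
        self.reads = 0

    def load(self):
        self.reads += 1
        return self._values.copy()


def masked_equal_two_step(source, value):
    # Data and mask are evaluated independently, so the source is read twice.
    data = source.load()
    mask = source.load() == value
    return np.ma.array(data, mask=mask)


def masked_equal_special_case(source, value):
    # The mask is derived from the data that has already been loaded.
    data = source.load()
    return np.ma.masked_equal(data, value)


src_a = CountingSource([0, 1, 0, 1])
two_step = masked_equal_two_step(src_a, 1)

src_b = CountingSource([0, 1, 0, 1])
special = masked_equal_special_case(src_b, 1)

print(src_a.reads, src_b.reads)  # prints: 2 1
```

Both routes give the same masked result; the difference is only in how many times the source is touched.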

pp-mo · 2016-07-22T11:51:36Z

@djkirkham when the data is loaded it must be loaded twice

Ooh, good point!

I tend to forget that biggus is not like a language with lazy evaluation: those tend to avoid re-calculation.
I think it was originally thought we would eventually address that in the 'engines' code, but it has never seemed much of a priority.

I do know we haven't bothered to avoid some duplicate evaluations in the StaGE configurations, as it was thought a minor problem. But I think that is mostly for orographies, which are smaller (lower-dimensional) than the other data they are combined with.

I'm not sure what I think about the practicality of this now - it depends on intended usage.

marqh · 2016-07-22T12:08:40Z

I'm looking to cut a biggus release today.

On that basis, this either goes in as is, or it waits for further implementation.

My view is that this is useful, and that if a more flexible implementation arrives later, then this can become a call to that whilst maintaining the API call. On that basis, my vote is to merge.

@pp-mo @djkirkham please could you indicate whether you agree or disagree with me on this?
thank you
mark

pp-mo · 2016-07-22T12:15:41Z

@marqh agree or disagree

!agree! 👍

djkirkham · 2016-07-22T12:21:42Z

@marqh Sounds sensible, I agree.

pelson · 2016-08-04T04:56:30Z

I'm concerned about the results of this change:

In [3]: import biggus

In [4]: import numpy as np

In [5]: a = np.arange(10) % 2

In [6]: print(a)
[0 1 0 1 0 1 0 1 0 1]

In [7]: n_m = np.ma.masked_equal(a, 1)

In [8]: b_m = biggus.masked_equal(a, 1)

In [9]: print(n_m.mean(), biggus.mean(b_m, axis=0).ndarray())
(0.0, array(0.5))

Is there a reason you didn't implement ndarray @djkirkham?

djkirkham · 2016-08-08T07:56:51Z

Good spot, but I don't think it's a bug introduced by this change. Rather, it's a problem with biggus.mean(). The result is the same if input 8 (In [8]) is replaced with b_m = biggus.NumpyArrayAdapter(n_m).
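The two numbers in the session above can be reproduced with NumPy alone, which may help narrow down where the discrepancy comes from. This is a minimal sketch that does not call biggus and makes no claim about what biggus.mean() does internally; it only shows that 0.5 is the value obtained when the mask is ignored.

```python
import numpy as np

a = np.arange(10) % 2           # [0 1 0 1 0 1 0 1 0 1]
n_m = np.ma.masked_equal(a, 1)  # every 1 is masked, leaving five zeros

masked_mean = n_m.mean()         # averages only the unmasked values
unmasked_mean = n_m.data.mean()  # averages all ten underlying values

print(masked_mean, unmasked_mean)  # prints: 0.0 0.5
```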

djkirkham force-pushed the mask-value-array branch from 0eb7dda to 079b013 Compare July 6, 2016 13:29

Add functionality to create a lazy masked array where equal to a give…

29059a5

…n value

djkirkham force-pushed the mask-value-array branch from 079b013 to 29059a5 Compare July 6, 2016 13:30

marqh merged commit 21895c8 into SciTools:master Jul 22, 2016
