Thanks to visit codestin.com
Credit goes to github.com

Skip to content
This repository was archived by the owner on Feb 17, 2023. It is now read-only.

Conversation

@djkirkham
Copy link
Contributor

This addition is designed to mimic the NumPy function numpy.ma.masked_equal()

@marqh
Copy link
Member

marqh commented Jul 22, 2016

this looks like a useful and well coded change

does anyone have objections to me merging this?
@bjlittle @pelson @pp-mo

@pp-mo
Copy link
Member

pp-mo commented Jul 22, 2016

thanks @marqh

I don't know enough to be sure about this, but with an eye to future addtions I'm wondering if a two-step solution would be more flexible : That is, provide a two-argument constructor that combines a data-array with a mask-array, treating them both as independent biggus calculations.
Then we could define the required function as something like :

def masked_equal(array, value):
    return masked_array(array, equal(array, value))

Where this masked_array is a biggus equivalent of numpy.ma.array(data, mask=mask).
It is probably a new array class, something like 'DataAndMask(_Elementwise)', similar to the '_ufunc_wrapper' derivations.

The actual implementation would be basically much as already given in @djkirkham existing proposal.
However, I'm suggesting the calculation is purely elementwise, so to do what we want here (compare with a constant) it uses broadcasting -- but I think that comes for free with the existing implementation.

The benefit is, it's then dead simple to provide similar new operators such as masked_less.
Assuming we get around to needing such additions, this way could create a lot less clutter.
In direct usage, it can also handle much more complex cases,
like masked_array(a, mask=(b < c * 0.5)).

Does anyone think that is a good idea ?

@djkirkham
Copy link
Contributor Author

It seems to me that one problem with a general masked array class is that if the mask it based on the underlying array then when the data is loaded it must be loaded twice - once for the data and once for the mask. Is there a way around it other than implementing special cases like this one? Is it even a serious problem?

@pp-mo
Copy link
Member

pp-mo commented Jul 22, 2016

@djkirkham when the data is loaded it must be loaded twice

Ooh, good point !

I tend to forget that biggus is not like a language with lazy evaluation : they tend to avoid re-calculation.
I think it was originally thought we would eventually address that in the 'engines' code, but it has never seemed much of a priority.

I do know we haven't bothered to avoid some duplicate evaluations in the StaGE configurations, as it was thought a minor problem : But I think that is mostly for orographies, where they are smaller (lower-dimensional) than the other data they are combined with.

I'm not sure what I think about the practicality of this now - it depends on intended usage.

@marqh
Copy link
Member

marqh commented Jul 22, 2016

i'm looking to cut a biggus release today.

On that basis, this either goes in as is, or it waits for further implementation.

My view is that this is useful, and that if a more flexible implementation arrives later, then this can become a call to that whilst maintaining the API call. On that Basis, my vote is for merge

@pp-mo @djkirkham please may you indicate whether you agree or disagree with me on this?
thank you
mark

@pp-mo
Copy link
Member

pp-mo commented Jul 22, 2016

@marqh agree or disagree

!agree! 👍

@djkirkham
Copy link
Contributor Author

@marq Sounds sensible, I agree.

@marqh marqh merged commit 21895c8 into SciTools:master Jul 22, 2016
@pelson
Copy link
Member

pelson commented Aug 4, 2016

I'm concerned about the results of this change:

In [3]: import biggus

In [4]: import numpy as np

In [5]: a = np.arange(10) % 2

In [6]: print(a)
[0 1 0 1 0 1 0 1 0 1]

In [7]: n_m = np.ma.masked_equal(a, 1)

In [8]: b_m = biggus.masked_equal(a, 1)

In [9]: print(n_m.mean(), biggus.mean(b_m, axis=0).ndarray())
(0.0, array(0.5))

Is there a reason you didn't implement ndarray @djkirkham?

@djkirkham
Copy link
Contributor Author

Good spot, but I don't think it's a bug introduced by this change. Rather, it's a problem with biggus.mean(). The result is the same if line 8 is replaced with b_m = biggus.NumpyArrayAdapter(n_m)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants