DISCUSS: About issue of masked array

While reading the code and addressing some bugs related to `numpy/ma`, I encountered a few questions:

1. **Filling Values in Masked Arrays**:  
   Do we actually care about the exact fill value in masked arrays, given that the positions it fills are masked anyway? If not, I believe I can resolve [bug 27580](https://github.com/numpy/numpy/issues/27580) by simply removing the check for `inf`.

2. **Default Fill Value for Masked Arrays**:  
   Currently, the default fill value for masked arrays is defined by [`default_filler`](https://github.com/numpy/numpy/blob/bdc8d4e03181deac5280166aec4188318050570d/numpy/ma/core.py#L167C1-L183C63) in terms of Python data types. However, Python doesn’t have unsigned integer types, so for `np.uint` arrays, the default fill value is stored as `np.int64(999999)`. This causes issues in operations like `copyto(..., casting='same_kind')`, as seen in [bug 27580](https://github.com/numpy/numpy/issues/27580) and [bug 27269](https://github.com/numpy/numpy/issues/27269). Should we consider using NumPy data types for the default fill value, so that the fill value matches the data type of the array (e.g., a signed integer fill value for signed integer arrays and an unsigned one for unsigned integer arrays)?

3. **Large Default Fill Values**:  
   Some default fill values are too large for the data type they are used with, such as `999999` for `np.int8` and `1.e20` for `np.float16`. What would be an appropriate default fill value for masked arrays, particularly for small data types like `int8` and `float16`? ([bug 25677](https://github.com/numpy/numpy/issues/25677))

4. **Reviewing `copyto` in Masked Arrays**:  
   Should we perform a comprehensive review of `copyto` functionality for masked arrays? It seems likely that similar bugs could exist due to the same root cause.

5. **Testing for Small Data Types**:  
   Should we extend the test suite to include small data types (e.g., `int8` and `float16`) to ensure that functions handle these cases correctly?

6. **Checking Method Consistency**:  
   Should we check that the methods of a `masked array` with no masked elements are consistent with those of `ndarray`? Their methods and behaviors currently differ in some cases; see, for example, [bug 27258](https://github.com/numpy/numpy/issues/27258).

7. **Making the Standard Clear**:  
   Some methods have no clear standard. For example, should invalid results be masked automatically? Some functions (such as `sqrt` and `std`) do this, while others (such as `median` and `mean`) do not. Worse, the documentation of some functions that change the mask automatically does not mention it (`sqrt`, [`std`](https://numpy.org/devdocs/reference/generated/numpy.ma.masked_array.std.html#numpy.ma.masked_array.std)), while other functions are documented as doing so but leave the mask unchanged ([`mean`](https://numpy.org/devdocs/reference/generated/numpy.ma.mean.html#numpy.ma.mean)).
   In addition, some important methods are not clearly explained in either the documentation or the docstrings. For example, `__array_wrap__` is called by most ufunc calls, and I think it might be the cause of [bug 25635](https://github.com/numpy/numpy/issues/25635).
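
To make points 2 and 3 concrete, here is a small sketch that compares the documented default fill value with the representable range of a few dtypes. The helper name `default_fits` and the choice of dtypes are mine, for illustration only; the sketch only uses `np.ma.default_fill_value`, `np.iinfo` and `np.finfo`, and it does not inspect what a particular array actually stores in its `fill_value` attribute.

```python
import numpy as np


def default_fits(dtype):
    """Return (default fill value, whether it fits in the dtype's range).

    Hypothetical helper for this discussion, not part of numpy.ma.
    """
    dtype = np.dtype(dtype)
    default = np.ma.default_fill_value(dtype)
    # Integer dtypes are described by iinfo, floating dtypes by finfo.
    info = np.iinfo(dtype) if dtype.kind in "iu" else np.finfo(dtype)
    # Compare as Python floats so the comparison itself cannot overflow.
    fits = float(info.min) <= float(default) <= float(info.max)
    return default, fits


report = {name: default_fits(name)
          for name in ("int8", "uint8", "float16", "int64", "float64")}

for name, (default, fits) in report.items():
    print(f"{name:>8}: default fill value {default!r}, fits in dtype: {fits}")
```

The small dtypes are reported as not fitting, which may be one reason why casting the default fill value into such arrays can fail.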

Since I'm not sure where to place these questions, I’ve marked this as a discussion for now.
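
As a minimal illustration of the auto-masking mentioned in point 7, the sketch below applies `np.ma.sqrt` to a masked array that has no masked elements but contains a negative value, and compares it with `np.sqrt` on the underlying plain array. It only covers `sqrt`; the variable names are mine, and the behavior of `std`, `mean` and `median` is not checked here.

```python
import numpy as np

# A masked array with no masked elements, one of them outside sqrt's domain.
x = np.ma.array([-1.0, 4.0])
input_mask = np.ma.getmaskarray(x).tolist()

# np.ma.sqrt masks the entry whose result would be invalid.
y = np.ma.sqrt(x)
output_mask = np.ma.getmaskarray(y).tolist()

# On the plain ndarray the same entry becomes nan instead of being masked.
with np.errstate(invalid="ignore"):
    plain = np.sqrt(x.data)

print("input mask: ", input_mask)
print("output mask:", output_mask)
print("plain ndarray result:", plain)
```

The mask changes even though the input had nothing masked, which is the kind of behavior I think the documentation should state explicitly.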


