DISCUSS: About issue of masked array #27588

Open
@fengluoqiuwu

While reading the code and addressing some bugs related to numpy/ma, I encountered a few questions:

  1. Filling Values in Masked Arrays:
    Do we actually care about the exact fill value in a masked array, given that it only stands in for entries that are masked anyway? If not, I believe I can resolve bug 27580 by simply removing the check for inf.

  2. Default Fill Value for Masked Arrays:
    Currently, the default fill value for masked arrays is defined by default_filler in terms of Python data types. However, Python has no unsigned integer types, so for np.uint arrays the default fill value is stored as np.int64(999999). This causes issues in operations like copyto(..., casting='same_kind'), as seen in bug 27580 and bug 27269. Should we use NumPy data types for the default fill value, so that the fill value always matches the data type of the array (for example, a signed integer fill value for signed integer arrays and an unsigned one for unsigned integer arrays)?

  3. Large Default Fill Values:
    Some default fill values are far larger than the data type can hold, such as 999999 for np.int8 and 1.e20 for np.float16. What would be an appropriate default fill value for masked arrays, particularly for small data types like int8 and float16? (See bug 25677.)

  4. Reviewing copyto in Masked Arrays:
    Should we perform a comprehensive review of copyto functionality for masked arrays? It seems likely that similar bugs could exist due to the same root cause.

  5. Testing for Small Data Types:
    Should we extend the test suite to include small data types (e.g., int8 and float16) to ensure that functions handle these cases correctly?

  6. Checking Method Consistency:
    Should we check that methods behave consistently between a masked array with no masked elements and a plain ndarray? There are some differences in methods and behavior between the two; for example, see bug 27258.

  7. Clarifying the Standard:
    The intended behavior of some methods is unclear. For example, should invalid results be masked automatically? Some functions (such as sqrt and std) do this, while others (such as median and mean) do not. Worse, the documentation of some functions does not mention it even though they change the mask automatically (sqrt, std), while the documentation of others mentions it even though they do not change the mask (mean).
    Worse still, some important methods have no clear explanation in either the documentation or their docstrings. For example, __array_wrap__ is called by most ufunc calls, and I think it might be the cause of bug 25635.
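
To make question 7 concrete, here is a minimal sketch of the automatic masking that np.ma.sqrt applies to out-of-domain inputs, next to what the plain ufunc returns for the same data. It only illustrates the sqrt side of the comparison; the behavior of mean and median described above is not reproduced here, and the variable names are illustrative only.

```python
import numpy as np

# A masked array with no masked elements, containing one value outside
# the domain of sqrt.
x = np.ma.array([-1.0, 4.0])

# The masked version masks the out-of-domain entry instead of returning nan.
root = np.ma.sqrt(x)
root_mask = np.ma.getmaskarray(root)
print(root)
print(root_mask)

# The plain ufunc on the underlying ndarray yields nan for the same entry.
with np.errstate(invalid="ignore"):
    plain = np.sqrt(x.data)
print(plain)
```

If a single convention were adopted, a check along these lines could be turned into a test for each function whose documentation promises (or does not promise) automatic masking.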

Since I'm not sure where to place these questions, I’ve marked this as a discussion for now.
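
For questions 2 and 3, a rough sketch of how one might check whether the fill values quoted above are representable in a given data type. The helper fits_in_dtype is hypothetical (it is not part of NumPy), and the value printed by np.ma.default_fill_value may differ between NumPy versions, so no particular output is assumed.

```python
import numpy as np


def fits_in_dtype(value, dtype):
    """Hypothetical helper: True if `value` lies in the finite range of `dtype`."""
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.integer):
        info = np.iinfo(dtype)
        low, high = int(info.min), int(info.max)
    else:
        info = np.finfo(dtype)
        low, high = float(info.min), float(info.max)
    # Compare as plain Python numbers to avoid any implicit NumPy casting.
    return low <= value <= high


for dtype in (np.int8, np.uint8, np.int64, np.float16, np.float64):
    # Fill values quoted in the questions above: 999999 and 1.e20.
    quoted = 999999 if np.issubdtype(dtype, np.integer) else 1.0e20
    current = np.ma.default_fill_value(np.dtype(dtype))  # version dependent
    print(np.dtype(dtype).name, repr(current), fits_in_dtype(quoted, dtype))
```

A range check like this could also drive the extended tests for small data types suggested in question 5.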
