BUG: numpy.ma.min/max fails for uint and float16 dtypes #27584

fengluoqiuwu · 2024-10-17T08:42:45Z

Delete the inf check.
The documentation says masked entries should be filled with the output of minimum_fill_value, which is inf when the dtype is floating point.
Fixes #27580
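For readers following along, a minimal sketch of what the public helpers np.ma.minimum_fill_value and np.ma.maximum_fill_value return for the dtypes named in the title. The variable names are illustrative only, and this is not the code changed by the PR.

```python
import numpy as np

# Floating-point dtypes: the fill value used when taking a minimum is +inf,
# and the one used when taking a maximum is -inf.
f16 = np.zeros(3, dtype=np.float16)
f16_min_fill = np.ma.minimum_fill_value(f16)
f16_max_fill = np.ma.maximum_fill_value(f16)
print(f16_min_fill, f16_max_fill)

# Integer dtypes: the extremes of the representable range are used instead.
u8 = np.zeros(3, dtype=np.uint8)
u8_min_fill = np.ma.minimum_fill_value(u8)
u8_max_fill = np.ma.maximum_fill_value(u8)
print(u8_min_fill, u8_max_fill)
```

Filling masked entries with these values leaves the result of a min or max reduction over the unmasked data unchanged, which is why the description refers to them.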


eendebakpt · 2024-10-17T13:41:34Z

@fengluoqiuwu I am not sure this is the right fix (changing the return type of default_fill_value seems like a viable option as well). But could you:

Update the description in the first post to match the commits
Add a unit test for the case that is solved by this PR

Thanks!

fengluoqiuwu · 2024-10-17T14:47:06Z

@eendebakpt

Apologies for the confusion. I switched the fix to a different approach but forgot to update the description accordingly. I'll make sure the details match next time.

I'll add the unit test now to cover the case solved by this PR. Thanks for the suggestion!

ngoldbaum · 2024-10-17T21:23:03Z

I agree that using unsafe casting here is incorrect.

fengluoqiuwu · 2024-10-18T02:21:01Z

Unsigned int works with the patch below, but np.float16 still fails while np.float32 works as expected. It's a bit puzzling, and I'm currently working on resolving the issue.

fengluoqiuwu · 2024-10-18T03:12:51Z

Considering that masked arrays in NumPy use a mask to ignore certain values during calculations, do we really need to be concerned about the actual value stored in the array at masked positions, given that the fill_value property stores the fill value elsewhere? Specifically, does the fill value affect any operations or computations on the masked elements?

When I removed the code handling inf values, the ma tests still ran without issues. The problem arises with a dtype whose maximum is smaller than the default fill value, because the masked array cannot be filled with that default. For example, a masked array with dtype=np.float16 cannot be filled with 1.e20. So what should we actually fill the masked array with if getting rid of inf is necessary?
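A small sketch of why 1.e20 does not fit in np.float16, using np.finfo. The variable names are illustrative, and the cast is only there to show the overflow, not to suggest a fix.

```python
import numpy as np

default_float_fill = 1.e20  # the default quoted in the comment above

# Largest finite value that float16 can represent.
f16_max = float(np.finfo(np.float16).max)
print(f16_max, f16_max < default_float_fill)

# Converting the default to float16 overflows to inf; the overflow warning
# is silenced here so the snippet runs quietly.
with np.errstate(over="ignore"):
    as_f16 = np.array(default_float_fill).astype(np.float16)
print(as_f16, np.isinf(as_f16))
```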

fengluoqiuwu · 2024-10-18T03:15:56Z

I agree with you. Using unsafe casting here doesn't seem right. It might trigger a runtime warning with dtype=np.float16, whereas the original approach would likely raise an error.
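To make the casting concern concrete, here is a hedged sketch of how np.copyto behaves under its default casting="same_kind" rule versus casting="unsafe" for an out-of-range integer fill. The arrays are made up for illustration and are not the ones used in numpy/ma/core.py.

```python
import numpy as np

# int64 -> uint8 is not allowed under "same_kind", but is under "unsafe".
print(np.can_cast(np.int64, np.uint8, casting="same_kind"))
print(np.can_cast(np.int64, np.uint8, casting="unsafe"))

src = np.array([999999, 999999], dtype=np.int64)
dst = np.zeros(2, dtype=np.uint8)

# Default rule: the copy is rejected with a TypeError.
try:
    np.copyto(dst, src)
    rejected = False
except TypeError:
    rejected = True
print(rejected)

# Unsafe rule: the copy succeeds, but the value silently wraps around.
np.copyto(dst, src, casting="unsafe")
print(dst)
```

The silent wrap-around in the last step is the behaviour the reviewers object to: the error disappears, but the stored value is no longer the intended fill.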

fengluoqiuwu · 2024-10-18T08:35:53Z

As mentioned in the documentation, the masked array should be filled with the result of the specific function. Therefore, I would prefer not to include the inf check, as it could alter the fill value. I will open a new issue to discuss what value should be used when encountering inf or values outside the expected range.

numpy/ma/core.py

eendebakpt · 2024-11-05T13:16:46Z

The PR addresses the issue by using _check_fill_value at the place where the value becomes problematic. But would it perhaps be better to change the fill value?

Currently:

import numpy as np
from numpy.ma.core import default_fill_value, _check_fill_value
value = default_fill_value(np.dtype(np.uint8)) # is 999999. should it be 63?
_check_fill_value(value, np.uint8) # array(63, dtype=uint8)
_check_fill_value(None, np.uint8) # is 999999. should be array(63, dtype=uint8)?

I would expect _check_fill_value(None, dtype) and _check_fill_value(default_fill_value(dtype), dtype) to be the same, and default_fill_value(dtype) to return a fill value that is valid for the specified dtype. Maybe changing these will cause too much trouble elsewhere (I have not checked that).
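The 63 in the snippet above is simply the low byte of 999999, i.e. what remains after the value wraps around in 8 bits. A plain-Python check of that arithmetic (no NumPy involved, names illustrative):

```python
default_uint_fill = 999999

# uint8 keeps only the lowest 8 bits, so the stored value is 999999 mod 256.
wrapped = default_uint_fill % 2**8
low_byte = default_uint_fill & 0xFF
print(wrapped, low_byte, hex(default_uint_fill))
```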

fengluoqiuwu · 2024-11-05T14:20:39Z

I actually agree with this approach, as the original design causes many issues and bugs. I would prefer to take a more aggressive approach, which is to enforce that the fill value matches the dtype of the array. However, I think it would be better to address these issues along with other design problems in MaskedArray in the same version to avoid making multiple changes to its behavior. (The more aggressive approach would be to refactor the current MaskedArray implementation, as I’ve found that many bugs due to design flaws cannot be easily fixed.)
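As a rough illustration of "the fill value matches the dtype of the array", here is a sketch of a hypothetical helper, clamped_fill_value, that clamps the generic defaults into the dtype's representable range. The helper name, its parameters, and the choice of clamping (rather than wrapping to 63 as in the snippet discussed above) are all assumptions made for this example. It is not NumPy API and not what the PR implements.

```python
import numpy as np

def clamped_fill_value(dtype, int_default=999999, float_default=1.e20):
    """Hypothetical helper: return a default fill value valid for `dtype`.

    The generic default is clamped to the dtype's range and returned as a
    scalar of that dtype, so it can always be stored in the array.
    """
    dtype = np.dtype(dtype)
    if dtype.kind in "iu":
        info = np.iinfo(dtype)
        value = min(max(int_default, int(info.min)), int(info.max))
    elif dtype.kind == "f":
        info = np.finfo(dtype)
        value = min(max(float_default, float(info.min)), float(info.max))
    else:
        raise TypeError(f"unsupported dtype for this sketch: {dtype}")
    return dtype.type(value)

u8_fill = clamped_fill_value(np.uint8)
f16_fill = clamped_fill_value(np.float16)
i64_fill = clamped_fill_value(np.int64)
print(repr(u8_fill), repr(f16_fill), repr(i64_fill))
```

With this sketch, small dtypes get their largest representable value, while dtypes that can hold the generic default keep it unchanged.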

change default_filler 'u' from 999999 to np.uint64(999999)

b2c4e6a

github-actions bot added the 00 - Bug label Oct 17, 2024

undelete test code

0550e4d

fengluoqiuwu closed this Oct 17, 2024

fengluoqiuwu reopened this Oct 17, 2024

fengluo added 4 commits October 17, 2024 17:34

change docstring from np.int64(999999) to np.uint64(999999)

b213655

change docstring from np.int64(999999) to np.uint64(999999)

e0ef273

cancel change in default_filler and change casting of copyto in min and max funcs to unsafe

7956ee9

code line too long

7436253

fengluo added 6 commits October 18, 2024 12:20

add floating check before infinity check, and add test to the BUG.

ee4d636

add missing white space

cfaeac7

add test code

7dfcc3f

delete inf check

5021163

add missing whitespace

5c5da53

Add float16 test support

0c93015

fengluoqiuwu changed the title ~~BUG: Fix bug-27580~~ BUG: numpy.ma.min/max fails for uint and float16 dtypes Oct 19, 2024

fengluoqiuwu and others added 5 commits October 24, 2024 10:50

Merge branch 'numpy:main' into fix-bug-27580

8864f10

parametrize the test

75ee839

parametrize the test

1968672

Merge remote-tracking branch 'origin/fix-bug-27580' into fix-bug-27580

a5b2712

add missing whitespace

80d28ce

seberg reviewed Oct 30, 2024

numpy/ma/core.py

revert changes and add fill value check

f69efa7

Merge branch 'main' into fix-bug-27580

f60f368
