Description
While reading the code and addressing some bugs related to numpy/ma, I encountered a few questions:
- Filling Values in Masked Arrays:
  Do we actually care about the exact fill value in masked arrays, given that the masked entries are replaced by other values anyway? If not, I believe I can resolve bug 27580 by simply removing the check for inf.
- Default Fill Value for Masked Arrays:
  Currently, the default fill value for masked arrays is defined by default_filler in terms of Python data types. However, Python doesn't have unsigned integer types, so for np.uint arrays the default fill value is stored as np.int64(999999). This causes issues in operations like copyto(..., casting='same_kind'), as seen in bug 27580 and bug 27269. Should we use NumPy data types for the default fill value, so that the fill value always matches the data type of the array (e.g., a signed or unsigned integer fill value as appropriate)?
- Large Default Fill Values:
  Some default fill values are too large for the data type they are used with, such as 999999 for np.int8 and 1.e20 for np.float16. What would be an appropriate default fill value for masked arrays, particularly for small data types like int8 and float16? (bug 25677)
- Reviewing copyto in Masked Arrays:
  Should we perform a comprehensive review of the copyto functionality for masked arrays? It seems likely that similar bugs exist due to the same root cause.
- Testing for Small Data Types:
  Should we extend the test suite to include small data types (e.g., int8 and float16) to ensure that functions handle these cases correctly?
- Checking Method Consistency:
  Should we check that methods behave consistently between a masked array with nothing masked and a plain ndarray? Their methods and behaviors currently differ in places; for an example, see bug 27258.
- Making Standards Clear:
  The expected behavior of some methods is not clear. For example, should invalid results be masked automatically? Some functions (such as sqrt and std) do this, but others (such as median and mean) do not. Worse, the documentation is inconsistent: some functions do not mention auto-masking but change the mask anyway (sqrt, std), while others mention it but do not change the mask (mean).
  Worse still, some important methods have no clear explanation in either the documentation or the docstrings. For example, __array_wrap__ is invoked by most ufunc calls, and I think it might be the cause of bug 25635.
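A minimal sketch of the auto-masking inconsistency described in the last item, assuming a recent NumPy release. The input arrays are made up for illustration, and the behavior may differ between versions.

```python
import numpy as np

# sqrt of a negative number is invalid; np.ma.sqrt masks that element
# instead of returning NaN.
values = np.ma.array([4.0, -1.0])
roots = np.ma.sqrt(values)
print(roots)
print(roots.mask)

# mean over unmasked data that contains NaN propagates the NaN;
# the result is a plain scalar and is not masked.
with_nan = np.ma.array([1.0, np.nan])
avg = with_nan.mean()
print(avg, np.ma.is_masked(avg))
```

Both calls produce an invalid value internally, but only sqrt reflects that in the mask.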
Since I'm not sure where to place these questions, I’ve marked this as a discussion for now.
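To make the fill-value questions concrete, here is a rough sketch of one possible direction, not a proposed patch. `dtype_matched_fill_value` is a hypothetical helper (it does not exist in NumPy) that casts the default to the array's dtype and clamps it to the representable range; the 999999 and 1e20 defaults are the ones quoted above. `report_current_defaults` only prints what the installed NumPy returns today, via `np.ma.default_fill_value`.

```python
import numpy as np


def dtype_matched_fill_value(dtype, int_default=999999, float_default=1e20):
    """Hypothetical helper: a default fill value that has the array's own
    dtype and is clamped so it fits into that dtype."""
    dtype = np.dtype(dtype)
    if dtype.kind in "iu":
        # Signed and unsigned integers: clamp to the largest representable value.
        return dtype.type(min(int_default, np.iinfo(dtype).max))
    if dtype.kind == "f":
        # Floats: clamp to the largest finite value of the type.
        return dtype.type(min(float_default, float(np.finfo(dtype).max)))
    raise TypeError(f"this sketch does not handle dtype {dtype}")


def report_current_defaults(names=("int8", "uint8", "float16", "int64")):
    """Print the default NumPy currently uses next to the sketched alternative."""
    for name in names:
        current = np.ma.default_fill_value(np.dtype(name))
        proposed = dtype_matched_fill_value(name)
        print(name, "current:", repr(current), "sketch:", repr(proposed))


if __name__ == "__main__":
    report_current_defaults()
```

Clamping to the maximum of the type is only one option; whether a sentinel at the edge of the range is desirable is exactly the open question.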