TYP: np.argmin and np.argmax overload changes #28906

Draft
wants to merge 1 commit into
base: main
from

Contributor

@lvllvl lvllvl commented May 5, 2025

Attempts to close #28641

@jorenham jorenham self-requested a review May 5, 2025 18:28

Member

@jorenham jorenham left a comment

Could you explain why you chose to remove some of the overloads, and why you modified the parameter types and return types of the other ones?

Member

The NumPy type-tests are located at numpy/typing/tests/data/reveal (acceptance tests / true negatives) and numpy/typing/tests/data/fail (rejection tests / true positives).
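
For orientation, an acceptance-style check could look roughly like the sketch below. It assumes such tests pin the inferred type with `assert_type`; the array name `AR_f8` and the specific assertion are illustrative and not copied from the NumPy test suite. The rejection case is shown only as a comment, and it assumes the stricter `out` bound discussed in this PR.

```python
import numpy as np
import numpy.typing as npt

try:
    from typing import assert_type  # available from Python 3.11
except ImportError:
    # Fallback for older interpreters: at runtime assert_type just returns its value.
    def assert_type(val, typ):
        return val

# Illustrative input array (hypothetical name).
AR_f8: npt.NDArray[np.float64] = np.array([[3.0, 1.0], [0.5, 2.0]])

# Acceptance case: with the default axis=None and keepdims=False, a type
# checker is expected to infer an integer scalar (np.intp).
flat_index = assert_type(np.argmin(AR_f8), np.intp)

# Rejection case (a "fail"-style test; it is meant to be flagged by the type
# checker, so it is not executed here):
#     np.argmin(AR_f8, axis=0, out=np.empty(2, dtype=np.float64))
```

At runtime `assert_type` does nothing; the check only has an effect when a static type checker processes the file.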

Contributor Author

OK, I will add tests in the right location.

I am going to reinsert the overloads I removed; I removed them initially out of ignorance. But I did want to remove this one:

 @overload
 def argmin(
     a: ArrayLike,
     axis: SupportsIndex | None = ...,
     out: None = ...,
     *,
     keepdims: bool = ...,
 ) -> Any: ...

Wouldn't a return type of Any also allow something like float64? Since we only want to return integers, shouldn't this overload be removed?

Thanks for the review, I'll keep working on this.

Member

No, that one should stay. It handles the case where it cannot be determined statically whether it should return a scalar or an array.
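
A small runtime illustration of why that fallback is needed. The array values and the helper `argmin_along` are hypothetical; the behaviour shown relies only on the documented `axis` and `keepdims` parameters of `np.argmin`.

```python
from typing import Optional

import numpy as np

a = np.array([[3, 1, 4], [1, 5, 0]])

# axis=None (the default): a single index into the flattened array.
scalar_result = np.argmin(a)

# Integer axis: one index per row, returned as an array.
array_result = np.argmin(a, axis=1)

# axis=None but keepdims=True: an array with the reduced axes kept as size 1.
kept_result = np.argmin(a, keepdims=True)


def argmin_along(values: np.ndarray, axis: Optional[int]):
    # `axis` is only known at runtime here, so a static checker cannot tell
    # whether this call yields a scalar or an array.
    return np.argmin(values, axis=axis)


print(type(scalar_result).__name__, array_result.shape, kept_result.shape)
```

A call such as `argmin_along(a, None)` gives a scalar while `argmin_along(a, 0)` gives an array, which is the situation the `-> Any` overload covers.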

Comment on lines +114 to +115
_IndexArray = NDArray[np.signedinteger] | NDArray[np.unsignedinteger] | NDArray[np.bool_] # type alias for argmin / argmax
_OutT = TypeVar("_OutT", bound=_IndexArray) # Type variable, must be assignable to _IndexArray
Member

Since _IndexArray is only used once, it would be a bit cleaner to inline it. It's also fine if you want to keep it, but in that case it should then be annotated as TypeAlias.

Also, _IndexArray is currently too restrictive, because it would reject valid types like NDArray[bool_ | int_], NDArray[int8 | uint8], and NDArray[integer].

Additionally, it could help to have the name of the type parameter reflect the restriction on its upper bound. The name shows up in case of type-checker errors, e.g. when someone passes NDArray[np.float64]. And if the user then sees _OutT, it won't be very informative.

So given that, I'd personally probably go for something like:

Suggested change
- _IndexArray = NDArray[np.signedinteger] | NDArray[np.unsignedinteger] | NDArray[np.bool_] # type alias for argmin / argmax
- _OutT = TypeVar("_OutT", bound=_IndexArray) # Type variable, must be assignable to _IndexArray
+ _BoolOrIntArrayT = TypeVar("_BoolOrIntArrayT", bound=NDArray[np.integer | np.bool])

But there are probably many other good options too 🤷🏻


github-actions bot commented May 7, 2025

Diff from mypy_primer, showing the effect of this PR on type check results on a corpus of open source code:

xarray (https://github.com/pydata/xarray)
+ xarray/core/resample_cftime.py: note: In member "first_items" of class "CFTimeGrouper":
+ xarray/core/resample_cftime.py:155: error: Need type annotation for "first_items"  [var-annotated]

dedupe (https://github.com/dedupeio/dedupe)
+ dedupe/clustering.py:89: error: No overload variant of "max" matches argument types "ndarray[tuple[int, ...], dtype[signedinteger[_64Bit]]]", "int"  [call-overload]
+ dedupe/clustering.py:89: note: Possible overload variants:
+ dedupe/clustering.py:89: note:     def [SupportsRichComparisonT: SupportsDunderLT[Any] | SupportsDunderGT[Any]] max(SupportsRichComparisonT, SupportsRichComparisonT, /, *_args: SupportsRichComparisonT, key: None = ...) -> SupportsRichComparisonT
+ dedupe/clustering.py:89: note:     def [_T] max(_T, _T, /, *_args: _T, key: Callable[[_T], SupportsDunderLT[Any] | SupportsDunderGT[Any]]) -> _T
+ dedupe/clustering.py:89: note:     def [SupportsRichComparisonT: SupportsDunderLT[Any] | SupportsDunderGT[Any]] max(Iterable[SupportsRichComparisonT], /, *, key: None = ...) -> SupportsRichComparisonT
+ dedupe/clustering.py:89: note:     def [_T] max(Iterable[_T], /, *, key: Callable[[_T], SupportsDunderLT[Any] | SupportsDunderGT[Any]]) -> _T
+ dedupe/clustering.py:89: note:     def [SupportsRichComparisonT: SupportsDunderLT[Any] | SupportsDunderGT[Any], _T] max(Iterable[SupportsRichComparisonT], /, *, key: None = ..., default: _T) -> SupportsRichComparisonT | _T
+ dedupe/clustering.py:89: note:     def [_T1, _T2] max(Iterable[_T1], /, *, key: Callable[[_T1], SupportsDunderLT[Any] | SupportsDunderGT[Any]], default: _T2) -> _T1 | _T2

optuna (https://github.com/optuna/optuna)
+ optuna/samplers/_nsgaiii/_elite_population_selection_strategy.py:213: error: Incompatible types in assignment (expression has type "integer[Any]", variable has type "ndarray[Any, Any]")  [assignment]
+ optuna/importance/_ped_anova/scott_parzen_estimator.py:71: error: No overload variant of "min" matches argument types "int", "ndarray[tuple[int, ...], dtype[signedinteger[_32Bit | _64Bit]]]"  [call-overload]
+ optuna/importance/_ped_anova/scott_parzen_estimator.py:71: note: Possible overload variants:
+ optuna/importance/_ped_anova/scott_parzen_estimator.py:71: note:     def [SupportsRichComparisonT: SupportsDunderLT[Any] | SupportsDunderGT[Any]] min(SupportsRichComparisonT, SupportsRichComparisonT, /, *_args: SupportsRichComparisonT, key: None = ...) -> SupportsRichComparisonT
+ optuna/importance/_ped_anova/scott_parzen_estimator.py:71: note:     def [_T] min(_T, _T, /, *_args: _T, key: Callable[[_T], SupportsDunderLT[Any] | SupportsDunderGT[Any]]) -> _T
+ optuna/importance/_ped_anova/scott_parzen_estimator.py:71: note:     def [SupportsRichComparisonT: SupportsDunderLT[Any] | SupportsDunderGT[Any]] min(Iterable[SupportsRichComparisonT], /, *, key: None = ...) -> SupportsRichComparisonT
+ optuna/importance/_ped_anova/scott_parzen_estimator.py:71: note:     def [_T] min(Iterable[_T], /, *, key: Callable[[_T], SupportsDunderLT[Any] | SupportsDunderGT[Any]]) -> _T
+ optuna/importance/_ped_anova/scott_parzen_estimator.py:71: note:     def [SupportsRichComparisonT: SupportsDunderLT[Any] | SupportsDunderGT[Any], _T] min(Iterable[SupportsRichComparisonT], /, *, key: None = ..., default: _T) -> SupportsRichComparisonT | _T
+ optuna/importance/_ped_anova/scott_parzen_estimator.py:71: note:     def [_T1, _T2] min(Iterable[_T1], /, *, key: Callable[[_T1], SupportsDunderLT[Any] | SupportsDunderGT[Any]], default: _T2) -> _T1 | _T2

static-frame (https://github.com/static-frame/static-frame)
+ static_frame/core/index.py:1371: error: Unused "type: ignore" comment  [unused-ignore]

spark (https://github.com/apache/spark)
+ python/pyspark/ml/linalg/__init__.py:844: error: Incompatible return value type (got "ndarray[tuple[int, ...], dtype[float64]]", expected "float64")  [return-value]
+ python/pyspark/mllib/linalg/__init__.py:964: error: Incompatible return value type (got "ndarray[tuple[int, ...], dtype[float64]]", expected "float64")  [return-value]

Successfully merging this pull request may close these issues.

TYP: np.argmin overloads could be more precise