MAINT: Remove unsafe unions and ABCs from return-annotations #18885

BvB93 · 2021-05-02T12:00:20Z

Per the title, this PR removes unsafe unions and abstract base classes from the return annotations,
e.g. from functions that currently follow one of the patterns below.

from __future__ import annotations
from typing import Any, Sequence, Any as A, Any as B

def func1(*args: Any, **kwargs: Any) -> A | B:
    pass

def func2(*args: Any, **kwargs: Any) -> Sequence[Any]:
    pass

The Problem

The problem with returning a Union (or, almost equivalently, an abstract-ish base class such as generic)
is that any and all operations performed on a union must be compatible with all of its members.
For example, operations that are exclusive to either np.float64 or np.ndarray are not allowed
on np.float64 | np.ndarray, unless the union is first narrowed down via an explicit isinstance
check.

While returning a union thus adds a form of simplicity to the annotations (as we don't have to
distinguish between 0D and ND array-likes), the Union type is simply not suited for what we're trying
to describe here (xref python/mypy#1693). This would be a different story
for a hypothetical UnsafeUnion type, one where operations only have to be compatible with any one member
of the union. Such a type does not exist, though.

from __future__ import annotations
from typing import Any, TYPE_CHECKING
import numpy as np

array: np.ndarray[Any, np.dtype[np.float64]]
out = np.isneginf(array)

if TYPE_CHECKING:
    # note: Revealed type is 'Union[numpy.bool_, numpy.ndarray[Any, numpy.dtype[numpy.bool_]]]'
    reveal_type(out)

for i in out:
    # error: Item "bool_" of "Union[bool_, ndarray[Any, dtype[bool_]]]" has no attribute "__iter__" (not iterable)
    print(i)

# Only after narrowing the union with isinstance does the loop type-check
if isinstance(out, np.ndarray):
    for i in out:
        print(i)
else:
    print(out)

The Solution

The solutions implemented herein fall into one of the following two categories:

1. Simply set the return type to Any. This is a simple but non-ideal solution, as it removes all type
   safety for objects returned by the affected functions. Nevertheless, there is a group of functions
   that still needs to be updated for dtype support anyway (e.g. those in np.core.fromnumeric),
   so setting their return type to Any is by no means the worst thing that can happen.
   A second group of functions is those where the use of Any is simply a necessity, as the output type
   is, for example, determined by the value of string literals (see np.einsum). As a silver lining: the
   einsum problem in particular does seem like it can be resolved with relative ease via a future mypy plugin.
2. Add an additional overload for 0D array-likes. This is the more thorough and permanent fix
   for the unsafe-union issue; it has been applied to the more recently annotated modules such as
   np.lib.ufunclike.
   There is, however, one important caveat here: as we currently lack shape support (Typing support for
   shapes #16544), it is impossible to distinguish between 0D and ND ndarrays, and thus we are unable to
   describe the 0D-to-scalar casting that numpy aggressively performs on 0D arrays. While this should change
   once PEP 646 is live, in the meantime users will have to settle for a typing.cast call or a # type: ignore
   comment if it is known in advance that a 0D-to-scalar cast will be performed.
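
As a rough illustration of the overload approach, here is a minimal sketch that does not use the actual numpy stubs. The function is_negative is hypothetical; float stands in for a 0D array-like and List[float] for an ND array-like. The point is only that each input kind gets its own precise return type instead of a union.

```python
from __future__ import annotations

from typing import List, Union, overload

# Hypothetical stand-ins (not the numpy stubs): `float` plays the role of a
# 0D array-like and `List[float]` the role of an ND array-like.


@overload
def is_negative(x: float) -> bool: ...
@overload
def is_negative(x: List[float]) -> List[bool]: ...
def is_negative(x: Union[float, List[float]]) -> Union[bool, List[bool]]:
    """Return whether `x` (or each element of `x`) is negative."""
    if isinstance(x, list):
        return [value < 0 for value in x]
    return x < 0


scalar_out = is_negative(-1.0)         # a type checker infers `bool`
vector_out = is_negative([-1.0, 2.0])  # a type checker infers `List[bool]`

# No isinstance narrowing is needed here: the matching overload already
# tells the checker that `vector_out` is iterable.
for item in vector_out:
    print(item)
```

The union only appears in the implementation signature, which callers never see; a type checker resolves each call against the overloads.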

BvB93 · 2021-05-02T12:02:19Z

numpy/typing/__init__.py

+0D arrays
+~~~~~~~~~
+
+During runtime numpy aggressively casts any passed 0D arrays into their
+corresponding `~numpy.generic` instance. Until the introduction of shape
+typing (see :pep:`646`) it is unfortunately not possible to make the
+necessary distinction between 0D and >0D arrays. While thus not strictly
+correct, all operations are that can potentially perform a 0D-array -> scalar
+cast are currently annotated as exclusively returning an `ndarray`.
+
+If it is known in advance that an operation _will_ perform a
+0D-array -> scalar cast, then one can consider manually remedying the
+situation with either `typing.cast` or a ``# type: ignore`` comment.
+


TLDR: The new return annotations will now just be somewhat inconvenient for 0D arrays, rather than 0D and ND arrays.
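
The manual remedy mentioned in the docs segment can be sketched as follows. The helper double is hypothetical and only mimics an annotation that promises an ndarray; the sketch assumes the usual runtime behaviour that arithmetic on a 0D array yields a numpy scalar.

```python
from typing import cast

import numpy as np


def double(array: np.ndarray) -> np.ndarray:
    """Hypothetical helper, annotated as exclusively returning an ndarray."""
    # For a 0D input numpy hands back a scalar at runtime, not an ndarray.
    return array * 2


zero_d = np.array(1.5)   # a 0D array
result = double(zero_d)  # statically an ndarray, at runtime a np.float64

# typing.cast does nothing at runtime; it only tells the type checker
# which type we know the value to have.
scalar = cast(np.float64, result)
print(type(scalar).__name__)
```

A # type: ignore comment on the offending line would silence the checker as well, but cast keeps the intended type visible to later code.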

Also a second question: are there any parts of the numpy documentation that describe 0D arrays and/or numpy's aggressive 0D-to-scalar casting? If so, then it might be useful to place a link here.

The casts are scattered through the C code as PyArray_Return. The policy, such as it is, is documented as

.. c:function:: PyObject* PyArray_Return(PyArrayObject* arr)

    This function steals a reference to *arr*.

    This function checks to see if *arr* is a 0-dimensional array and, if so, returns the appropriate array scalar. It should be used whenever 0-dimensional arrays could be returned to Python.

BvB93 · 2021-05-02T12:05:36Z

A question: could we exclude the numpy/typing/tests/data directory from the lint tests?
For those test data we're reliant on single-line comments that are frequently longer than the prescribed line-length limit.

Edit: Done as of 15420c8 and 6fa34d4.

With the current test system we cannot reasonably enforce E501 (maximum line length).

E704 (multiple statements on one line (def)) is a style rule not prescribed by PEP8. Furthermore, because it demands a function body it is needlessly inconvenient for static type checking, i.e. situations where there is no function body.

BvB93 · 2021-05-02T19:29:17Z

Any idea why pycodestyle is still checking numpy/typing/tests/data even though it has just been added to exclude?

charris · 2021-05-04T17:43:43Z

> Any idea why pycodestyle is still checking numpy/typing/tests/data

Maybe it is picking up lint_diff.ini before the PR? Or maybe it doesn't work :) We won't know if it is working for numpy/__config__.py until we try to change it.

charris · 2021-05-04T17:44:17Z

Thanks Bas. The long lines are no worse than before.

BvB93 · 2021-05-04T17:51:31Z

> Maybe it is picking up lint_diff.ini before the PR? Or maybe it doesn't work :) We won't know if it is working for numpy/__config__.py until we try to change it.

Running pycodestyle locally with the config file does seem to work, so I suspect (and hope) it's the former.

Bas van Beek added 10 commits April 30, 2021 22:09

MAINT: Remove unsafe unions from np.core.fromnumeric

9d02316

MAINT: Remove unsafe unions from np.core.function_base

b1eaa40

MAINT: Remove unsafe unions from np.core.numeric

6698265

MAINT: Remove unsafe unions from np.lib.index_tricks

44c3e1f

MAINT: Remove unsafe unions from np.lib.ufunclike

5fad839

MAINT: Remove unsafe unions from np.typing._callable

a63315a

MAINT: Remove unsafe unions from np

235e4f3

MAINT: Remove unsafe unions from np.core.einsumfunc

3888fa8

MAINT: Remove the np.typing._ArrayOrScalar type-alias

a90cbc7

DOC: Add a segment to the numpy.typing docs about 0D arrays

0a045bb

BvB93 added the 03 - Maintenance, 09 - Backport-Candidate and 41 - Static typing labels May 2, 2021


Bas van Beek added 2 commits May 2, 2021 21:10

TST: Ignore lint-checking in the numpy/typing/tests/data directory

15420c8

With the current test system we cannot reasonably enforce E501 (maximum line length).

TST: Ignore the E704 pycodestyle error code

6fa34d4

E704 (multiple statements on one line (def)) is a style rule not prescribed by PEP8. Furthermore, because it demands a function body it is needlessly inconvenient for static type checking, i.e. situations where there is no function body.

BvB93 mentioned this pull request May 2, 2021

np.clip typing is not as specific as ideal; existing code doesn't type-check #18305

Closed

charris merged commit 4d753a0 into numpy:main May 4, 2021

BvB93 deleted the unsafe branch May 4, 2021 17:52

BvB93 mentioned this pull request May 5, 2021

MAINT: Remove unsafe unions and ABCs from return-annotations #18915

Merged

charris removed the 09 - Backport-Candidate label May 5, 2021
