Type hinting / annotation (PEP 484) for ndarray, dtype, and ufunc #7370
Comments
I don't think anyone's thought about it. Perhaps you would like to? :-) I'm also going to suggest that if you want to followup on this that we close the gh issue and move the discussion to the mailing list, since it's better suited to open-ended design discussions. |
After getting this answer on SO, I've decided to close the issue. |
To be clear, we don't actually have any objection to supporting cool new python features or anything (rather the opposite); it's just that we're a volunteer run project without many resources, so stuff only happens if someone who's interested steps up to do it. The mailing list is usually the best place if you're trying to start working on something or hoping to recruit some other interested folks to help. |
Thanks, @njsmith. I decided to start here because of the more orderly issue-tracking, as opposed to an unstructured mailing list (I was looking for a 'feature request' tag, among other features...) Since the guy who answered me on SO got back to me with a viable solution, I decided to leave the matter. Thanks, again! |
hello guys! I was just kindly wondering if there had been any progress on this issue. Thanks. |
There is some discussion about it on the mailing list here. |
I'm reopening this issue for those who are interested in discussing it further. I think this would certainly be desirable for NumPy, but there are indeed a few tricky aspects of the NumPy API for typing to sort through, such as how NumPy currently accepts arbitrary objects in the |
Some good work is being done here: https://github.com/machinalis/mypy-data There's discussion about whether to push the work upstream to numpy or typeshed: machinalis/mypy-data#16 |
CC @mrocklin |
This really would be a great addition to NumPy. What would be the next steps to push this up to typeshed or NumPy? Even an incomplete stub would be useful and I'm happy to help with a bit of direction? |
@henryJack The best place to start would probably be tooling: figure out how we can integrate basic type annotations into the NumPy repository (and ideally test them) in a way that works with mypy and supports adding them incrementally. Then, start with extremely minimal annotations and we can go from there. In particular, I would skip dtype annotations for now since we don't have a good way to specify them (i.e., only do If it's helpful, I have an alternative version of annotations that I've written for use at Google and could open source. But we have our own unique build system and do type checking with pytype, so there would likely be quirks porting it to upstream. |
I suppose the only way to test annotations is to actually run mypy on sample code snippets and check the output? Would it be better to have the annotations integrated with the code or as separate stubs? I suppose we should also learn from dropbox and pandas that we should start with the leaves of the codebase versus core data structures? |
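The idea of running mypy on sample snippets and checking its output can be sketched roughly as follows. This is only an illustration: it assumes mypy is installed (the helper returns None otherwise), and the helper name `check_snippet` and the sample snippet are made up for this example. `mypy.api.run` is mypy's programmatic entry point and `-c` passes the program text on the command line.

```python
import importlib.util
import os
from typing import Optional, Tuple

# Hypothetical snippet containing a deliberate type error.
BAD_SNIPPET = 'x: int = "not an int"\n'


def check_snippet(source: str) -> Optional[Tuple[str, int]]:
    """Type-check a source string with mypy.

    Returns (report, exit_status), or None when mypy is not installed.
    """
    if importlib.util.find_spec("mypy") is None:
        return None
    from mypy import api  # imported lazily so the sketch loads without mypy

    # os.devnull as the cache directory asks mypy not to write a cache.
    stdout, _stderr, exit_status = api.run(
        ["--cache-dir", os.devnull, "-c", source]
    )
    return stdout, exit_status


result = check_snippet(BAD_SNIPPET)
if result is not None:
    report, status = result
    print(status)
    print(report)
```

A test suite built on this would compare the report (or just the exit status) against what each snippet is expected to produce.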
@shoyer |
Integrated with the code would be lovely, but I don't think it's feasible for NumPy yet. Even with the comment string version of type annotations, we would need to import from Also, most of the core data structures and functions (things like
Yes, I think that would be enough for external code. But how does mypy handle libraries with incomplete type annotations? If possible, we might annotate |
I'm curious, what is the type of What is the type of |
I think @kjyv has taken a stab at defining those.
|
It looks like types are parametrized, is this with their dtype? Is it also feasible to parametrize with their dimension or shape? How much sophistication does Python's typing module support? |
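Yea they are parameterized by their dtype. I'm no expert on the typing module but I think you could just have the ndarray type inherit Generic[dtype, int] to parameterize on ndim. I believe that's what Julia does. I'm not sure if you could easily parameterize on shape. Nor am I sure of what benefits that would bring or why it wasn't done that way in the first place. |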
Can one use numpy dtypes in the dtype parameter or can this only be typing module types?
Also it's odd that numpy.empty returns an array of type Any. I suspect it's challenging to infer and take the type from the dtype= keyword value?
…On Sep 1, 2017 6:42 PM, "Jacques Kvam" ***@***.***> wrote:
Yea they are parameterized by their dtype. I'm no expert on the typing
module but I think you could just have the ndarray type inherit Generic[dtype,
int] to parameterize on ndim. I believe that's what Julia does. I'm not
sure if you could easily parameterize on shape. Nor am I sure of what
benefits that would bring or why it wasn't done that way in the first place.
|
You can use numpy dtypes, we just need to define them. That was done here with I'm not sure, I don't think it's possible. I don't think you can modify the output type based on an argument's value. I think the best we can do is overload the function with all the type specializations we would care about. https://docs.python.org/3/library/typing.html#typing.overload |
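A rough sketch of the overload approach mentioned above, with one overload per type specialization. To keep it runnable without NumPy, `Float64`, `Int64`, and `Array` are hypothetical stand-ins for NumPy's scalar types and `ndarray`; the signatures are an illustration, not NumPy's actual stubs.

```python
from typing import Any, Generic, Tuple, Type, TypeVar, overload


class Float64: ...  # stand-in for np.float64
class Int64: ...    # stand-in for np.int64


D = TypeVar("D")


class Array(Generic[D]):
    """Stand-in for ndarray, parameterized by its scalar type."""

    def __init__(self, shape: Tuple[int, ...], scalar_type: type) -> None:
        self.shape = shape
        self.scalar_type = scalar_type


# One overload per specialization the checker should know about.
@overload
def empty(shape: Tuple[int, ...]) -> Array[Float64]: ...
@overload
def empty(shape: Tuple[int, ...], dtype: Type[Float64]) -> Array[Float64]: ...
@overload
def empty(shape: Tuple[int, ...], dtype: Type[Int64]) -> Array[Int64]: ...
def empty(shape: Tuple[int, ...], dtype: type = Float64) -> Array[Any]:
    # The single runtime implementation; overloads only affect type checking.
    return Array(shape, dtype)


a = empty((2, 3))         # checker sees Array[Float64]
b = empty((4,), Int64)    # checker sees Array[Int64]
```

The first overload encodes the default dtype, so a call without `dtype` still gets a precise return type rather than `Any`.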
Another option might be to introduce some strict-typed aliases, so There's already some precedent for this with the unusual |
@jwkvam OK, so maybe I think
If I understand this correctly, this would imply It would also be awesome to be able to type check shape and dimensionality information, but that I don't think current type checkers are up to the task. We would also need to express signatures involving dimensions. For example, I suppose one approach that would work is to define classes for each non-negative integer, e.g., In TensorFlow, |
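The "one class per non-negative integer" idea could look roughly like the sketch below. All names here (`Dim1`, `Dim2`, `Array`, `matvec`) are hypothetical stand-ins chosen for illustration, and nothing is enforced at runtime; only a static checker would use the second type parameter.

```python
from typing import Generic, TypeVar


class Float64: ...  # stand-in for a NumPy scalar type


# One marker class per number of dimensions.
class Dim0: ...
class Dim1: ...
class Dim2: ...


D = TypeVar("D")  # scalar type
N = TypeVar("N")  # number of dimensions, encoded as a marker class


class Array(Generic[D, N]):
    """Stand-in for ndarray, parameterized by dtype and ndim."""

    def __init__(self, ndim: int) -> None:
        self.ndim = ndim


def matvec(m: Array[D, Dim2], v: Array[D, Dim1]) -> Array[D, Dim1]:
    # A signature involving dimensions: 2-D times 1-D gives 1-D.
    return Array(1)


m: Array[Float64, Dim2] = Array(2)
v: Array[Float64, Dim1] = Array(1)
out = matvec(m, v)  # a checker would reject matvec(v, m)
```

This covers dimensionality only; it does not express the sizes of individual axes.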
@shoyer I see, yea that's disappointing. I was able to hack the following
But I don't think that's leading anywhere... It seems like your example works! It seemed like it worked better without the type constraints. I could test dtypes like
and code
I get
I'm a little confused because if I swap the types:
I get no error, even though I don't think anything is marked as covariant. Even though the type system isn't as sophisticated as we'd like, do you think it's still worth it? One benefit I would appreciate is better code completion through jedi. |
I believe the issue here is that I think this could be avoided if we insist on NumPy scalar types instead of generic Python types for annotations, e.g., This is actually a little easier than I thought because
I'm not quite sure what you were getting at here? |
I just tried to encode the default value of dtype in the stub. They did that in the mypy-data repo.
from https://github.com/kjyv/mypy-data/blob/master/numpy-mypy/numpy/__init__.pyi#L523 Following your example, I wasn't able to get mypy to work with a default argument for dtype. I tried |
I think
# totally untested!
from typing import Generic, Tuple, Type, TypeVar, Union
import numpy as np

D = TypeVar('D', bound=np.generic)

class dtype(Generic[D]):
    @property
    def type(self) -> Type[D]: ...

class ndarray(Generic[D]):
    @property
    def dtype(self) -> dtype[D]: ...

DtypeLike = Union[dtype[D], D]  # both are coercible to a dtype
ShapeLike = Tuple[int, ...]

def empty(shape: ShapeLike, dtype: DtypeLike[D] = np.float64) -> ndarray[D]: ... |
Hey community, |
Funny you should mention Lean, I've been working with it solidly for the last few months. While interesting from its own right, my impression is that the heavy dependent-typing used by lean would be a significant challenge for mypy to adopt, and arguably not a worthwhile one - at a certain point, these things are better as language features. For the case of numpy, there are plenty of weaker type systems which are good enough role models. |
Have we looked at using PEP 593 to improve numpy typing? For instance we could use |
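For context, PEP 593's `Annotated` attaches arbitrary metadata to a type, which static checkers ignore and runtime tools can read back. A minimal sketch, assuming Python 3.9+; the `Shape` metadata class and the `Matrix3x3` alias are invented for this example and are not part of NumPy or of the proposal above.

```python
from dataclasses import dataclass
from typing import Annotated, List, Tuple, get_type_hints


@dataclass(frozen=True)
class Shape:
    """Hypothetical metadata object describing an expected shape."""
    dims: Tuple[int, ...]


# Static checkers treat this as List[List[float]]; the Shape is extra metadata.
Matrix3x3 = Annotated[List[List[float]], Shape((3, 3))]


def trace(m: Matrix3x3) -> float:
    return sum(m[i][i] for i in range(3))


# A runtime checker could recover the metadata like this.
hints = get_type_hints(trace, include_extras=True)
metadata = hints["m"].__metadata__
```

Anything that acts on the `Shape` metadata would have to be a separate tool, since existing type checkers do not interpret it.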
For anyone still following along here: one blocker for this has been variadic generics, which we're trying to make some progress on with PEP 646, which is currently in review by the Python steering council. Some other links that might be of interest are:
This is definitely one option, and it'd be pretty cool to see a runtime checker which employed these kinds of annotations. The reason I'm personally gunning for the approach suggested in PEP 646 is that it would allow existing tooling, such as static type checkers, to verify the kinds of things we care about, with (relatively) little extra effort. (OK, we'd have to implement support for 646 in e.g. Mypy, but that's probably simpler than writing a static analysis tool from scratch.) |
Another quick update: Pradeep Kumar Srinivasan and I will be giving a talk on the approach we've been experimenting with over the past 6 months at the PyCon 2021 Typing Summit next week, Catching Tensor Shape Errors Using the Type Checker: https://us.pycon.org/2021/summits/typing/ We'll be discussing how it works, what it looks like in practice, and a few of its current limitations. Hope to see you there! |
Hi all, Some time ago, back in #17719, dtype support was introduced for
>>> from typing import Any
>>> import numpy as np
>>> import numpy.typing as npt
>>> print(npt.NDArray)
numpy.ndarray[typing.Any, numpy.dtype[~ScalarType]]
>>> print(npt.NDArray[np.float64])
numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]]
>>> NDArrayInt = npt.NDArray[np.int_]
>>> a: NDArrayInt = np.arange(10)
>>> def func(a: npt.ArrayLike) -> npt.NDArray[Any]:
...     return np.array(a) |
@BvB93 This is awesome. I've been using your change since numpy 1.21 came out. Are you planning on adding runtime subscripting like |
@NeilGirdhar excellent idea. As a workaround in the meantime, you could try using |
Is it possible to annotate structured arrays and if so how? I tried |
I believe that as of Numpy 1.21 it is not yet possible. |
Unfortunately not, and I very much doubt that list-of-tuples-syntax will ever be something that mypy will understand (not without some serious plugin magic, at least). As for structured arrays in general, there are two main challenges here:
|
@NeilGirdhar there is currently a PR up for making The hope is to wrap things up before the next 1.22 release. |
@NeilGirdhar The workaround I've been using in that case is to put quotes around the subscripted numpy type:
|
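The snippet that accompanied the comment above did not survive, so the following is only a sketch of what quoting a subscripted NumPy type can look like, not the commenter's original code. The function name `first` is hypothetical, and NumPy is imported under `TYPE_CHECKING` so that the example runs even where NumPy is not installed.

```python
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Only needed by the type checker; never imported at runtime.
    import numpy as np


def first(a: "np.ndarray[Any, np.dtype[np.float64]]") -> float:
    # The quoted annotation is stored as a plain string and never evaluated,
    # so np.ndarray is not subscripted at runtime.
    return float(a[0])
```

The type checker still parses the string and checks calls against it, while the interpreter never evaluates the subscript.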
Sorry for the random reply, but is this supported now that we are beyond 1.22? If so, the numpy typing documents have not made it clear, in either the dev or the stable release. |
Is there an update on this? This issue is referenced on Stack Overflow about |
Almost all of NumPy has type annotations that are compatible with PEP 484: |
For those who were following this issue and want to know when more type hint features will be available, here are some related issues: |
Feature request: Organic support for PEP 484 with Numpy data structures.
original SO question