ENH: Getting NEP 50 behavior in the array API compat library #22341
A context manager, and more importantly the corresponding contextvar, would be the only way that I can see right now. I always assumed we would go with point 4, though, because it seems the easier/more useful option for the most likely users. Currently, I assume any adopting library will already be used a lot with NumPy input, so enabling the flag would modify the behavior of the library. It seems easier to me from a library perspective if the behavior change happens when updating to NumPy 2.0, rather than earlier. Newly written libraries might be annoyed by this, I admit. In that case, it might be best to require the library to use the context manager (or some option) to actually get the new behavior.
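The contextvar idea mentioned above could look roughly like the following minimal sketch. All names here are illustrative and hypothetical, not NumPy's actual opt-in API:

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical promotion-state flag; "legacy" stands for value-based
# promotion, "weak" for NEP 50 style promotion.
_promotion_state = ContextVar("promotion_state", default="legacy")

@contextmanager
def promotion_state(state):
    """Temporarily switch the promotion state for the current context."""
    token = _promotion_state.set(state)
    try:
        yield
    finally:
        _promotion_state.reset(token)

def current_promotion_state():
    """What a ufunc implementation would consult before promoting."""
    return _promotion_state.get()
```

Because a `ContextVar` is scoped per context (and per task/thread), this avoids the process-global problem of a module-level flag; the open question in this thread is whether libraries would accept wrapping every operator call in such a context.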
The point is that libraries that write against the array API will effectively get the new behavior whenever they run against an array library that doesn't have the NumPy legacy behavior. So they will have to deal with those promotion rules anyway if they want things to be consistent across array libraries.
Right, but is that actually something they want or need? If you really wanted that fine-grained consistency, you could not use plain NumPy arrays; you would have to always wrap them consistently (or flag them). That is of course a pattern that can be done and has been mentioned many times: wrap the arrays and use a (less minimal) wrapper. That pattern still seems impractical to me, but I still don't want to discourage you from exploring it if that is what you consider best. In this case I still think that trying to get promotion right here is making the perfect the enemy of the good: to me it seems like a huge amount of hassle for something that practically nobody will even notice.
The only true solution would be wrapping (for a library; end users you might just tell to upgrade NumPy and opt in to NEP 50). Wrapping would require a clear pattern that we have not had for 1.5+ years, and it may even require formalizing one. Call these deficiencies NumPy bugs if you like, and hope that downstream pushes for fixing them and adopting NEP 50 (e.g. this question). I would much prefer to spend our limited brain cycles on pushing for NEP 50 rather than inventing a compatibility layer that I don't think anyone wants to keep in the mid-term. If we cannot do without that compatibility layer in the long term (say, because NumPy returns scalars and we don't know when we might change that), then that is a different matter. In that case, I could try to look into "marking" arrays, and you would need to look into how that compatibility layer should look (i.e. removing the need for you to fully wrap the NumPy array).
@thomasjpfan were type promotion differences a concern with your scikit-learn array API work? I'm happy to leave this as is for now, until it comes up for a real world use case. |
For using the Array API, I did not hit any issues with value-based casting. Most of my typing-related changes had to do with the strictness of casting. For example, one cannot divide an integer array by an integer with the Array API:

```python
import numpy.array_api as xp

X = xp.asarray([1, 2, 3])
Y = X / 3  # errors: the strict API does not allow integer dtypes in __truediv__
```

@Micky774 has a PR that experimented with "turning on NEP 50", and the changes to scikit-learn were small: scikit-learn/scikit-learn#23644
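For contrast, the value-based casting that NEP 50 removes can be seen with plain NumPy. This small demonstration is version-dependent by design, since NumPy 2.0 made NEP 50 the default; the divergent branch is guarded so it runs under either rule set:

```python
import numpy as np

a = np.arange(3, dtype=np.uint8)

# A Python int that fits uint8 promotes the same way under both the
# legacy (value-based) rules and NEP 50: the array dtype wins.
assert (a + 1).dtype == np.dtype("uint8")

# A Python int that does NOT fit uint8 is where the rules diverge:
# legacy promotion inspects the value and widens the result to uint16,
# while NEP 50 keeps the array dtype and raises OverflowError instead.
try:
    outcome = str((a + 1000).dtype)  # "uint16" under legacy rules
except OverflowError:
    outcome = "OverflowError (NEP 50)"
print(outcome)
```

The key point for the compat library is that the result dtype of `a + 1000` depends on the scalar's *value* under the legacy rules, which is exactly what the array API spec forbids.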
So it sounds like many libraries will already be implementing NEP 50 support concurrently with array API support, so it may not be necessary to worry about it so much in the compat library.
The compat library won't be strict about this, but that's why it's still useful to test against the minimal numpy.array_api implementation: there's no guarantee that other libraries will allow integer division (for example) when it isn't required by the spec.
Going to close this for now. We can reopen if the decision to ignore the difference needs to be reconsidered.
Proposed new feature or change:

In #21626 NEP 50 is implemented as opt-in with either a global setting or a context manager.

I am working on creating an array API compatibility library (see WIP at https://github.com/data-apis/numpy-array-api-compat). Unlike `numpy.array_api`, this compatibility library is separate from NumPy, so that it can be updated independently. It also aims to be usable for downstream libraries implementing against the array API. So in particular, it is not a strict implementation like `numpy.array_api`. For instance, it will not raise errors for dtype combinations that are not required by the spec but are allowed by NumPy (see also https://numpy.org/doc/stable/reference/array_api.html).

For this library, I'd like to keep the wrapping to a minimum. In particular, I'd like to avoid creating a wrapper class for arrays like was done in `numpy.array_api`, and instead just use `np.ndarray` directly. Since we aren't going for strictness, this isn't a problem, but one issue is type promotion: the spec requires no value-based type promotion, along the lines of NEP 50.

The question then is how we can achieve this, so that users of the compatibility library will get something that looks like NumPy but follows the array API specification. Neither of the implemented solutions looks very good for this. We could set the global flag, but this could break other libraries in the same process that use NumPy outside of the array API; in general, it seems like bad practice for a library to set a global flag. The context manager doesn't seem helpful either, since most type promotion issues come from operators. If a library (like scikit-learn) has some code like `a + b`, then the only way it can make `a + b` do the right type promotion is for scikit-learn itself to add the context manager. But it's annoying for every implementing library to do this.

Ideally, we'd be able to set a `promotion_state = 'weak'` flag on arrays and scalars which gets carried along by subsequent operations. I could then wrap `asarray` and every other array API function that creates arrays in the compat library to set this flag, so that any (array API compatible) usage of those arrays automatically gets the correct type promotion behavior.

I'm not sure if this is feasible. If anyone has any other suggestions for how we could achieve this, let me know. The options I can think of are:

- Add a `promotion_state` flag to ndarray/scalars.
- Use a wrapper class for arrays, like `numpy.array_api`. I'd prefer to avoid this, but we can consider it if it's unavoidable.
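The per-array marking idea can be approximated with an ndarray subclass, since plain ndarrays cannot carry extra attributes. This is only a hedged sketch of the concept (all names are hypothetical, and this is not what NumPy ultimately implemented):

```python
import numpy as np

class _WeakArray(np.ndarray):
    """Hypothetical marker: an ndarray tagged with a promotion_state.

    The class-level attribute is inherited by views, so slices and
    reshapes of a tagged array stay tagged.
    """
    promotion_state = "weak"

def asarray(obj, /, **kwargs):
    # Compat-library style wrapper: every array it creates is tagged
    # "weak", so downstream promotion logic could consult the flag.
    return np.asarray(obj, **kwargs).view(_WeakArray)

a = asarray([1, 2, 3])
print(a.promotion_state)                                      # weak
print(getattr(np.asarray([1]), "promotion_state", "legacy"))  # legacy
```

The catch, as the issue notes, is that ufunc results do not automatically consult such a flag; making promotion actually depend on it would require NumPy's own machinery to cooperate, which is why the issue asks for the flag on `ndarray` itself.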