Tracking issue: Python API cleanup for NumPy 2.0 (NEP 52) #23999
In no particular order:
|
I added a 🌋 where I would agree with just removing (which means PRs welcome to get it over with!). |
Do 🌋 need a deprecation period? |
Not in my logic there (which is not quite true because some of them have a deprecation). Basically, I don't want to worry at all if:
That applies for my volcanos, yes. (char is maybe a bit higher impact, but it's also mental load if we try to improve dtypes and was de-facto deprecated long ago. I have added a new EDIT: That said, I don't want to overload users with many changes, so if a function is probably used a bit more and not in the way... maybe we don't need to worry about removing it quickly. |
A couple of other suggestions (might be less clear-cut though):
|
I am OK with making out args kwonly; I don't care about doing it without a deprecation, but it's probably fine (I suspect the main users doing this were trying to save |
Sorry... that was making start positional only, that makes sense! |
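For readers unfamiliar with the syntax being discussed, here is a minimal sketch of a positional-only `start` and a keyword-only `out`. The function `fill_range` is hypothetical and is not a NumPy API; it only illustrates the `/` and `*` markers in a Python signature.

```python
def fill_range(start, /, stop, *, out=None):
    """Hypothetical helper: `start` is positional-only, `out` is keyword-only."""
    values = list(range(start, stop))
    if out is not None:
        # Write into the caller-provided buffer, mirroring the usual `out=` idiom.
        out[:] = values
        return out
    return values


print(fill_range(0, 3))

buf = [None] * 3
fill_range(0, 3, out=buf)   # `out` must be passed by keyword
print(buf)

try:
    fill_range(start=0, stop=3)   # `start` cannot be passed by keyword
except TypeError as exc:
    print("TypeError:", exc)
```

Passing `out` positionally (`fill_range(0, 3, buf)`) raises a `TypeError` in the same way, which is the behaviour change such a signature cleanup would introduce.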
Agreed. There either needs to be a separate package for string ufuncs that depends on a unicode library or numpy itself needs a namespace for string ufuncs and could bundle a lightweight unicode library like utf8proc. The namespace could be That said, it'll be a fair bit of work to make string ufuncs more of a thing and there's no reason to block cleanups of the truly unused stuff in |
* replace `*sctype*` NumPy usage per numpy/numpy#23999 and NEP52 in preparation for NumPy 2.0
* there seem to be straightforward replacements that still pass the testsuite in all cases
* `git grep -E -i "sctype"` is clean on this branch (only present in comments for clarity where needed)
* there may be better canonical ways to do some of these things in the future, though considerable confusion remains per numpy/numpy#17325
* `UMFPACK`-related changes were not tested locally (wasn't particularly friendly for PyPI-based setup/venv)

[skip circle]
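As an illustration of the kind of `*sctype*` replacement mentioned in that commit message, the sketch below shows dtype-based equivalents. The helper names (`is_floating`, `type_char`, `scalar_type`) are made up for this example and are not taken from the referenced commit; the sketch assumes the inputs are dtype-like objects rather than arrays.

```python
import numpy as np


def is_floating(dtype_like):
    """Stand-in for np.issubsctype(dtype_like, np.floating)."""
    return np.issubdtype(np.dtype(dtype_like), np.floating)


def type_char(dtype_like):
    """Stand-in for np.sctype2char(dtype_like)."""
    return np.dtype(dtype_like).char


def scalar_type(dtype_like):
    """Stand-in for np.obj2sctype(dtype_like)."""
    return np.dtype(dtype_like).type


print(is_floating(np.float32))
print(is_floating(np.int32))
print(type_char(np.float64))
print(scalar_type("float64") is np.float64)

# np.isdtype only exists in NumPy 2.0 and later, so guard the call.
if hasattr(np, "isdtype"):
    print(np.isdtype(np.dtype(np.float32), "real floating"))
```

These helpers run on both NumPy 1.x and 2.x, which may be convenient for downstream libraries that need to support both during the transition.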
Regarding the API, the thing that bothers me most is the inconsistency in dealing with pandas DataFrames. I don't have all the information needed to pinpoint the source of the problem, or to know whether there is any history of dealing with it, but applying some NumPy functions to a pandas DataFrame currently gives an inconsistent output format: either a NumPy array or a pandas DataFrame. It doesn't seem to be limited to a specific function, hence I didn't open a specific issue. Could checking all the functions and making them consistent be part of this plan? |
I noticed today that passing |
Please, please, just propose PRs on the main branch that you want (for removals or simpler changes at least), The only issue I see with That may well be the only real additional code-path, so if anyone asks to keep it, I won't care enough to push it through. But, I am +1 to just remove it. It serves no purpose except to confuse users as far as I can tell; and because of that, there should also be practically no users affected by just deleting it.) |
This is an issue that should be dealt with on the Pandas side. Pandas uses NumPy extensively and has a dependency on it, while NumPy knows nothing about Pandas. Hence there is nothing we can or will do about this in NumPy. The Pandas team already knows what the main pain points are and there are open issues about it on the Pandas issue tracker. So no need to discuss it more here. |
Hi All! As work on this is already underway, as a side-note, I wanted to share that I started my internship at Quansight Labs from the beginning of August, under the supervision of @rgommers and @ngoldbaum, and NEP 52 will be my main focus for the next 3 months! |
@stefanv, @melissawm, @charris and also @rkern just pinging a few to get an opinion on whether we should be more careful. I like removing some things, but am happy to undo also some of the above. In general I am mostly worried about the sum of changes, I am still in favor of most individual changes if it only smells like it is not just niche/weird/broken but also in the way of other cleanups (the errstate related removals are a clear example: they unblocked fixing errstate). Things like But for those that are just weird/niche/broken, I would honestly prefer if we keep it in most cases, but:
TBH, thinking back I thought this was the intention of NEP 52: clean up the namespace, but not necessarily by removing/moving functions immediately? |
Since we almost never get a chance to tidy up the API, I'd like to see us use this opportunity to do so. That said, for users the pain of API transitions is real, as we all learned with Python 3, so whatever we can do to minimize it we should. Keeping hidden API around for a while is okay, I think, as long as it raises deprecation messages that guide the user in how to transition their code, and make it clear when the function will be removed. We don't want to carry those functions around forever, and the migration guide should make it clear that they are no longer officially supported. |
To clarify my thinking: it's that 2.0 is an opportunity for developers to rid themselves of support burden. We do so much backward compatibility work, at significant cost—and it also prevents us from refactoring APIs as necessary. Here, we can afford to say "we're no longer going to be supporting this function, and we're also not going to go through the standard deprecation pathway". Hiding the function and displaying a message to the user is a concession, and one that doesn't hurt us very much. The gist of it is saying: "you've been warned that this is no longer part of 2.0, so don't be surprised when it disappears". I can see the argument for having it simply be removed (it's even easier on the developers, and the user won't accidentally keep using it); but in that case you want to make it abundantly clear in the migration guide what the user is expected to do. The advantage (for the user) of having it in code is that they can run their code, and receive line-by-line guidance on what to change—which feels like an easier process than working through a manual. So, I suppose instead of a warning, an error may be a better form of guidance? Given the above, I'd say I'm fine with (1) or (2) with errors. I don't like any option without a fixed end point. |
Thanks @stefanv. "(2) with errors" seems like a user-friendlier version of (1), so that seems nice. And overlaps with #24306 (comment). Using for example |
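The "(2) with errors" idea, where a removed name raises an error that tells the user what to use instead, can be sketched with a module-level `__getattr__` (PEP 562). Everything here is hypothetical: `mylib`, the `_EXPIRED` mapping, and the message wording are placeholders for illustration and are not NumPy's actual implementation.

```python
import types

# Hypothetical table of removed names and the guidance shown to users.
_EXPIRED = {
    "float_": "Use `float64` instead.",
    "old_helper": "It has no replacement; see the migration guide.",
}


def _make_getattr(module_name, expired):
    """Build a module-level __getattr__ that explains removed attributes."""
    def __getattr__(name):
        if name in expired:
            raise AttributeError(
                f"`{module_name}.{name}` was removed in the 2.0 release. "
                f"{expired[name]}"
            )
        raise AttributeError(
            f"module {module_name!r} has no attribute {name!r}"
        )
    return __getattr__


# Build a throwaway module so the sketch is self-contained.
mylib = types.ModuleType("mylib")
mylib.float64 = float
mylib.__getattr__ = _make_getattr("mylib", _EXPIRED)

try:
    mylib.float_
except AttributeError as exc:
    message = str(exc)
    print(message)
```

Because the error is raised at the exact line that uses the removed name, users get the line-by-line guidance described above without the function itself having to remain in the code base.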
A small update: NumPy 2.0 Migration Guide for Python API is now live on main branch: |
It is also available in the devdocs as https://numpy.org/devdocs/numpy_2_0_migration_guide.html |
Is there any desire to move |
I think I agree with Ralf that these functions are in way too many places for us to reasonably deprecate or remove them right now. That said, I am all for making it clearer in the documentation what the caveats are and to point people to functions that have better defaults. |
I checked off a few things that I think are done. The one big thing that isn't done yet is |
Hi folks - here's the basic implementation of the It's pretty much the same as we have at SciPy. Hope it's useful! |
## Summary

Hi! The NumPy Python API is currently undergoing a cleanup that will be delivered in NumPy 2.0 (release is planned for the end of the year). Most changes are rather simple (renaming, removing, or moving a member of the main namespace to a new place), and they could be flagged/fixed by an additional ruff rule for NumPy (e.g. changing occurrences of `np.float_` to `np.float64`).

Would you accept such a rule? I named it `NPY201` in the existing group, so people will receive a heads-up for changes arriving in 2.0 before actually migrating to it.

~~This is still a draft PR.~~ I'm not an expert in Rust, so if any part of the code can be done better, please share!

NumPy 2.0 migration guide: https://numpy.org/devdocs/numpy_2_0_migration_guide.html
NEP 52: https://numpy.org/neps/nep-0052-python-api-cleanup.html
NumPy cleanup tracking issue: numpy/numpy#23999

## Test Plan

A unit test is provided that checks all of the rule's fix cases.
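To give a rough idea of what such a lint rule looks for, here is a small Python sketch that scans source text for a few removed aliases. It is not ruff's implementation (ruff is written in Rust); the `RENAMES` table and the `find_removed_aliases` helper are illustrative only and cover just a handful of the renames listed in the migration guide.

```python
import ast

# A few removed aliases and their replacements (illustrative subset).
RENAMES = {
    "float_": "float64",
    "complex_": "complex128",
    "Inf": "inf",
    "NaN": "nan",
}


def find_removed_aliases(source, module_alias="np"):
    """Return sorted (line, old_name, new_name) tuples for `np.<old_name>` uses."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == module_alias
            and node.attr in RENAMES
        ):
            findings.append((node.lineno, node.attr, RENAMES[node.attr]))
    return sorted(findings)


sample = "import numpy as np\nx = np.float_(1.0)\ny = np.Inf\n"
for lineno, old, new in find_removed_aliases(sample):
    print(f"line {lineno}: np.{old} -> np.{new}")
```

The sketch only matches the literal alias `np`, so code that imports NumPy under another name would need a different `module_alias`; a real linter resolves imports instead.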
There are still many issues with the NumPy 2.0 API Changes label. Are they part of this issue or is there another place to track them? |
Those are for all 2.0 API changes (e.g., NEP 56), not just this issue. This one is done after the documentation update is written (which I plan to do this week). |
I suspect I'll need help with the 2.0.0 release note. If there are things -- links, etc. -- that you think should be in there, it would be helpful if you could add them. |
The large reference guide update was done in gh-25650 and gh-25674, this is now in much better shape. A few
@charris it's quite hard to see where the notes are at right now; a large majority of release note updates are already in I'm sure. How about we run If that sounds good, I'm happy to open the first PR for towncrier too. |
Will Python 3.13 be supported? |
It may work, but in principle no. There cannot be wheels until 3.13rc1, and you're better off building from |
There are already 3.13 wheels on nightly. |
This tracking issue is meant to track the status and tasks of the Python API cleanup project for NumPy 2.0 (NEP 52, currently in draft status). Note: this is currently far from a complete plan, I wanted to make a start though at tracking things and making them actionable.
This issue is probably also a good place to suggest additional APIs to move/remove/deprecate, or other open issues that are related and can be tackled.
As a way of working, it'd be good to check usages of the function/object one is working on in downstream libraries (SciPy, scikit-learn, pandas is a good start), clean those up, then remove the object in question from the API. That has at least two benefits:
Cleaning up the main namespace
* `np.inf` and `np.nan` aliases
* `np.compat`: API: deprecate compat and selected lib utils #23830

Cleaning up the submodule structure
* `numpy.lib`: ENH: Overhaul of NumPy `lib` namespace [NEP 52] #24507

Reducing the number of ways to select dtypes
Actionable:
* `np.isdtype`
* `sctypeDict` generate a deprecation warning. We cannot completely remove it because jax does this. See DEP: Deprecate registering dtype names with np.sctypeDict? #24699.
* `sctype` related things (see ENH: add a canonical way to determine if dtype is integer, floating point or complex #17325 (comment))
* `issubsctype`, `sctypeDict` & co from SciPy, pandas (both have a few occurrences) and perhaps a few other large downstream libraries

Cleaning up the niche methods on `numpy.ndarray`
These are the ones listed in the NEP right now:
* `.setitem`
* `.newbyteorder`
* `.ptp`
Doing the above ones will give a better idea about the amount of effort involved, and may help with then identifying a next set.
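For two of the methods listed above, downstream code can switch to spellings that already work today. The following is a minimal sketch, assuming a small integer array; it shows `np.ptp` in place of `arr.ptp()` and a dtype-level byte-order swap in place of `arr.newbyteorder()`.

```python
import numpy as np

arr = np.array([3, 1, 7], dtype=np.int32)

# Instead of arr.ptp(): use the function form.
spread = np.ptp(arr)
print(spread)

# Instead of arr.newbyteorder(): view the same bytes with a
# byte-swapped dtype (dtype.newbyteorder is a dtype method).
swapped_view = arr.view(arr.dtype.newbyteorder())
print(swapped_view.dtype.byteorder)

# Swapping the underlying bytes as well recovers the original values.
print(np.array_equal(swapped_view.byteswap(), arr))
```

Both replacements are available in NumPy 1.x as well, so the change can be made before upgrading.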
Documentation
* `ruff` by @mtsokol
* `.. legacy::` Sphinx directive - in progress: DOC: Add legacy directive to mark outdated objects #24939
* `.. legacy::` directive