Tracking issue: Python API cleanup for NumPy 2.0 (NEP 52) #23999

Closed
16 of 18 tasks
rgommers opened this issue Jun 19, 2023 · 33 comments
Comments

@rgommers
Member

rgommers commented Jun 19, 2023

This tracking issue is meant to track the status and tasks of the Python API cleanup project for NumPy 2.0 (NEP 52, currently in draft status). Note: this is currently far from a complete plan, but I wanted to make a start at tracking things and making them actionable.

This issue is probably also a good place to suggest additional APIs to move/remove/deprecate, or other open issues that are related and can be tackled.

As a way of working, it'd be good to check usages of the function/object one is working on in downstream libraries (SciPy, scikit-learn, and pandas are a good start), clean those up, then remove the object in question from the API. That has at least two benefits:

  1. it keeps CI in those projects from starting to fail when the change lands in a NumPy nightly,
  2. it gives a good sense of how easy it is to replace the usage with the recommended replacement.

Cleaning up the main namespace

Cleaning up the submodule structure

Reducing the number of ways to select dtypes

Actionable:

Cleaning up the niche methods on numpy.ndarray

These are the ones listed in the NEP right now:

  • .setitem
  • .newbyteorder
  • .ptp

Doing the above ones will give a better idea about the amount of effort involved, and may help with then identifying a next set.
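For readers migrating code, here is a minimal sketch of how calls to these three methods might be rewritten. It assumes that `.setitem` above refers to the `ndarray.itemset` method, and the replacements shown are the ones I would expect to work on both NumPy 1.x and 2.x rather than anything this issue prescribes.

```python
import numpy as np

arr = np.array([[1, 5, 3], [7, 2, 9]], dtype=np.int32)

# Instead of arr.ptp(): the function form returns max - min.
value_range = np.ptp(arr)
row_ranges = np.ptp(arr, axis=1)

# Instead of arr.itemset((0, 1), 42): plain indexing assignment.
arr[0, 1] = 42

# Instead of arr.newbyteorder(): view the same bytes through a dtype
# whose byte order has been swapped.
swapped = arr.view(arr.dtype.newbyteorder())
```

Note that the last line only reinterprets the existing bytes, so the values read through `swapped` differ from those in `arr`.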

Documentation

@rgommers rgommers added this to the 2.0.0 release milestone Jun 19, 2023
@rgommers rgommers moved this to 🔖 Proposal in NumPy 2.0 Roadmap Jun 19, 2023
@rgommers rgommers moved this from 🔖 Proposal to 🏗 In progress in NumPy 2.0 Roadmap Jun 19, 2023
@ev-br
Contributor

ev-br commented Jun 21, 2023

This issue is probably also a good place to suggest additional APIs to move/remove/deprecate

In no particular order:

  • np.cast 🌋 (seberg: seems strange enough...)
  • set_numeric_ops (probably is being deprecated already?) (seberg: already gone)
  • usehugepage (make private?) 🌋 (seberg: just an oops)
  • {set,get}bufsize (make private?) 🔥
  • source 🌋
  • who 🌋
  • info
  • disp 🌋
  • safe_eval 🌋
  • base_repr, binary_repr, format_float_{positional, scientific}: maybe merge into a submodule. Or just leave them be if moving is too much.
  • require
  • lookfor 🌋
  • kernel_version (returns (5, 15); what is this number?) 🌋 (seberg: an oops together with hugepages, OK to remove even in a minor release.)
  • compare_charrays (maybe together with a bigger work on new string dtypes?) 🌋 (seberg: the feature you lose is rstrip=True, I don't see users using it and workarounds are possible if slow.)
  • np.char (np.char? says it exists for backcompat with numarray and is not recommended as of numpy 1.4?)
  • array2string, array_repr, array_str : maybe not all three are needed as public names?
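Several of the names in this list have close standard-library equivalents. As one illustration, and assuming (as I recall from the migration guide, not from this thread) that `ast.literal_eval` is the suggested stand-in for `safe_eval`, a replacement could look like this:

```python
import ast

# Parse a string containing only Python literals.
text = "[1, 2.5, (3, 4), {'a': None}]"
parsed = ast.literal_eval(text)

# Anything that is not a plain literal (here, a function call) is rejected
# instead of being executed.
rejected = False
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    rejected = True
```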

@seberg
Member

seberg commented Jun 21, 2023

I added a 🌋 where I would agree with just removing (which means PRs welcome to get it over with!).
One note: The chararray class is probably 🌋 (as is the NPY_CHAR dtype), the ufunc-like functions in the np.char namespace are not though. I will entertain removal, but think it would probably require something like numpy-financial, and I'm not sure it's worth prioritizing at this time.

@ev-br
Contributor

ev-br commented Jun 21, 2023

Do 🌋 need a deprecation period?

@seberg
Member

seberg commented Jun 21, 2023

Not in my logic there (which is not quite true because some of them have a deprecation).

Basically, I don't want to worry at all if:

  • Things are relatively niche
  • The replacement is not too complicated (removal is always a clear error, so not high impact by itself).

That applies for my volcanos, yes. (char is maybe a bit higher impact, but it's also mental load if we try to improve dtypes and was de-facto deprecated long ago. seterrobj is an example for something where I just applied the same rules, which greatly reduced the mental load fixing np.errstate.)

I have added a new <prnumber>.python_removal.rst and <prnumber>.c_api_removal.rst to the release notes for this purpose.

EDIT: That said, I don't want to overload users with many changes, so if a function is probably used a bit more and not in the way... maybe we don't need to worry about removing it quickly.

@ev-br
Contributor

ev-br commented Jun 22, 2023

A couple of other suggestions (might be less clear-cut though):

  • use positional-only arguments a bit more. My pet peeve is arange(start, /, stop=None, step=1, ...). The current arange signature (w/o "/") is not easy to represent in pure python.
  • make out= and dtype= arguments keyword-only
  • remove the out1=, out2= arguments of nout=2 ufuncs, keep only the out=2-tuple version.
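To make the first two suggestions concrete, here is a toy pure-Python sketch. `arange_like` is a hypothetical stand-in that returns a list, not NumPy's implementation. It shows what a positional-only `start` and a keyword-only `dtype` would mean for callers.

```python
def arange_like(start, /, stop=None, step=1, *, dtype=None):
    """Toy stand-in for an arange-style signature (hypothetical)."""
    if stop is None:
        # A single positional argument is treated as the stop value,
        # mirroring Python's range(stop).
        start, stop = 0, start
    values = list(range(start, stop, step))
    if dtype is not None:
        values = [dtype(v) for v in values]
    return values


def rejects(call):
    """Return True if calling `call` raises TypeError."""
    try:
        call()
    except TypeError:
        return True
    return False


# `start` cannot be passed by keyword, and `dtype` cannot be passed positionally.
keyword_start_rejected = rejects(lambda: arange_like(start=1, stop=4))
positional_dtype_rejected = rejects(lambda: arange_like(0, 3, 1, float))
```

With the `/` marker the "one argument means stop" behaviour can be written directly in the signature, which is the point about representing it in pure Python.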

@seberg
Member

seberg commented Jun 22, 2023

I am OK with making out args kwonly, don't care about doing it without a deprecation but it's probably fine (I suspect the main users doing this were historically trying to save kwarg parsing time, which is pretty meaningless now).
I am generally OK with making things kwargs only when it seems rather awkward to pass positionally anyway, I don't agree with arange since it's just the same as Python's range.
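For reference, a small sketch of the keyword form of `out=` that this exchange is about, including the 2-tuple version for a two-output ufunc. The arrays are made-up illustration data.

```python
import numpy as np

x = np.array([7, 8, 9])
y = np.array([2, 3, 4])

# Single-output ufunc: pass the destination by keyword.
total = np.empty_like(x)
np.add(x, y, out=total)

# Two-output ufunc: pass both destinations as one tuple via out=.
quotient = np.empty_like(x)
remainder = np.empty_like(x)
result = np.divmod(x, y, out=(quotient, remainder))
```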

@seberg
Member

seberg commented Jun 22, 2023

Sorry... that was making start positional only, that makes sense!

@ngoldbaum
Member

I will entertain removal, but think it would probably require something like numpy-financial, and I'm not sure it's worth prioritizing at this time.

Agreed. There either needs to be a separate package for string ufuncs that depends on a unicode library or numpy itself needs a namespace for string ufuncs and could bundle a lightweight unicode library like utf8proc. The namespace could be np.char for backward compatibility, but we could give it a new name too if we don't want the chararray baggage. I don't think string ufuncs make sense in the main numpy namespace because they only make sense to use with string arrays.

That said, it'll be a fair bit of work to make string ufuncs more of a thing and there's no reason to block cleanups of the truly unused stuff in np.char.

tylerjereddy added a commit to tylerjereddy/scipy that referenced this issue Jul 7, 2023
* replace `*sctype*` NumPy usage per
numpy/numpy#23999
and NEP52 in preparation for NumPy 2.0

* there seem to be straightforward replacements
that still pass the testsuite in all cases

* `git grep -E -i "sctype"` is clean on this branch
(only present in comments for clarity where needed)

* there may be better canonical ways to do some
of these things in the future, though considerable
confusion remains per numpy/numpy#17325

* `UMFPACK`-related changes were not tested locally
(wasn't particularly friendly for PyPI-based setup/venv)

[skip circle]
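The commit message does not spell out which replacements were used. As a hedged sketch, these are dtype-based equivalents for a few of the `*sctype*` helpers, based on my recollection of the migration guide rather than on the SciPy commit itself:

```python
import numpy as np

# Instead of np.sctype2char(np.float32): ask the dtype for its character code.
char_code = np.dtype(np.float32).char

# Instead of np.obj2sctype("float64"): ask the dtype for its scalar type.
scalar_type = np.dtype("float64").type

# Instead of np.issctype(obj): check for a NumPy scalar class.
is_scalar_type = isinstance(np.int16, type) and issubclass(np.int16, np.generic)

# Instead of np.issubsctype(a, b): use np.issubdtype.
is_floating = np.issubdtype(np.float32, np.floating)
```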
@lcrmorin

lcrmorin commented Jul 8, 2023

Regarding the API, one thing that bothers me the most is the inconsistency in dealing with pandas DataFrames. I don't have all the information to pin down the source of the problem, or whether there is any history in dealing with it, but currently applying some NumPy functions to a pandas DataFrame gives an inconsistent output format, either a NumPy array or a pandas DataFrame. It doesn't seem limited to a specific function, hence I didn't open a specific issue. Could checking all the functions and making them consistent be part of this plan?

@ngoldbaum
Member

I noticed today that passing np.character to np.dtype leads to a deprecation warning saying that np.character is deprecated. Accessing np.character should probably raise a deprecation warning too.

@seberg
Member

seberg commented Jul 8, 2023

Please, please, just propose PRs on the main branch that you want (for removals or simpler changes at least), np.seterrobj and np.geterrobj are already gone, now maybe those are not "numpy 2.0" worthy, but still.

The only issue I see with np.character is that np.dtype("c") (and related things) never gave the warning. The only actual feature I am aware of right now is that np.array("asdf", dtype="c") unpacks the "asdf", I doubt many rely on that and everyone else should just use "S1" to begin with.

That may well be the only real additional code-path, so if anyone asks to keep it, I won't care enough to push it through. But I am +1 to just remove it. It serves no purpose except to confuse users as far as I can tell, and because of that there should also be practically no users affected by just deleting it.
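For anyone who does rely on the unpacking behaviour of `dtype="c"` described above, here is a sketch of two ways to get one-byte elements with an explicit `"S1"` dtype. These are my own suggestions, not replacements named in this thread.

```python
import numpy as np

# Split the string yourself, then store each character as a 1-byte string.
chars = np.array(list("asdf"), dtype="S1")

# Or reinterpret an existing bytes object as an array of 1-byte strings
# (np.frombuffer returns a read-only view of the bytes object).
chars_from_bytes = np.frombuffer(b"asdf", dtype="S1")
```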

@rgommers
Member Author

rgommers commented Jul 9, 2023

one thing that bothers me the most is the inconsistency in dealing with pandas DataFrames

This is an issue that should be dealt with on the Pandas side. Pandas uses NumPy extensively and has a dependency on it, while NumPy knows nothing about Pandas. Hence there is nothing we can or will do about this in NumPy. The Pandas team already knows what the main pain points are and there are open issues about it on the Pandas issue tracker. So no need to discuss it more here.

alugowski pushed a commit to alugowski/scipy that referenced this issue Jul 16, 2023
@mtsokol
Member

mtsokol commented Aug 14, 2023

Hi All! As work on this is already underway, as a side-note, I wanted to share that I started my internship at Quansight Labs from the beginning of August, under the supervision of @rgommers and @ngoldbaum, and NEP 52 will be my main focus for the next 3 months!

@seberg
Member

seberg commented Aug 17, 2023

@stefanv, @melissawm, @charris and also @rkern just pinging a few to get an opinion on whether we should be more careful. I like removing some things, but am also happy to undo some of the above.

In general I am mostly worried about the sum of changes. I am still in favor of most individual changes when the API in question is not just niche/weird/broken but also in the way of other cleanups (the errstate related removals are a clear example: they unblocked fixing errstate).

Things like np.cast[type]() or np.ComplexWarning are not used much, but they are probably used occasionally and removing them may mean smaller libs don't transition smoothly to 2.0 when we have to pay very little to do so.
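To give a sense of how small the transition is for these two, here is a hedged sketch. The `np.asarray` form is a straightforward substitute for `np.cast[type](x)`, and the import fallback assumes `ComplexWarning` lives in `numpy.exceptions` on recent versions and in the main namespace on older ones.

```python
import warnings

import numpy as np

try:
    from numpy.exceptions import ComplexWarning  # newer NumPy versions
except ImportError:
    from numpy import ComplexWarning  # older NumPy versions

# Instead of np.cast[np.float32](values):
values = [1, 2, 3]
as_float32 = np.asarray(values, dtype=np.float32)

# ComplexWarning is raised when a cast discards the imaginary part.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    real_only = np.asarray([1 + 2j]).astype(np.float64)
```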

But for those that are just weird/niche/broken, I would honestly prefer if we keep it in most cases, but:

  • Remove all docs.
  • hide it from __dir__ and maybe __all__
  • Add a deprecation warning now at least if we think it really shouldn't be used: Also good because it gives a good way to inform user about the fix!
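The "keep it, hide it, warn" option in the list above can be sketched with module-level `__getattr__` and `__dir__` hooks. Everything here is hypothetical: `toy` is a stand-in namespace built with `types.ModuleType`, and `old_helper`/`new_helper` are made-up names, so this is an illustration of the pattern rather than NumPy's implementation.

```python
import types
import warnings

toy = types.ModuleType("toy_namespace")  # hypothetical stand-in for a package


def new_helper(x):
    return x * 2


def _old_helper(x):
    return new_helper(x)


toy.new_helper = new_helper
toy.__all__ = ["new_helper"]

# Deprecated names are kept out of the module dict so that only the
# fallback __getattr__ below can resolve them.
_DEPRECATED = {"old_helper": (_old_helper, "use `new_helper` instead")}


def _module_getattr(name):
    if name in _DEPRECATED:
        obj, hint = _DEPRECATED[name]
        warnings.warn(
            f"`{name}` is deprecated; {hint}.", DeprecationWarning, stacklevel=2
        )
        return obj
    raise AttributeError(f"module 'toy_namespace' has no attribute {name!r}")


toy.__getattr__ = _module_getattr
toy.__dir__ = lambda: list(toy.__all__)  # hide everything not in __all__

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    helper = toy.old_helper  # still works, but warns with a fix
```

The old name keeps working and stays out of `dir()`, while the warning text carries the migration hint.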

TBH, thinking back I thought this was the intention of NEP 52: clean up the namespace, but not necessarily by removing/moving functions immediately?

@stefanv
Contributor

stefanv commented Aug 17, 2023

Since we almost never get a chance to tidy up the API, I'd like to see us use this opportunity to do so. That said, for users the pain of API transitions is real, as we all learned with Python 3, so whatever we can do to minimize it we should. Keeping hidden API around for a while is okay, I think, as long as it raises deprecation messages that guide the user in how to transition their code, and make it clear when the function will be removed. We don't want to carry those functions around forever, and the migration guide should make it clear that they are no longer officially supported.

@stefanv
Contributor

stefanv commented Aug 19, 2023

To clarify my thinking: it's that 2.0 is an opportunity for developers to rid themselves of support burden. We do so much backward compatibility work, at significant cost—and it also prevents us from refactoring APIs as necessary. Here, we can afford to say "we're no longer going to be supporting this function, and we're also not going to go through the standard deprecation pathway". Hiding the function and displaying a message to the user is a concession, and one that doesn't hurt us very much. The gist of it is saying: "you've been warned that this is no longer part of 2.0, so don't be surprised when it disappears". I can see the argument for having it simply be removed (it's even easier on the developers, and the user won't accidentally keep using it); but in that case you want to make it abundantly clear in the migration guide what the user is expected to do. The advantage (for the user) of having it in code is that they can run their code, and receive line-by-line guidance on what to change—which feels like an easier process than working through a manual. So, I suppose instead of a warning, an error may be a better form of guidance?

Given the above, I'd say I'm fine with (1) or (2) with errors. I don't like any option without a fixed end point.

@rgommers
Member Author

Thanks @stefanv. "(2) with errors" seems like a user-friendlier version of (1), so that seems nice. And overlaps with #24306 (comment). Using for example __getattr__ to give the best possible errors would be a good outcome of this discussion I think.
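A minimal sketch of what "`__getattr__` with good errors" could look like. The module `toy`, the `_EXPIRED` table, and the message wording are all hypothetical and only illustrate the idea of failing with a migration hint instead of a bare `AttributeError`.

```python
import types

toy = types.ModuleType("toy_numpy")  # hypothetical stand-in namespace
toy.float64 = float

# Removed names mapped to the hint shown to the user (made-up table).
_EXPIRED = {"float_": "Use `toy_numpy.float64` instead."}


def _module_getattr(name):
    if name in _EXPIRED:
        raise AttributeError(
            f"`toy_numpy.{name}` was removed in the 2.0 release. {_EXPIRED[name]}"
        )
    raise AttributeError(f"module 'toy_numpy' has no attribute {name!r}")


toy.__getattr__ = _module_getattr

try:
    toy.float_
except AttributeError as exc:
    message = str(exc)
```

Because the error is raised at the exact line that uses the removed name, users get the line-by-line guidance discussed above without the name remaining usable.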

@mtsokol
Member

mtsokol commented Sep 20, 2023

A small update: NumPy 2.0 Migration Guide for Python API is now live on main branch:
https://output.circle-artifacts.com/output/job/955013bd-6916-4383-99c2-68631ee066ec/artifacts/0/doc/build/html/numpy_2_0_migration_guide.html

@mattip
Member

mattip commented Sep 20, 2023

It is also available in the devdocs as https://numpy.org/devdocs/numpy_2_0_migration_guide.html

@mattip
Member

mattip commented Sep 21, 2023

Is there any desire to move array.testing.assert_array_equal to the not-recommended np.testing.assert* functions? See also the short exchange on PR #24667 about what to do with the not-recommended functions and how hard it would be to deprecate them.

@ngoldbaum
Member

I think I agree with Ralf that these functions are in way too many places for us to reasonably deprecate or remove them right now. That said, I am all for making the caveats clearer in the documentation and pointing people to functions that have better defaults.

@ngoldbaum
Member

I checked off a few things that I think are done. The one big thing that isn't done yet is np.isdtype, I think.
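For context, here is a hedged sketch of how `np.isdtype` might be used once available. The helper name `is_real_floating` is hypothetical, and the `np.issubdtype` fallback is my own assumption for NumPy versions that predate `np.isdtype`.

```python
import numpy as np


def is_real_floating(dtype):
    """Return True if `dtype` is a real floating-point dtype."""
    dtype = np.dtype(dtype)
    if hasattr(np, "isdtype"):
        # Kind names such as "real floating" follow the array API standard.
        return bool(np.isdtype(dtype, "real floating"))
    # Fallback for NumPy versions without np.isdtype.
    return bool(np.issubdtype(dtype, np.floating))
```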

@melissawm
Member

Hi folks - here's the basic implementation of the .. legacy:: directive: #24939

It's pretty much the same as we have at SciPy. Hope it's useful!

charliermarsh pushed a commit to astral-sh/ruff that referenced this issue Nov 3, 2023
## Summary

Hi! Currently NumPy Python API is undergoing a cleanup process that will
be delivered in NumPy 2.0 (release is planned for the end of the year).
Most changes are rather simple (renaming, removing or moving a member of
the main namespace to a new place), and they could be flagged/fixed by
an additional ruff rule for numpy (e.g. changing occurrences of
`np.float_` to `np.float64`).

Would you accept such rule?  

I named it `NPY201` in the existing group, so people will receive a
heads-up for changes arriving in 2.0 before actually migrating to it.

~~This is still a draft PR.~~ I'm not an expert in rust so if any part
of code can be done better please share!

NumPy 2.0 migration guide:
https://numpy.org/devdocs/numpy_2_0_migration_guide.html
NEP 52: https://numpy.org/neps/nep-0052-python-api-cleanup.html
NumPy cleanup tracking issue:
numpy/numpy#23999


## Test Plan

A unit test is provided that checks all rule's fix cases.
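To illustrate the kind of check such a lint rule performs, here is a toy checker built on Python's `ast` module. It is not ruff's implementation. The `RENAMES` table is a small assumed subset: only `np.float_` to `np.float64` comes from the text above, and the other entries are from my recollection of the migration guide.

```python
import ast

# Removed alias -> suggested replacement (assumed subset, see lead-in).
RENAMES = {
    "float_": "float64",
    "complex_": "complex128",
    "Inf": "inf",
    "NaN": "nan",
}


def find_removed_aliases(source, module_alias="np"):
    """Return (line, old_name, new_name) for each removed alias in `source`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == module_alias
            and node.attr in RENAMES
        ):
            old = f"{module_alias}.{node.attr}"
            new = f"{module_alias}.{RENAMES[node.attr]}"
            findings.append((node.lineno, old, new))
    return sorted(findings)


sample = (
    "import numpy as np\n"
    "x = np.float_(1.0)\n"
    "y = np.array([np.NaN])\n"
    "z = np.float64(2)\n"
)
report = find_removed_aliases(sample)
```

This toy only catches direct `np.<name>` attribute access and would miss `from numpy import float_` or a different import alias.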
@mattip
Member

mattip commented Jan 16, 2024

There are still many issues with the NumPy 2.0 API Changes label. Are they part of this issue or is there another place to track them?

@rgommers
Member Author

Those are for all 2.0 API changes (e.g., NEP 56), not just this issue. This one is done after the documentation update is written (which I plan to do this week).

@charris
Member

charris commented Jan 16, 2024

This one is done after the documentation update is written

I suspect I'll need help with the 2.0.0 release note. If there are things -- links, etc. -- that you think should be in there, it would be helpful if you could add them.

@rgommers
Member Author

The large reference guide update was done in gh-25650 and gh-25674, this is now in much better shape. A few .. legacy:: directives can still be sprinkled around, but that is non-critical. Everything here that was needed for 2.0 is done, so I will close this tracking issue.

I suspect I'll need help with the 2.0.0 release note. If there are things -- links, etc. -- that you think should be in there, it would be helpful if you could add them.

@charris it's quite hard to see where the notes are at right now; a large majority of release note updates are already in I'm sure. How about we run towncrier to empty out the backlog of loose snippets, and after that I'll open a PR to update the notes for anything important due to NEP 52 (and NEP 56)?

If that sounds good, I'm happy to open the first PR for towncrier too.

@rgommers rgommers moved this from 🏗 In progress to ✅ Done in NumPy 2.0 Roadmap Jan 31, 2024
@rootsmusic

Will Python 3.13 be supported?

@rgommers
Member Author

Will Python 3.13 be supported?

It may work, but in principle no. There cannot be wheels until 3.13rc1, and you're better off building from main until that happens.

@andyfaff
Member

There are already 3.13 wheels on nightly.
