API: Cleaning numpy/__init__.py and main namespace - Part 1 [NEP 52] #24316

Merged
rgommers merged 4 commits into numpy:main from the overhaul-of-main-namespace branch
Aug 7, 2023

Conversation

mtsokol
Member

@mtsokol mtsokol commented Aug 2, 2023

Relevant issues #24306 #23999

Hi @rgommers @seberg @ngoldbaum,

Here I share a draft PR connected to issue #24306. It mostly covers restructuring the numpy/__init__.py file.

In a nutshell:

  • Every item of the main namespace is imported explicitly from a predefined API list (the list is still being discussed in the related issue) rather than implicitly with a * import.
  • The file _main_namespace_definition.py is meant to be a contract for the main namespace and is used for defining the __dir__ and __all__ attributes of NumPy's top namespace. It is therefore versioned, and the namespace can't be altered without modifying that file.
  • Removing from .core import * and from .lib import * uncovered some cyclic dependencies in the codebase (for now I explicitly import the names that are used internally as np.* and cause a cycle), but ideally there should be none of them.
  • I refactored some parts of the __init__.py file that I thought were obsolete.
  • I think it's easier to review the modified __init__.py as a continuous file, rather than as a diff.
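
To make the contract idea concrete, here is a minimal sketch of a namespace whose __all__ and __dir__ both come from one explicit list. The names MAIN_NAMESPACE, build_toy_namespace and the toy functions are hypothetical and are not taken from the PR's _main_namespace_definition.py; the sketch only assumes the standard PEP 562 behaviour of a module-level __dir__.

```python
import types

# Hypothetical contract: the only names the toy package promises to expose.
MAIN_NAMESPACE = ["add", "subtract"]

# Stand-in for the package's implementation modules (illustrative only).
_IMPLEMENTATION = """
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

def _internal_helper():
    return "not part of the contract"
"""


def build_toy_namespace(public_names):
    """Create a module whose __all__ and __dir__ follow an explicit list."""
    mod = types.ModuleType("toy_pkg")
    exec(_IMPLEMENTATION, mod.__dict__)
    # Fail early if the contract lists a name that was never defined.
    missing = [n for n in public_names if n not in mod.__dict__]
    if missing:
        raise ImportError(f"contract lists undefined names: {missing}")
    mod.__all__ = list(public_names)
    # PEP 562: a module-level __dir__ controls what dir(module) reports.
    mod.__dir__ = lambda: list(public_names)
    return mod


toy_pkg = build_toy_namespace(MAIN_NAMESPACE)
```

With this layout, changing what dir(toy_pkg) or a * import exposes means editing the one list, which is the property the contract file is meant to provide. Private helpers stay reachable by attribute access but are not advertised.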

Please share your feedback!

@mtsokol mtsokol marked this pull request as draft August 2, 2023 15:46
@rgommers rgommers added the 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. label Aug 2, 2023
@rgommers
Member

rgommers commented Aug 2, 2023

Thanks @mtsokol, this is a useful thing to tackle now. I'm thinking we may want to identify the parts that can be merged straight away and do this in a few different PRs; the _main_namespace_definition.py seems like it'll stay WIP for a while, while some other parts are quite straightforward. I'm thinking the first PR would contain the obvious cleanups (I can add review comments on which ones are mergeable now), and a second one would only do the import * removals. WDYT?

@mtsokol
Member Author

mtsokol commented Aug 2, 2023

@rgommers, works for me! Please comment on these items - I will work on it tomorrow. Then I guess the first PR will be about "cleanup __init__.py", by removing outdated items and of course import *. Then a separate PR will introduce this main namespace contract.

@mtsokol mtsokol force-pushed the overhaul-of-main-namespace branch from b970255 to 1e1b754 Compare August 3, 2023 09:30
Member

@rgommers rgommers left a comment

@mtsokol I added the comments regarding what can be merged in a first PR. That should make the diff here a lot smaller.

Also note that if you push updates to this PR, it's probably preferable to add [skip ci] in the commit message - no need for a full battery of CI here yet.

@mtsokol
Member Author

mtsokol commented Aug 3, 2023

> @mtsokol I added the comments regarding what can be merged in a first PR. That should make the diff here a lot smaller.
>
> Also note that if you push updates to this PR, it's probably preferable to add [skip ci] in the commit message - no need for a full battery of CI here yet.

@rgommers it works for me! Then this PR will be for the first batch of changes (general cleaning of numpy/__init__.py and removing NumPy's warnings and exceptions from the main namespace).

The second, a separate PR, will cover solving cyclic dependencies and getting rid of from ... import *, then the third one will introduce a separate file/contract for explicit definition of the main namespace (defining globals(), __all__ and __dir__ this way).

I'm running CI here because I prepared the first batch of changes (also, I'm working on reflecting them in other libraries).

@mtsokol mtsokol force-pushed the overhaul-of-main-namespace branch from bab34f5 to c55bee6 Compare August 3, 2023 12:08
@mtsokol mtsokol marked this pull request as ready for review August 3, 2023 12:10
@mtsokol
Member Author

mtsokol commented Aug 3, 2023

@rgommers It looks like there's still a RankWarning class that needs to be removed from the top-level __init__.pyi. It originally comes from numpy/polynomial/polyutils.py; in my opinion it's domain-specific to polynomials, so it doesn't need to be moved to numpy.exceptions. WDYT?

@rgommers
Member

rgommers commented Aug 3, 2023

> @rgommers It looks like there's still a RankWarning class that needs to be removed from the top-level __init__.pyi. It originally comes from numpy/polynomial/polyutils.py; in my opinion it's domain-specific to polynomials, so it doesn't need to be moved to numpy.exceptions. WDYT?

That's not completely clear cut (right now it goes with np.polyfit), so I'd not touch it here and add it to your "tentative" list.

@mtsokol mtsokol force-pushed the overhaul-of-main-namespace branch from 53b8a39 to 775b5dd Compare August 3, 2023 14:18
@mtsokol mtsokol changed the title from "[WIP] API: Overhaul of NumPy main namespace [NEP 52]" to "API: Cleaning numpy/__init__.py and main namespace - Part 1 [NEP 52]" Aug 3, 2023
@mtsokol
Member Author

mtsokol commented Aug 3, 2023

I think it's ready for a review: In files where an exception/warning was used only once I used np.exceptions.<>. In cases where it was used multiple times I added an explicit import.

Member

@rgommers rgommers left a comment

This looks great, thanks Mateusz! And thank you for the due diligence and fixing things in Matplotlib, Pandas, SciPy, scikit-learn and JAX.

The list of differences between the np.__dir__() output on this PR vs. 1.25.0 is:

{'ERR_CALL',
 'ERR_DEFAULT',
 'ERR_IGNORE',
 'ERR_LOG',
 'ERR_PRINT',
 'ERR_RAISE',
 'ERR_WARN',
 'SHIFT_DIVIDEBYZERO',
 'SHIFT_INVALID',
 'SHIFT_OVERFLOW',
 'SHIFT_UNDERFLOW',
 '__deprecated_attrs__',
 '__expired_functions__',
 '_builtins',
 '_financial_names',
 '_using_numpy2_behavior',
 'cast',
 'compat',
 'fastCopyAndTranspose',
 'geterrobj',
 'kernel_version',
 'lookfor',
 'numarray',
 'oldnumeric',
 'set_numeric_ops',
 'seterrobj',
 'source'}

All those things have indeed been removed, so this looks good.
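
A comparison like the one above can be produced by saving the dir() listing of the package under each version and taking a set difference. The sketch below uses hypothetical, shortened listings, and namespace_diff is an illustrative helper rather than anything from the PR.

```python
def namespace_diff(old_names, new_names):
    """Return the names removed from and added to a namespace listing."""
    old, new = set(old_names), set(new_names)
    return {"removed": sorted(old - new), "added": sorted(new - old)}


# Hypothetical, shortened listings standing in for dir(numpy) captured
# under two different installations.
old_listing = ["array", "cast", "lookfor", "source", "zeros"]
new_listing = ["array", "zeros"]

report = namespace_diff(old_listing, new_listing)
```

Reporting added names as well as removed ones makes it easy to confirm that a cleanup PR only takes things away.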

There's nothing in here that should be controversial, so let's get it in to keep the ball rolling.

# but do not use them, we define them here for backward compatibility.
oldnumeric = 'removed'
numarray = 'removed'

def __getattr__(attr):
Member

For a next PR: copying the pattern from scipy/__init__.py for __getattr__ to import all submodules in a lazy way rather than only numpy.testing would be useful.

Member Author

Sure! And as we discussed, this will help fixing cyclic dependencies.
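
For readers unfamiliar with the lazy-import idea mentioned in this thread, here is a minimal sketch of a module-level __getattr__ (PEP 562) that imports submodules on first access. It is not the code from scipy/__init__.py; make_lazy_namespace and the mapping of attribute names to standard-library modules are hypothetical stand-ins for a package's own submodules.

```python
import importlib
import types


def make_lazy_namespace(name, submodules):
    """Build a toy module that imports the listed submodules on first access.

    `submodules` maps an attribute name to the dotted path to import.
    """
    mod = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails.
        if attr in submodules:
            loaded = importlib.import_module(submodules[attr])
            setattr(mod, attr, loaded)  # cache so later lookups skip this hook
            return loaded
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    mod.__getattr__ = __getattr__
    mod.__dir__ = lambda: sorted(submodules)
    return mod


# Standard-library modules stand in for a package's own submodules.
toy = make_lazy_namespace("toy", {"decoder": "json.decoder", "tools": "itertools"})
```

Because nothing is imported until an attribute is requested, the top-level module no longer needs every submodule to be importable at package import time, which is one way import cycles can be loosened.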

@rgommers rgommers merged commit c8e2343 into numpy:main Aug 7, 2023
@rgommers rgommers added this to the 2.0.0 release milestone Aug 7, 2023
@mtsokol mtsokol deleted the overhaul-of-main-namespace branch August 7, 2023 20:20
@seberg
Member

seberg commented Aug 8, 2023

Just a note, the effective change here was that previously ComplexWarning and some other errors were available as np.ComplexWarning but hidden because we wanted to move to np.exceptions.ComplexWarning. This finalizes the move without a deprecation.
I can see that mostly being relevant for larger libraries who we can expect to deal with it, but if anyone thinks that a wider range of users have such code, I would be fine with keeping the "hidden" status also for a bit longer.

@mtsokol
Member Author

mtsokol commented Aug 8, 2023

> Just a note, the effective change here was that previously ComplexWarning and some other errors were available as np.ComplexWarning but hidden because we wanted to move to np.exceptions.ComplexWarning. This finalizes the move without a deprecation. I can see that mostly being relevant for larger libraries who we can expect to deal with it, but if anyone thinks that a wider range of users have such code, I would be fine with keeping the "hidden" status also for a bit longer.

I can add a custom message about these warnings/exceptions when accessing them from the main namespace (the same way __expired_functions__ worked in __init__.py).
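
A sketch of what such a message hook might look like, assuming a module-level __getattr__ that consults a table of moved names. The table _MOVED, the helper add_removal_hints and the toy module are hypothetical; this is not the implementation of __expired_functions__.

```python
import types

# Hypothetical mapping from a removed top-level name to its new location.
_MOVED = {"ComplexWarning": "toy.exceptions.ComplexWarning"}


def add_removal_hints(mod, moved):
    """Attach a module-level __getattr__ that explains where names went."""

    def __getattr__(attr):
        if attr in moved:
            raise AttributeError(
                f"`{mod.__name__}.{attr}` was removed; use `{moved[attr]}` instead."
            )
        raise AttributeError(f"module {mod.__name__!r} has no attribute {attr!r}")

    mod.__getattr__ = __getattr__
    return mod


toy = add_removal_hints(types.ModuleType("toy"), _MOVED)

try:
    toy.ComplexWarning
except AttributeError as exc:
    hint = str(exc)  # the text a user would see in the traceback
```

Each moved name costs one table entry, so the size of the table grows with the number of removals it documents.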

@rgommers
Member

rgommers commented Aug 8, 2023

> I can add a custom message about these warnings/exceptions when accessing them from the main namespace (the same way __expired_functions__ worked in __init__.py).

I'd prefer not to do that for now - at least not until/unless we start seeing a real need. The change is trivial and should be easy to find in case one runs into it. If we are going to add messages for all changes, we will again end up with hundreds of lines of cruft in __init__.py that are going to hang around there for years.

We already planned to have a single doc page with all these changes for 2.0; no need to do double work here. Everyone who uses nightlies can easily deal with this.

@seberg
Member

seberg commented Aug 8, 2023

Right, agreed. My only concern right now is the sum of changes being overwhelming. For some things we have good reasons to make the change, because they are things that nobody understands or every dev understands that their logic is flawed. These ones are a bit fuzzier to me: the alternative is basically a file with "legacy aliases", just a long list that we would keep long enough for users to adopt the new names without a try/except or if numpy_version >.

The branching is the real reason here: as we have said many times, asking users to change things twice isn't great. I don't think this matters for larger libraries, which are used to it, but it does matter for scripts and small libraries.
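
To illustrate the kind of branching being discussed, here is a small sketch of a helper that tries several import locations in order. first_available is a hypothetical helper, and the demo uses only the standard library so that it runs without NumPy installed.

```python
import importlib


def first_available(candidates):
    """Return the first attribute that resolves from (module, name) pairs."""
    for module_name, attr in candidates:
        try:
            return getattr(importlib.import_module(module_name), attr)
        except (ImportError, AttributeError):
            continue  # this location does not exist here; try the next one
    raise ImportError(f"none of {candidates!r} could be resolved")


# Standard-library demo: the first location does not exist, the second does.
sqrt = first_available([("math.missing_submodule", "sqrt"), ("math", "sqrt")])
```

For the case in this thread the call would be first_available([("numpy.exceptions", "ComplexWarning"), ("numpy", "ComplexWarning")]), which is equivalent to the try/except that a script supporting both old and new NumPy versions would otherwise write by hand.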

@seberg
Member

seberg commented Aug 8, 2023

In other words, the reason I am fine with it unless someone disagrees, is that I think for the users that we should care about (those who are not used to adding such branching), my guess is that there are very few who will notice the changes.

@rgommers
Member

rgommers commented Aug 8, 2023

Exactly, I agree with the "no branching for the average user" rule. I think these are examples that fall well below the usage-frequency line. Widely used aliases (e.g., absolute as an alias of abs) are used enough that we should keep them as hidden aliases. And the line is somewhere in the middle between those.

Labels
03 - Maintenance 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. Numpy 2.0 API Changes