BUG: Numpy 1.25 now has min and max on __all__ (from numpy import *) #24229

Closed
e-carlin opened this issue Jul 21, 2023 · 10 comments

e-carlin commented Jul 21, 2023

Describe the issue:

This isn't exactly a bug per se but a surprise after updating to 1.25. So let me know if this should be moved elsewhere.

In 1.25 from numpy import * brings in names like min and max which override the builtin min and max.

The specific case that led to our failure was in a dependency we use. This is the import and this is the use of min.

To fix I initially tried patching the dependency (warp) in our build system with from builtins import * after the from numpy import *. But, warp also does from gist import * which also does a from numpy import *. I could patch the dependency of a dependency but I’m wondering if there is a better "upstream" fix.
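
Why the `from builtins import *` patch has to come after the last star import can be sketched without numpy. This is a minimal illustration under stated assumptions: `fake_arraylib` and `fake_plotlib` are hypothetical stand-ins (for a library that exports `min` via `__all__`, and for a package that star-imports it and re-exports everything), and `exec` into a dict plays the role of a module's global namespace.

```python
import builtins
import sys
import types

# Hypothetical stand-in for a library whose __all__ contains "min".
fake = types.ModuleType("fake_arraylib")
fake.min = lambda a, axis=None: "array-style min"
fake.__all__ = ["min"]
sys.modules["fake_arraylib"] = fake

# Hypothetical stand-in for a package that star-imports the library and,
# having no __all__ of its own, re-exports every public name it holds.
plot = types.ModuleType("fake_plotlib")
exec("from fake_arraylib import *", plot.__dict__)
sys.modules["fake_plotlib"] = plot


def star_import(*module_names):
    """Run the star imports in order and return the resulting namespace."""
    namespace = {}
    for name in module_names:
        exec(f"from {name} import *", namespace)
    return namespace


# The star import shadows the builtin ...
shadowed = star_import("fake_arraylib")
print(shadowed["min"] is builtins.min)     # False

# ... star-importing builtins afterwards restores it ...
patched = star_import("fake_arraylib", "builtins")
print(patched["min"] is builtins.min)      # True

# ... but a later star import of a re-exporting package shadows it again.
reshadowed = star_import("fake_arraylib", "builtins", "fake_plotlib")
print(reshadowed["min"] is builtins.min)   # False
```

The last star import to bind a name wins, which is why patching only one module in the chain is not enough.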

I believe this is the change that caused the bug.

It looks like some thought has already been given to builtin names and from numpy import * (here and here).

I’m wondering if removing builtin names from __all__ or at least removing min and max is something you all are open to considering?

Reproduce the code example:

$ pip install numpy==1.25.1
<snip>
$ python
Python 3.9.15 (main, Jul 15 2023, 14:16:53)
[GCC 12.2.1 20221121 (Red Hat 12.2.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.25.1'
>>> type(min)
<class 'builtin_function_or_method'>
>>> from numpy import *
>>> type(min)
<class 'numpy._ArrayFunctionDispatcher'>
$ pip install numpy==1.24.0
<snip>
>>> numpy.__version__
'1.24.0'
>>> type(min)
<class 'builtin_function_or_method'>
>>> from numpy import *
>>> type(min)
<class 'builtin_function_or_method'>

Error message:

One specific example is this

~$ python -c 'import warp'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/vagrant/.pyenv/versions/py3/lib/python3.9/site-packages/warp/__init__.py", line 1, in <module>
    from .warp import *
  File "/home/vagrant/.pyenv/versions/py3/lib/python3.9/site-packages/warp/warp.py", line 263, in <module>
    top.ssn = int(min(sys.maxsize,1./numpy.finfo('d').eps)/npes*me + 1)
  File "/home/vagrant/.pyenv/versions/py3/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 2953, in min
    return _wrapreduction(a, np.minimum, 'min', axis, None, out,
  File "/home/vagrant/.pyenv/versions/py3/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 88, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
TypeError: 'numpy.float64' object cannot be interpreted as an integer

An example you can more easily run is

~$ python
Python 3.9.15 (main, Jul 15 2023, 14:16:53)
[GCC 12.2.1 20221121 (Red Hat 12.2.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> min(1,1./2)
0.5
>>> from numpy import *
>>> min(1,1./2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vagrant/.pyenv/versions/py3/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 2953, in min
    return _wrapreduction(a, np.minimum, 'min', axis, None, out,
  File "/home/vagrant/.pyenv/versions/py3/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 88, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
TypeError: 'float' object cannot be interpreted as an integer
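
The error follows from how the arguments bind: `numpy.min` takes the array first and `axis` second, so in `min(1, 1./2)` the value `0.5` ends up as `axis`. A minimal sketch of that binding, assuming a hypothetical stand-in `array_min` that only mirrors the leading parameters visible in the traceback (`a`, `axis`, `out`) and is not the real function:

```python
import inspect


def array_min(a, axis=None, out=None):
    """Hypothetical stand-in with the same leading parameters as numpy.min."""
    raise NotImplementedError  # only the signature matters for this sketch


# Bind the arguments exactly as the failing call passes them.
bound = inspect.signature(array_min).bind(1, 1.0 / 2)
print(dict(bound.arguments))   # {'a': 1, 'axis': 0.5}
```

The builtin treats both values as candidates, while the array-style signature reduces only `a`. With numpy, the equivalent of the original intent would be the elementwise `numpy.minimum(1, 0.5)` or a reduction over a sequence such as `numpy.min([1, 0.5])`.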

Runtime information:

1.24.0

~$ python -c 'import sys, numpy; print(numpy.__version__); print(numpy.show_runtime()); print(sys.version)'
1.24.0
[{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
                      'found': ['SSSE3', 'SSE41', 'POPCNT', 'SSE42', 'AVX'],
                      'not_found': ['F16C',
                                    'FMA3',
                                    'AVX2',
                                    'AVX512F',
                                    'AVX512CD',
                                    'AVX512_KNL',
                                    'AVX512_KNM',
                                    'AVX512_SKX',
                                    'AVX512_CLX',
                                    'AVX512_CNL',
                                    'AVX512_ICL']}},
 {'architecture': 'Sandybridge',
  'filepath': '/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/numpy.libs/libopenblas64_p-r0-15028c96.3.21.so',
  'internal_api': 'openblas',
  'num_threads': 4,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.21'}]
None
3.9.15 (main, Jul 15 2023, 14:16:53)
[GCC 12.2.1 20221121 (Red Hat 12.2.1-4)]

1.25.1

~$ python -c 'import sys, numpy; print(numpy.__version__); print(numpy.show_runtime()); print(sys.version)'
1.25.1
[{'numpy_version': '1.25.1',
  'python': '3.9.15 (main, Jul 15 2023, 14:16:53) \n'
            '[GCC 12.2.1 20221121 (Red Hat 12.2.1-4)]',
  'uname': uname_result(system='Linux', node='v2.radia.run', release='6.2.15-100.fc36.x86_64', version='#1 SMP PREEMPT_DYNAMIC Thu May 11 16:51:53 UTC 2023', machine='x86_64')},
 {'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
                      'found': ['SSSE3', 'SSE41', 'POPCNT', 'SSE42', 'AVX'],
                      'not_found': ['F16C',
                                    'FMA3',
                                    'AVX2',
                                    'AVX512F',
                                    'AVX512CD',
                                    'AVX512_KNL',
                                    'AVX512_KNM',
                                    'AVX512_SKX',
                                    'AVX512_CLX',
                                    'AVX512_CNL',
                                    'AVX512_ICL']}},
 {'architecture': 'Sandybridge',
  'filepath': '/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/numpy.libs/libopenblas64_p-r0-7a851222.3.23.so',
  'internal_api': 'openblas',
  'num_threads': 4,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.23'}]
None
3.9.15 (main, Jul 15 2023, 14:16:53)
[GCC 12.2.1 20221121 (Red Hat 12.2.1-4)]

Context for the issue:

As stated above, dependencies we use do from numpy import * which caused breakage in numpy 1.25. This went undetected until our CD caught it during a release (issue).

A cursory look at our production image reveals a few packages that could have problems:

$ docker run --rm -it radiasoft/sirepo:prod bash -c 'grep --include="*.py" -r "from numpy import \*" ~/.pyenv'
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/Forthon/_Forthon.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/IPython/core/interactiveshell.py:            Whether to do `from numpy import *` and `from pylab import *`
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/IPython/core/magics/pylab.py:            from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/IPython/core/pylabtools.py:             "from numpy import *\n")
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/gist/colorbar.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/gist/gist.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/gist/pl3d.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/gist/shapetest.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/gist/slice3.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/matplotlib/pylab.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/warp/attic/hibeamdefaults.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/warp/attic/namelist.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/warp/data_dumping/PWhdf.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/warp/diagnostics/gistdummy.py:#from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/warp/diagnostics/plarr3d.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/warp/utils/optimizer.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/warp/warp.py:from numpy import *
/home/vagrant/.pyenv/versions/3.9.15/envs/py3/lib/python3.9/site-packages/warp_parallel/__init__.py:from numpy import *

A search on github reveals 5.8k results that use from numpy import * followed by a use of min or max. Just from numpy import * has 39.7k results.
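
A search like this can be approximated locally with the standard `ast` module. The sketch below is a rough heuristic, not a complete analysis: the function name `find_shadowed_calls` and its parameters are made up for illustration, and it only compares line numbers, so it ignores scoping and any later rebinding of the names.

```python
import ast


def find_shadowed_calls(source, module="numpy", names=("min", "max")):
    """Return (line, name) for bare calls to `names` after `from module import *`."""
    tree = ast.parse(source)
    star_lines = [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ImportFrom)
        and node.module == module
        and any(alias.name == "*" for alias in node.names)
    ]
    if not star_lines:
        return []
    first_star = min(star_lines)
    hits = [
        (node.lineno, node.func.id)
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)   # bare `min(...)`, not `numpy.min(...)`
        and node.func.id in names
        and node.lineno > first_star
    ]
    return sorted(hits)


example = (
    "import sys\n"
    "from numpy import *\n"
    "ssn = int(min(sys.maxsize, 1.0) + 1)\n"
    "top = numpy.max([1, 2])\n"
)
print(find_shadowed_calls(example))   # [(3, 'min')]
```

Qualified calls such as `numpy.max(...)` are deliberately skipped, since they are unaffected by what the star import binds.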

I know your docs say not to use from numpy import * and as a general rule of thumb it is frowned on (e.g. the Google Python style guide). I personally use qualified imports but, as evidenced by the searches, many people don't follow this advice.

I realize those searches are imperfect and won't cause a break in all cases. But, I think the use of from numpy import * could be widespread enough that care needs to be put into what is brought in when doing so.

I don't know what the "best" fix is. It may just be I need to patch the dependencies I use. But, I wanted to float the problem out to get your thoughts.

Thanks for your time.

seberg added this to the 1.25.2 release milestone Jul 21, 2023
@rgommers
Member

Thanks for the clear report on this @e-carlin. I think we can filter min and max out of __all__ at least for the upcoming 1.25.x and 1.26.x releases, in order to give the problematic packages here time to fix their code. I see @seberg already added this to the 1.25.2 milestone, so it looks like we can land it there.

For 2.0 I'd much prefer to keep things as they are now in main, because min and max are important functions and it's correct for them to be in __all__.

If you could file bug reports against the dependencies you are having problems with, that would be helpful.

Also, this tool to automatically replace all * imports with explicit imports may be useful to save folks time when fixing this and other star imports: https://github.com/asmeurer/removestar.

@seberg
Member

seberg commented Jul 22, 2023

@rgommers yes, for backporting it seemed to make sense (btw. PR welcome).

TBH, I am not sure I am worried about keeping the __all__ exclusions in the future. from numpy import * will always have these quirks. I am not sure that overriding the builtins is really much better in the long term.
So, I am not sure that breaking sloppy libs actually moves us forward much?

@rgommers
Member

__all__ contains the list of public names of a module. E.g. from https://docs.python.org/3/reference/simple_stmts.html:

> The public names defined by a module are determined by checking the module’s namespace for a variable named __all__; if defined, it must be a sequence of strings which are names defined or imported by that module. The names given in __all__ are all considered public and are required to exist. If __all__ is not defined, the set of public names includes all names found in the module’s namespace which do not begin with an underscore character ('_'). __all__ should contain the entire public API. It is intended to avoid accidentally exporting items that are not part of the API (such as library modules which were imported and used within the module).

There are a few different ways to try and determine public API, rather than a one-and-only way - but checking __all__ is certainly a popular one.
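
The two rules in the quoted passage can be seen side by side in a small sketch. The module names `demo_with_all` and `demo_without_all` and the helpers are hypothetical, built from source text only so the example stays self-contained:

```python
import sys
import types


def make_module(name, source):
    """Create and register a throwaway module from source text."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


def star_exports(name):
    """Names that `from name import *` binds in a fresh namespace."""
    namespace = {}
    exec(f"from {name} import *", namespace)
    return sorted(k for k in namespace if k != "__builtins__")


make_module(
    "demo_with_all",
    "import os\n__all__ = ['public']\ndef public(): pass\ndef helper(): pass\n",
)
make_module(
    "demo_without_all",
    "import os\ndef public(): pass\ndef _private(): pass\n",
)

print(star_exports("demo_with_all"))      # ['public']
print(star_exports("demo_without_all"))   # ['os', 'public']
```

Without `__all__`, the imported `os` module leaks into the star import, which is the accidental export the documentation warns about.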

So now that we're finally going to have a clear public/private split, we should not carry weird hacks but have it be correct in my opinion.

@seberg
Member

seberg commented Jul 22, 2023

Well, but "public names" there refers explicitly to those exported by from module import *, so don't really think we have to worry about public vs. non-public and should focus only on the from numpy import * behavior.
There is also __dir__ which is what you see in tab completion, etc. which clearly will list these.

@rgommers
Member

> so don't really think we have to worry about public vs. non-public

I don't think that is true. For introspection tooling for example, __all__ and other dunder methods are actually important. I've personally written such tooling that uses __all__. And in the past month alone I've gotten "is this a bug" questions from Numba and PyTorch devs about issues with __module__ and __signature__.

So I would say that all these dunder attributes/methods have clearly defined meanings and expectations, and a major release is as good a time as any to fix them. Especially if the fix was already made, I don't really want to see it reverted because some old packages with bad habits relied on the technically incorrect contents.

In a more general way, there are many other things in __init__.py that are very hacky and I'd like to clean up. Explicit del statements, all the * imports, attributes like oldnumeric that are ad hoc defined for legacy reasons, the distutils-specific __NUMPY_SETUP__ hack, etc. Getting __init__.py to a clean state plus moving some heavy imports to lazy (with __getattr__ and __dir__) will be useful for both maintainability and import times.
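
For readers unfamiliar with the lazy-import mechanism mentioned here, module-level `__getattr__` and `__dir__` (PEP 562) can be sketched as follows. This is an illustration only and not how numpy's `__init__.py` is written: `lazy_pkg` and `_LAZY_SUBMODULES` are hypothetical names, and the stdlib `json.decoder` stands in for a heavy submodule.

```python
import types

LAZY_SOURCE = '''
import importlib

# Hypothetical mapping: attribute name -> module imported on first access.
_LAZY_SUBMODULES = {"decoder": "json.decoder"}

def __getattr__(name):
    # Called only when normal attribute lookup on the module fails.
    if name in _LAZY_SUBMODULES:
        module = importlib.import_module(_LAZY_SUBMODULES[name])
        globals()[name] = module   # cache so later lookups skip __getattr__
        return module
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

def __dir__():
    # Advertise lazy names to dir() and tab completion before they load.
    return sorted(set(globals()) | set(_LAZY_SUBMODULES))
'''

pkg = types.ModuleType("lazy_pkg")
exec(LAZY_SOURCE, pkg.__dict__)

loaded_before = "decoder" in vars(pkg)
listed_before = "decoder" in dir(pkg)
decoder_cls = pkg.decoder.JSONDecoder      # first access triggers the import
loaded_after = "decoder" in vars(pkg)

print(loaded_before, listed_before, loaded_after)   # False True True
```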

> (btw. PR welcome).

I will try to do that today.

@seberg
Member

seberg commented Jul 22, 2023

OK, I am sure it can be annoying, just not sure about it being worth too much churn. OTOH, it took a few weeks for an issue, so maybe it isn't that much churn.

rgommers added a commit to rgommers/numpy that referenced this issue Jul 22, 2023
This is a workaround for breakage in downstream packages that
do `from numpy import *`, see gh-24229.

Note that there are other builtins that are contained in `__all__`:

```
>>> import builtins
>>> import numpy as np
>>> for s in dir(builtins):
...     if s in np.__all__:
...         print(s)
...
all
any
divmod
sum
```

Those were already there before the change in 1.25.0, and can stay.
The downstream code should be fixed before the numpy 2.0 release.

Closes gh-24229
@rgommers
Member

I checked all builtins, and round is also a new entry, so I removed it too in gh-24234. The other four builtins (all, any, sum, divmod) were already present in __all__ for longer, so won't be a problem for anyone.
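
The same kind of check can be written as a small reusable helper and pointed at any module. A minimal sketch, assuming a hypothetical function name `builtin_names_exported` and a stand-in module `demo_arraylib` in place of a real library:

```python
import builtins
import types


def builtin_names_exported(module):
    """Names a star import of `module` would bind that are also builtins."""
    exported = getattr(module, "__all__", None)
    if exported is None:
        # No __all__: a star import takes every name without a leading underscore.
        exported = [name for name in vars(module) if not name.startswith("_")]
    return sorted(set(exported) & set(dir(builtins)))


# Hypothetical stand-in module; its __all__ is made up for the example.
demo = types.ModuleType("demo_arraylib")
demo.__all__ = ["array", "sum", "min", "round"]
print(builtin_names_exported(demo))   # ['min', 'round', 'sum']
```

Running such a helper against a library before and after an upgrade shows which builtin names newly appear in its star import.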

Let's see if more complaints come in, but I'd say it's been 5 weeks already indeed, so anyone affected should have noticed by now - and issues should then be filed to get those fixed.

@mhvk
Contributor

mhvk commented Jul 23, 2023

👍 to keeping everything in __all__. E.g., sphinx builds also rely on it.

@seberg
Member

seberg commented Jul 24, 2023

Closing, reverted for 1.25 (and thus 1.26) and unless there proves to be more disruption I expect we will keep it in main, so libraries failing here will have to be fixed.

seberg closed this as completed Jul 24, 2023
@e-carlin
Author

Thanks all for the quick response and fix!

I'll start working on submitting patches to our dependencies in anticipation of v2.0
