BUG: dtype changed from float64 to int64 in scipy discrete_rv #27054

oscarbenjamin · 2024-07-26T17:02:05Z

Describe the issue:

The issue has appeared with the NumPy nightly wheels over the last few days. I don't know the exact cause because it happens inside SciPy, but an array now has a different dtype when it is passed to the _pmf method shown below, which causes it to fail with:

  File "/home/oscar/current/active/sympy/t.py", line 6, in _pmf
    return (2/3)*3**(1 - i)
                 ~^^~~~~~~~
ValueError: Integers to negative integer powers are not allowed.

Reproduce the code example:

from scipy.stats import rv_discrete

class rv_exponential(rv_discrete):
    def _pmf(self, i):
        print(i.dtype, i.shape)
        return (2/3)*3**(1 - i)

rv = rv_exponential(a=0.0, b=float('inf'))

print(rv.rvs())

Error message:

$ python t.py 
int64 (28,)
Traceback (most recent call last):
  File "/home/oscar/current/active/sympy/t.py", line 10, in <module>
    print(rv.rvs())
          ^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/scipy/stats/_distn_infrastructure.py", line 3430, in rvs
    return super().rvs(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/scipy/stats/_distn_infrastructure.py", line 1108, in rvs
    vals = self._rvs(*args, size=size, random_state=random_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/scipy/stats/_distn_infrastructure.py", line 1034, in _rvs
    Y = self._ppf(U, *args)
        ^^^^^^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/scipy/stats/_distn_infrastructure.py", line 1049, in _ppf
    return self._ppfvec(q, *args)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/numpy/lib/_function_base_impl.py", line 2470, in __call__
    return self._call_as_normal(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/numpy/lib/_function_base_impl.py", line 2463, in _call_as_normal
    return self._vectorize_call(func=func, args=vargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/numpy/lib/_function_base_impl.py", line 2553, in _vectorize_call
    outputs = ufunc(*inputs)
              ^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/scipy/stats/_distn_infrastructure.py", line 3057, in _drv2_ppfsingle
    qb = self._cdf(b, *args)
         ^^^^^^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/scipy/stats/_distn_infrastructure.py", line 3396, in _cdf
    return self._cdfvec(k, *args)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/numpy/lib/_function_base_impl.py", line 2470, in __call__
    return self._call_as_normal(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/numpy/lib/_function_base_impl.py", line 2463, in _call_as_normal
    return self._vectorize_call(func=func, args=vargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/numpy/lib/_function_base_impl.py", line 2553, in _vectorize_call
    outputs = ufunc(*inputs)
              ^^^^^^^^^^^^^^
  File "/home/oscar/.pyenv/versions/sympy-3.12.git/lib/python3.12/site-packages/scipy/stats/_distn_infrastructure.py", line 3392, in _cdf_single
    return np.sum(self._pmf(m, *args), axis=0)
                  ^^^^^^^^^^^^^^^^^^^
  File "/home/oscar/current/active/sympy/t.py", line 6, in _pmf
    return (2/3)*3**(1 - i)
                 ~^^~~~~~~~
ValueError: Integers to negative integer powers are not allowed.

Python and NumPy Versions:

This is seen with the NumPy nightly wheels since a few days ago.

With released NumPy 2.0.1 the code runs to completion with

$ python t.py 
float64 (11,)
float64 (1,)
float64 (6,)
float64 (3,)
float64 (2,)
0

Runtime Environment:

No response

Context for the issue:

This comes from a SciPy issue: scipy/scipy#21272 and a SymPy issue sympy/sympy#26862

oscarbenjamin · 2024-07-26T19:58:37Z

Is it only possible to build numpy main with cython master right now?

I was getting this error until I installed the Cython nightly build:

          action(self, namespace, argument_values, option_string)
        File "/home/oscar/.pyenv/versions/3.12.0/envs/sympy-3.12.git/lib/python3.12/site-packages/Cython/Compiler/CmdLine.py", line 22, in __call__
          directives = Options.parse_directive_list(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/oscar/.pyenv/versions/3.12.0/envs/sympy-3.12.git/lib/python3.12/site-packages/Cython/Compiler/Options.py", line 533, in parse_directive_list
          raise ValueError('Unknown option: "%s"' % name)
      ValueError: Unknown option: "freethreading_compatible"
      [139/179] Compiling C object numpy/_core/lib_simd.dispatch.h_baseline.a.p/meson-generated__simd.dispatch.c.o
      [140/179] Compiling C object numpy/_core/lib_simd.dispatch.h_SSE42.a.p/meson-generated__simd.dispatch.c.o
      [141/179] Compiling C object numpy/_core/lib_simd.dispatch.h_AVX512_SKX.a.p/meson-generated__simd.dispatch.c.o
      ninja: build stopped: subcommand failed.
      [end of output]

oscarbenjamin · 2024-07-26T20:32:25Z

Bisected to 6c91567 from gh-26766.

CC @mtsokol

mtsokol · 2024-07-28T12:01:29Z

I need to take a look at what is going on in SciPy, but I think SciPy relied on the output of floor, ceil, and trunc being cast to float64, which IMO is incorrect. So I think this requires a fix on the SciPy side: perform an explicit cast to float64.
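A minimal sketch of the explicit cast mentioned above. The array `k` and its values are illustrative and are not taken from SciPy's source. The dtype that `np.floor` returns for integer input depends on the NumPy version (float64 with released 2.0.1 as reported in this issue, apparently an integer dtype after gh-26766), so no expected output is shown for that line.

```python
import numpy as np

# Illustrative integer support points; not taken from SciPy's code.
k = np.arange(4)

# The dtype printed here depends on the NumPy version in use.
floored = np.floor(k)
print(floored.dtype)

# An explicit cast yields float64 on either side of the change.
floored_f64 = np.floor(k).astype(np.float64)
print(floored_f64.dtype)  # float64

# With a float array, negative exponents are allowed.
pmf = (2 / 3) * 3 ** (1 - floored_f64)
print(pmf)
```

Casting at the point where the array is produced keeps user-supplied methods such as `_pmf` unchanged, which is the direction suggested in this comment; whether SciPy does it this way is not confirmed here.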

Siddharth-Latthe-07 · 2024-07-29T13:41:37Z

@oscarbenjamin The issue you're experiencing appears to be due to a change in behavior in the recent NumPy nightly builds. The error message indicates that the i array, which is passed to your _pmf method, has an integer data type (int64), causing an error when a negative integer power operation is attempted.
A possible workaround is to ensure that the i array is always treated as a float within the _pmf method.
Here is the updated _pmf method with this fix:

from scipy.stats import rv_discrete
import numpy as np

class rv_exponential(rv_discrete):
    def _pmf(self, i):
        i = np.asarray(i, dtype=float)  # Ensure `i` is a float array
        print(i.dtype, i.shape)
        return (2/3)*3**(1 - i)

rv = rv_exponential(a=0.0, b=float('inf'))

print(rv.rvs())

Hope this helps
Thanks

oscarbenjamin · 2024-07-29T13:51:28Z

The workaround is not as trivial as it might seem. In context, the code here is generated by a code printer from a symbolic expression in SymPy:

In [27]: from sympy import *

In [28]: i = Symbol('i')

In [29]: e = (S(2)/3)**3 * 3**(1 - i)

In [30]: e
Out[30]: 
   1 - i
8⋅3     
────────
   27   

In [31]: f = lambdify(i, e)

In [32]: f
Out[32]: <function _lambdifygenerated(i)>

In [34]: print(f.__doc__)
Created with lambdify. Signature:

func(i)

Expression:

8*3**(1 - i)/27

Source code:

def _lambdifygenerated(i):
    return (8/27)*3**(1 - i)


Imported modules:

The code printers would have to be modified to handle this somehow and there would need to be some UI for a caller of lambdify to say if they want this to happen or not.

In any case, the first question is really whether it should be expected that the array would have an integer type at all. For the _pmf method it is expected that the argument values will be integers, but apart from degenerate cases with 0 or 1 the function is basically guaranteed to return non-integer values, so I don't think it makes sense to use an integer array here (although others may disagree). So far this change to integer type does not seem to be intentional in SciPy's case.
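To illustrate why the dtype matters here, the following sketch calls the same expression as the repro's `_pmf` with a float array and with an integer array. The standalone function `pmf` and the input values are illustrative only; they stand in for the generated code and are not what SciPy passes internally.

```python
import numpy as np

def pmf(i):
    # Same expression as the repro's _pmf, written as a plain function.
    return (2 / 3) * 3 ** (1 - i)

# Float array: negative exponents give fractional results.
float_result = pmf(np.array([0.0, 1.0, 2.0]))
print(float_result)

# Integer array: 1 - i contains a negative integer, so NumPy raises.
try:
    pmf(np.array([0, 1, 2]))
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

The values are integers in both calls; only the array dtype differs, and that alone decides whether the integer power raises.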

Siddharth-Latthe-07 · 2024-07-29T13:59:07Z

Given that the _pmf method is expected to handle integer inputs, but the function generated by lambdify from SymPy uses floating-point arithmetic, there is a need to bridge the gap between the expectations of integer inputs and the resulting floating-point computations.
To address the issue without having to manually convert the array in every instance of _pmf, we can take a two-pronged approach:

1. Modify the _pmf method to handle the type conversion transparently.
2. Investigate and consider reporting the potential unintended change in SciPy to maintain consistency.

Transparent Type Conversion in _pmf:
Here's how you can modify the _pmf method to ensure that it handles the input conversion internally. This will make sure that the method operates correctly regardless of whether the input is an integer array or not.

from scipy.stats import rv_discrete
import numpy as np

class rv_exponential(rv_discrete):
    def _pmf(self, i):
        i = np.asarray(i, dtype=float)  # Ensure `i` is a float array
        print(i.dtype, i.shape)
        return (2/3) * 3 ** (1 - i)

rv = rv_exponential(a=0.0, b=float('inf'))

print(rv.rvs())

While modifying the code printer in SymPy or providing a user interface in lambdify for type handling would be a more extensive change, the immediate issue can be mitigated by ensuring that the _pmf method correctly handles type conversion. This approach is less intrusive and maintains compatibility with existing code.

oscarbenjamin · 2024-07-29T15:26:56Z

@Siddharth-Latthe-07 I didn't want to say anything at first but your comments are clearly generated by an LLM with minimal editing and are not relevant to the discussion in this issue.

It might be reasonable for ChatGPT to offer this kind of advice for a novice user who wants to fix some simple code but that is not the situation here. I don't need ChatGPT to tell me how to modify the repro code to avoid this issue: I wrote that code deliberately so that it would demonstrate the issue.

There has been a change in NumPy which now means that SymPy's usage of a SciPy function does not work any more. A change is now needed in at least one of NumPy, SciPy or SymPy and considering what should be changed or not is the purpose of this and the related SciPy and SymPy issues. The first thing that I want to establish is if NumPy and SciPy are going to keep this changed behaviour.

mattip · 2024-07-29T15:55:40Z

@Siddharth-Latthe-07 any more of these useless comments and we will need to report you. See similar comments here and here. Many of us follow the git firehose for this repo, and do not appreciate getting spammed.

lucascolley · 2024-07-29T16:24:44Z

Feel free to close this issue, indeed SciPy just needs to adapt to the changed type promotion of floor 👍

oscarbenjamin · 2024-07-29T16:28:10Z

Thanks, I'll close this then.

oscarbenjamin added the 00 - Bug label Jul 26, 2024

oscarbenjamin mentioned this issue Jul 26, 2024

BUG: dtype changed for argument to rv_discrete._pmf scipy/scipy#21272

Closed

oscarbenjamin closed this as completed Jul 29, 2024

oscarbenjamin mentioned this issue Jul 31, 2024

Required checks on GitHub Actions sympy/sympy#26890

Open
