None + np.longdouble(0) produces RecursionError (segfaults on ipython) #18548

Closed
aarchiba opened this issue Mar 5, 2021 · 10 comments · Fixed by #18691

aarchiba (Contributor) commented Mar 5, 2021

Arithmetic mixing None and np.longdouble (with None on the left) causes a RecursionError (and segfaults in ipython). It looks a little like #1296, but from the other side?

Reproducing code example:

import numpy as np
None + np.longdouble(0)

Error message:

$ python ../zot.py 
Traceback (most recent call last):
  File "../zot.py", line 2, in <module>
    None + np.longdouble(0)
RecursionError: maximum recursion depth exceeded while calling a Python object

NumPy/Python version information:

In [1]: import sys, numpy; print(numpy.__version__, sys.version)
1.19.1 3.8.5 (default, Jul 28 2020, 12:59:40) 
[GCC 9.3.0]
aarchiba changed the title from "None + np.longdouble(0) segfaults" to "None + np.longdouble(0) produces RecursionError (segfaults on ipython)" on Mar 5, 2021
aarchiba (Contributor, Author) commented Mar 5, 2021

Sorry, I found this with ipython/Jupyter notebooks, where it's a segfault; under plain python it's still a RecursionError, which is unhelpful but at least catchable. If that's accepted behaviour, please close the bug and I'll take it up with the ipython folks.

seberg (Member) commented Mar 5, 2021

This is a longstanding bug; I was semi-aware that such corner cases exist with longdouble, but I'm not sure I knew there was a decent chance of a segfault...
The scalar math code is a bit of a jungle. longdouble must not use the array fallback together with "object", but it apparently does when the object comes first.

aarchiba (Contributor, Author) commented Mar 6, 2021

I should say this arose because it was causing segmentation faults with unset parameters in PINT, so it is actually causing problems for users. We're going to put some checks in place regardless, but it's not just a theoretical problem. See nanograv/PINT#993.

seberg (Member) commented Mar 6, 2021

OK, I had a bigger cleanup where I noticed this type of bug, but then deferred it because it was distracting me from more urgent things. The only "workaround" you could do is implement the operators for longdouble yourself (or explicitly raise), but raising explicitly is weird...
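Until a fix lands, a downstream project can sidestep the recursion with an explicit guard. This is a minimal sketch of that workaround; `guarded_add` is a hypothetical helper, not NumPy API:

```python
import operator
import numpy as np

def guarded_add(a, b):
    """Hypothetical downstream guard (not part of NumPy): reject None
    operands before they reach the buggy longdouble scalar-math path."""
    if a is None or b is None:
        raise TypeError(
            f"unsupported operand types: "
            f"{type(a).__name__} and {type(b).__name__}"
        )
    return operator.add(a, b)

print(guarded_add(np.longdouble(1), np.longdouble(2)))  # 3.0
```

The same check can be applied to any operator; PINT-style code with possibly-unset (None) parameters would route arithmetic through such a guard.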

I looked at it a bit now, and I think this is a cleanup that also fixes this issue:

diff --git a/numpy/core/src/umath/scalarmath.c.src b/numpy/core/src/umath/scalarmath.c.src
index 86dade0f1..394252cc8 100644
--- a/numpy/core/src/umath/scalarmath.c.src
+++ b/numpy/core/src/umath/scalarmath.c.src
@@ -624,6 +624,16 @@ _@name@_convert_to_ctype(PyObject *a, @type@ *arg1)
             Py_DECREF(descr1);
             return 0;
         }
+#if @TYPE@ == NPY_LONGDOUBLE || @TYPE@ == NPY_CLONGDOUBLE
+        else if (descr1->type_num == NPY_OBJECT) {
+            /*
+             * -3 indicates deferring. Other types get converted to their
+             * python version later, but longdouble cannot do that.
+             */
+            Py_DECREF(descr1);
+            return -3;
+        }
+#endif
         else {
             Py_DECREF(descr1);
             return -1;
@@ -638,7 +648,11 @@ _@name@_convert_to_ctype(PyObject *a, @type@ *arg1)
         Py_DECREF(temp);
         return retval;
     }
+#if @TYPE@ == NPY_LONGDOUBLE || @TYPE@ == NPY_CLONGDOUBLE
+    return -3;
+#else
     return -2;
+#endif
 }
 
 /**end repeat**/
@@ -704,33 +718,13 @@ _@name@_convert_to_ctype(PyObject *a, @type@ *arg1)
 /**begin repeat
  * #name = byte, ubyte, short, ushort, int, uint,
  *         long, ulong, longlong, ulonglong,
- *         half, float, double, cfloat, cdouble#
+ *         half, float, double, longdouble,
+ *         cfloat, cdouble, clongdouble#
  * #type = npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int, npy_uint,
  *         npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_half, npy_float, npy_double, npy_cfloat, npy_cdouble#
- */
-static int
-_@name@_convert2_to_ctypes(PyObject *a, @type@ *arg1,
-                           PyObject *b, @type@ *arg2)
-{
-    int ret;
-    ret = _@name@_convert_to_ctype(a, arg1);
-    if (ret < 0) {
-        return ret;
-    }
-    ret = _@name@_convert_to_ctype(b, arg2);
-    if (ret < 0) {
-        return ret;
-    }
-    return 0;
-}
-/**end repeat**/
-
-/**begin repeat
- * #name = longdouble, clongdouble#
- * #type = npy_longdouble, npy_clongdouble#
+ *         npy_half, npy_float, npy_double, npy_longdouble,
+ *         npy_cfloat, npy_cdouble, npy_clongdouble#
  */
-
 static int
 _@name@_convert2_to_ctypes(PyObject *a, @type@ *arg1,
                            PyObject *b, @type@ *arg2)
@@ -741,15 +735,11 @@ _@name@_convert2_to_ctypes(PyObject *a, @type@ *arg1,
         return ret;
     }
     ret = _@name@_convert_to_ctype(b, arg2);
-    if (ret == -2) {
-        ret = -3;
-    }
     if (ret < 0) {
         return ret;
     }
     return 0;
 }
-
 /**end repeat**/

But it needs some careful additional longdouble/clongdouble tests (also mixing the two!) to make sure. I think this is a much clearer solution to begin with, although the way this whole code defers is a mess (as I said). It should defer based on hierarchy: always defer if the other type cannot cast. In fact, it would already be better to reverse the logic here: always defer, except for NPY_OBJECT when the dtype is not longdouble/clongdouble. But as I said, I had a larger attempt where I started on that, which is probably a worthwhile restructure in its own right...

aarchiba (Contributor, Author) commented Mar 7, 2021

Do I understand correctly that numpy uses hypothesis in its test suite? That might be a fairly painless way to test all the various data type/operator combinations. Even plain pytest.mark.parametrize might be okay, at the risk of a combinatorial explosion.
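A minimal sketch of that combinatorial check, written as a plain loop (the same grid could be generated with pytest.mark.parametrize or a hypothesis strategy). It assumes a NumPy release containing the fix, where mixing None with (c)longdouble scalars raises a clean TypeError instead of recursing:

```python
import operator
import numpy as np

# Grid of operators and affected scalar types to exercise both operand orders.
ops = [operator.add, operator.sub, operator.mul, operator.truediv]
scalars = [np.longdouble(0), np.clongdouble(0)]

for op in ops:
    for scalar in scalars:
        for a, b in [(None, scalar), (scalar, None)]:
            try:
                op(a, b)
            except TypeError:
                continue  # a clean TypeError is the desired behaviour
            raise AssertionError(f"{op.__name__}({a!r}, {b!r}) did not raise")
```

On an affected NumPy the loop would instead hit the RecursionError (or crash the interpreter), which is why such tests are best run in a subprocess-isolated or fixed environment.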

seberg (Member) commented Mar 7, 2021

@aarchiba yeah, we use both (hypothesis not a lot, but that is not because we are actively avoiding it). A combinatorial explosion is probably OK here; if it gets very slow, mark the test as slow. (Hypothesis also seems to have a tendency to aim for about 1 s per test, which is slow as well.) Fortunately, this code is numerical only, which should make writing tests fairly manageable.

Were you hoping for a backported fix, or even interested in making a PR? Scalar math makes me slightly nervous, so I would like to ensure good tests if we backport.

aarchiba (Contributor, Author) commented Mar 8, 2021

I don't think a backport is essential, though if the fix is easy I wouldn't complain. I'm willing to try my hand at a PR at some point, but I can't work on it right away; I don't know my way around the scalar math code at all. (I got lost in there a while ago trying to figure out how to add binary128 floats.)

aarchiba (Contributor, Author) commented

I wrote a couple of hypothesis tests that check for this problem. Unfortunately, the testing environment segfaults instead of giving a RecursionError, but I can consistently get crashes from op(object, scalar) and occasionally from op(scalar, object). Tests (and, if I can manage it, a fix) are in this branch: main...aarchiba:scalar_object_crash_fix

aarchiba (Contributor, Author) commented

I think I have fixed everything except complex long doubles when the modulo operator is applied. This shouldn't work, since remainders are not defined for complex numbers, but I can't quite see where the infinite loop happens. It is somewhere in the generated code in numpy/core/src/umath/scalarmath.c.src, probably because the .nb_remainder slot is set to NULL; this may result in attempts to look up the operation on the other operand. One fix would be to write something that just raises a TypeError, though the behaviour of np.remainder(1,1j) suggests that a different casting rule might make a difference.
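For reference, plain Python already rejects modulo on complex operands with a clean TypeError, which is the behaviour the clongdouble scalar path should mirror instead of looping (a small sketch using only builtin types; np.remainder on mixed real/complex inputs is reported to fail similarly, though the exact message differs):

```python
# Python's own complex type defines no remainder, so % raises TypeError
# immediately rather than deferring back and forth between operands.
try:
    1 % 1j
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```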

aarchiba added a commit to aarchiba/numpy that referenced this issue Mar 28, 2021
The operation None*np.longdouble(3) was causing infinite recursion as it
searched for the appropriate conversion method. This resolves that, both
for general operations and for remainders specifically (they fail in a
subtly different way).

Closes numpy#18548
aarchiba (Contributor, Author) commented

PR #18691 fixes this. The strategy for remainders of complex long doubles is slightly questionable, but otherwise the PR tests for the problem and fixes it.

charris pushed a commit to charris/numpy that referenced this issue Apr 13, 2021
The operation None*np.longdouble(3) was causing infinite recursion as it
searched for the appropriate conversion method. This resolves that, both
for general operations and for remainders specifically (they fail in a
subtly different way).

Closes numpy#18548