BUG: segmentation fault running nan_to_num on a 3D complex array #25959

Closed

scottstanie opened this issue Mar 7, 2024 · 12 comments
Labels
00 - Bug, component: SIMD (issues in SIMD (fast instruction sets) code or machinery)

Comments

@scottstanie

Describe the issue:

I'm getting a segmentation fault using np.nan_to_num on a certain array. I'd like to attach it here somehow, but GitHub doesn't let me attach binary files (it's about 10 MB).

Reproduce the code example:

import numpy as np
block = np.load("nan_check.npy")
np.nan_to_num(block)

Error message:

>>> np.nan_to_num(block)
Segmentation fault: 11

Python and NumPy Versions:

>>> import sys, numpy; print(numpy.__version__); print(sys.version)
1.26.2
3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:37:07) [Clang 15.0.7 ]

Runtime Environment:

>>> import numpy; print(numpy.show_runtime())
[{'numpy_version': '1.26.2',
  'python': '3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:37:07) '
            '[Clang 15.0.7 ]',
  'uname': uname_result(system='Darwin', node='MT-317120', release='22.6.0', version='Darwin Kernel Version 22.6.0: Sun Dec 17 22:12:45 PST 2023; root:xnu-8796.141.3.703.2~2/RELEASE_ARM64_T6000', machine='arm64')},
 {'simd_extensions': {'baseline': ['NEON', 'NEON_FP16', 'NEON_VFPV4', 'ASIMD'],
                      'found': ['ASIMDHP'],
                      'not_found': ['ASIMDFHM']}},
 {'architecture': 'VORTEX',
  'filepath': '/Users/staniewi/miniconda3/envs/mapping-311/lib/libopenblas.0.dylib',
  'internal_api': 'openblas',
  'num_threads': 10,
  'prefix': 'libopenblas',
  'threading_layer': 'openmp',
  'user_api': 'blas',
  'version': '0.3.25'},
 {'filepath': '/Users/staniewi/miniconda3/envs/mapping-311/lib/libomp.dylib',
  'internal_api': 'openmp',
  'num_threads': 10,
  'prefix': 'libomp',
  'user_api': 'openmp',
  'version': None}]
None

Context for the issue:

I have tried narrowing the array down to something as small as possible, but when I limit it to subsets, the segfault goes away.

I've also tried creating an array with np.full(shape, 1j * np.nan), but that doesn't trigger the error either.

I ran $ xxd nan_check_smaller.bin binary_contents.txt and looked for a malformed number, but all the data in the array appears to have the same '0000 c0ff 0000 c07f 0000 c0ff 0000 c07f' content.

I'm running this on an M1 MacBook with numpy 1.26. If I install another environment with 1.24.4, I don't get the segfault.
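
For what it's worth, a minimal sketch of how one might rebuild an array with the exact bytes from that dump (np.full with np.nan won't reproduce it, since the dump's real parts have the NaN sign bit set); the length 512 is arbitrary, and whether this actually segfaults likely depends on the array's size and layout:

import numpy as np

# Rebuild the repeating 8-byte word from the xxd dump as complex64 data,
# assuming little-endian float32 layout:
# 0xffc00000 is a quiet NaN with the sign bit set (the real part),
# 0x7fc00000 is an ordinary quiet NaN (the imaginary part).
words = np.tile(np.array([0xffc00000, 0x7fc00000], dtype=np.uint32), 512)
block = words.view(np.float32).view(np.complex64)  # bit patterns preserved
np.nan_to_num(block)  # the call that crashed on the original array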

@ngoldbaum (Member)

Can you run python under faulthandler and/or lldb to get a traceback for the segfault?

@scottstanie (Author)

$ cat test_nan.py
import numpy as np
np.nan_to_num(np.load('nan_check_smaller.npy'))
$ python test_nan.py
Segmentation fault: 11
$ lldb python test_nan.py
(lldb) run
Process 23842 launched: '/Users/staniewi/miniconda3/envs/mapping-311/bin/python' (arm64)
Process 23842 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x157f1c000)
    frame #0: 0x0000000106c93458 _multiarray_umath.cpython-311-darwin.so`FLOAT_isnan + 704
_multiarray_umath.cpython-311-darwin.so`FLOAT_isnan:
->  0x106c93458 <+704>: ld2.4s { v24, v25 }, [x21]
    0x106c9345c <+708>: b      0x106c933f0               ; <+600>
    0x106c93460 <+712>: mov    x21, x0
    0x106c93464 <+716>: ld4.4s { v1, v2, v3, v4 }, [x21], x16
Target 0: (python) stopped.

@scottstanie (Author)

$ python -q -X faulthandler test_nan.py
Fatal Python error: Segmentation fault

Current thread 0x0000000201b76100 (most recent call first):
  File "/Users/staniewi/miniconda3/envs/mapping-311/lib/python3.11/site-packages/numpy/lib/type_check.py", line 514 in nan_to_num
  File "/Users/staniewi/Documents/Learning/OPERA/2024-02-gamma-delivery/smaller/test_nan.py", line 2 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator (total: 13)
Segmentation fault: 11

@ngoldbaum (Member)

In lldb, execute bt to get a traceback. But seeing that the crash is in isnan is a good clue.

@scottstanie (Author)

Sorry about that! I haven't used lldb before:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x14ea8c000)
  * frame #0: 0x00000001051c7458 _multiarray_umath.cpython-311-darwin.so`FLOAT_isnan + 704
    frame #1: 0x0000000105174b38 _multiarray_umath.cpython-311-darwin.so`generic_wrapped_legacy_loop + 40
    frame #2: 0x000000010517c908 _multiarray_umath.cpython-311-darwin.so`execute_ufunc_loop + 1240
    frame #3: 0x000000010517a55c _multiarray_umath.cpython-311-darwin.so`PyUFunc_GenericFunctionInternal + 2604
    frame #4: 0x0000000105178b00 _multiarray_umath.cpython-311-darwin.so`ufunc_generic_fastcall + 3056
    frame #5: 0x00000001000600e0 python`PyObject_Vectorcall + 76
    frame #6: 0x000000010015febc python`_PyEval_EvalFrameDefault + 47116
    frame #7: 0x0000000100164078 python`_PyEval_Vector + 184
    frame #8: 0x00000001000600e0 python`PyObject_Vectorcall + 76
    frame #9: 0x00000001050a79d4 _multiarray_umath.cpython-311-darwin.so`dispatcher_vectorcall + 564
    frame #10: 0x00000001000600e0 python`PyObject_Vectorcall + 76
    frame #11: 0x000000010015febc python`_PyEval_EvalFrameDefault + 47116
    frame #12: 0x000000010015372c python`PyEval_EvalCode + 220
    frame #13: 0x00000001001b953c python`run_mod + 144
    frame #14: 0x00000001001b8f9c python`_PyRun_SimpleFileObject + 1264
    frame #15: 0x00000001001b8058 python`_PyRun_AnyFileObject + 240
    frame #16: 0x00000001001de8f4 python`Py_RunMain + 3128
    frame #17: 0x00000001001df788 python`pymain_main + 1312
    frame #18: 0x0000000100003628 python`main + 56
    frame #19: 0x00000001a60cbf28 dyld`start + 2236

@ngoldbaum (Member)

No worries, remote debugging is fun. If you can share the problematic file via some other means, that would help too. npy files that don't contain pickled objects are safe to share.
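
A quick sketch of that check, using the filename from the original report: np.load leaves allow_pickle=False by default, so a plain load already verifies there are no pickled objects.

import numpy as np

# allow_pickle defaults to False (since NumPy 1.16.3), so np.load raises
# a ValueError if the .npy file contains pickled Python objects; a clean
# load means the file is plain array data and safe to share.
arr = np.load("nan_check.npy", allow_pickle=False)
print(arr.dtype, arr.shape)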

@scottstanie (Author)

Ah right, I realized it's not hard to just make a repo for it. Here you go: https://github.com/scottstanie/numpy-nan-to-num-debug/blob/main/test_nan.py

@seberg (Member) commented Mar 8, 2024

Can you share the output of arr.__array_interface__ for a crashing array? Although I think uploading it would be much better. Try compressing it; it might make the file very small (it sounds a bit like it might contain mostly one value).

FWIW, the value you mentioned, np.uint64(0x0000c0ff0000c07f).view(np.float64), is 1.048414460685683e-309, a denormal number, which I guess might be related, but it wasn't enough to reproduce the crash for me (although I didn't try with the conda-forge build).

seberg added the component: SIMD label on Mar 8, 2024
@scottstanie (Author)

> Can you share the output of arr.__array_interface__ for a crashing array? Although I think uploading it would be much better. Try compressing it; it might make the file very small (it sounds a bit like it might contain mostly one value).

Ah, you're right that I should have just zipped it:
nan_check_smaller.npy.zip

> FWIW, the value you mentioned, np.uint64(0x0000c0ff0000c07f).view(np.float64), is 1.048414460685683e-309, a denormal number

Just noting that the numbers are supposed to be nan + nanj as complex64.
Also, I didn't reproduce the crash on Linux, only on the Mac.
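
A small sketch decoding the dumped bytes in file order (assuming a little-endian machine) backs this up:

import numpy as np

# The dump's repeating bytes, in file order: 00 00 c0 ff 00 00 c0 7f.
# As little-endian float32 words these are 0xffc00000 (quiet NaN with the
# sign bit set) and 0x7fc00000 (quiet NaN), i.e. one complex64 nan+nanj.
raw = bytes.fromhex("0000c0ff0000c07f")
print(np.frombuffer(raw, dtype=np.float32))    # [nan nan]
print(np.frombuffer(raw, dtype=np.complex64))  # [nan+nanj]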

@seberg (Member) commented Mar 8, 2024

Thanks, I can reproduce this with 1.26.2, but not with 1.26.4. So I suspect you should simply upgrade; this is probably a SIMD issue that has since been fixed, although, unfortunately, I can't say which PR fixed it.

> supposed to be nan + nanj

Ah, I thought they were complex128. It could be that these are slightly odd NaN values (the real and imaginary parts differ, but I didn't check).

@scottstanie (Author)

Thanks for checking! Yes, I was able to solve this for my own purposes by upgrading. I only reported it to help keep the problem from creeping back into future versions.

@seberg (Member) commented Mar 8, 2024

Aha, nan_to_num operates on the real and imaginary parts separately, so it is the same as slicing, and this looks like #25243. Closing.

Thanks for the report, though.
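
A sketch of what "operates on the real and imaginary parts separately" means in practice: for complex input, nan_to_num fixes up x.real and x.imag, each a strided float32 view into the complex64 buffer, so the isnan loop sees non-contiguous data just as it would for a sliced float array (the zero fill value below is nan_to_num's default for NaN):

import numpy as np

# Equivalent per-part operation: out.real and out.imag are strided
# float32 views (stride 8 bytes) into the complex64 buffer, so isnan
# runs over non-contiguous memory, the same access pattern as slicing.
arr = np.full(8, np.nan + 1j * np.nan, dtype=np.complex64)
out = arr.copy()
for part in (out.real, out.imag):
    np.copyto(part, 0.0, where=np.isnan(part))
assert np.array_equal(out, np.nan_to_num(arr))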

seberg closed this as completed on Mar 8, 2024