BUG: segfault when array with dtype=np.float32 is sliced then squared #25231

RoryMB · 2023-11-22T22:13:39Z

Describe the issue:

The code example creates a large array, converts it to np.float32, and slices out a column; squaring the result then segfaults fairly consistently.

Smaller array size values (e.g. 1024 * 1024 * 2, which produces only a 32MB structure) are less likely to segfault, but still crash often.
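As a quick check of the sizes mentioned above, the arithmetic can be done without allocating anything. This is only a sketch: the `ITEMSIZE` table and the `size_mib` helper are names introduced here for illustration, and the sizes are reported in MiB (2**20 bytes).

```python
# Bytes per element for the two dtypes used in the report.
ITEMSIZE = {"float64": 8, "float32": 4}

def size_mib(rows, cols, dtype):
    """Size in MiB of a dense (rows, cols) array of the given dtype."""
    return rows * cols * ITEMSIZE[dtype] / 2**20

for rows in (1024 * 1024 * 2, 1024 * 1024 * 64):
    # np.zeros defaults to float64; .astype(np.float32) makes a half-size copy.
    print(rows, size_mib(rows, 2, "float64"), size_mib(rows, 2, "float32"))
```

The 32MB figure corresponds to the float64 array returned by np.zeros for the smaller shape; the float32 copy that is actually sliced and squared is half that size.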

Things I tried that DID cause segfaults:
np.zeros((1024*1024*64, 2)).astype(np.float32)[:, 1]**2
np.zeros((1024*1024*2, 2)).astype(np.float32)[:, 1]**2
np.zeros((1024*1024*64, 2), dtype=np.float32)[:, 1]**2
np.ones((1024*1024*64, 2), dtype=np.float32)[:, 1]**2
np.zeros((1024*1024*64*2)).reshape((-1, 2)).astype(np.float32)[:, 1]**2

Things I tried that DID NOT cause segfaults:
np.zeros((1024*1024*64, 2)).astype(np.float32)[:, 0]**2
np.zeros((1024*1024*64, 2)).astype(np.float32)[:, 1]
np.zeros((1024*1024*64, 2)).astype(np.float32)**2
np.zeros((1024*1024*64, 2))[:, 1]**2
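To see what distinguishes the crashing cases from the others, it may help to look at the memory layout of the two column views. The sketch below uses a small stand-in array (8 rows rather than 64 Mi rows, so it should not crash); the variable names are illustrative only.

```python
import numpy as np

# Small stand-in for the arrays in the report.
a = np.zeros((8, 2), dtype=np.float32)
base = a.__array_interface__["data"][0]  # address of a's first byte

for name, view in (("a[:, 0]", a[:, 0]), ("a[:, 1]", a[:, 1])):
    # Byte offset of the view's first element inside a's buffer.
    first = view.__array_interface__["data"][0] - base
    # Byte offset just past the view's last element.
    end = first + (view.shape[0] - 1) * view.strides[0] + view.itemsize
    print(name, "strides:", view.strides,
          "contiguous:", view.flags["C_CONTIGUOUS"],
          "ends at byte", end, "of", a.nbytes)
```

Both views are non-contiguous with a stride of two float32 elements, but only a[:, 1] ends exactly at the last byte of the buffer, so any read past its final element leaves the allocation.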

Reproduce the code example:

import numpy as np
np.zeros((1024 * 1024 * 64, 2)).astype(np.float32)[:, 1]**2

Error message:

zsh: segmentation fault  python sf.py

Runtime information:

M1 Max MacBook Pro
NumPy installed through pip install -U numpy

>>> numpy.show_runtime()
[{'numpy_version': '1.26.2',
  'python': '3.10.4 (v3.10.4:9d38120e33, Mar 23 2022, 17:29:05) [Clang 13.0.0 '
            '(clang-1300.0.29.30)]',
  'uname': uname_result(system='Darwin', node='MacBook-Pro.local', release='23.1.0', version='Darwin Kernel Version 23.1.0: Mon Oct  9 21:27:24 PDT 2023; root:xnu-10002.41.9~6/RELEASE_ARM64_T6000', machine='arm64')},
 {'simd_extensions': {'baseline': ['NEON', 'NEON_FP16', 'NEON_VFPV4', 'ASIMD'],
                      'found': ['ASIMDHP'],
                      'not_found': ['ASIMDFHM']}},
 {'architecture': 'armv8',
  'filepath': '/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/numpy/.dylibs/libopenblas64_.0.dylib',
  'internal_api': 'openblas',
  'num_threads': 10,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.23.dev'}]

Context for the issue:

No response


hvsesha · 2023-11-23T07:02:12Z

@RoryMB I tried this on Windows and did not get a segfault. Any advice?

seberg · 2023-11-23T08:42:22Z

Thanks for the report! I am not immediately sure what is wrong. @seiko2plus can you have a quick look?
The slicing is important (presumably you need to access the last element to trigger it). The size may be important because small arrays are allocated in arenas, so out-of-bounds access won't trigger errors reliably.

The lldb backtrace seems pretty clear, I would suspect we access one element too many, but at this point it is still a guess.

>>> np.zeros((1024 * 1024 * 64, 2)).astype(np.float32)[:, 1]**2
Process 58393 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x2a0000000)
    frame #0: 0x0000000103dd7aa4 _multiarray_umath.cpython-310-darwin.so`FLOAT_square at memory.h:57:16 [opt]
   54  	{
   55  	    switch (stride) {
   56  	    case 2:
-> 57  	        return vld2q_s32((const int32_t*)ptr).val[0];
   58  	    case 3:
   59  	        return vld3q_s32((const int32_t*)ptr).val[0];
   60  	    case 4:
Target 0: (python) stopped.
warning: _multiarray_umath.cpython-310-darwin.so was compiled with optimization - stepping may behave oddly; variables may not be available.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x2a0000000)
  * frame #0: 0x0000000103dd7aa4 _multiarray_umath.cpython-310-darwin.so`FLOAT_square at memory.h:57:16 [opt]
    frame #1: 0x0000000103dd7aa4 _multiarray_umath.cpython-310-darwin.so`FLOAT_square [inlined] npyv_loadn_f32(ptr=<unavailable>, stride=2) at memory.h:81:9 [opt]
    frame #2: 0x0000000103dd7aa4 _multiarray_umath.cpython-310-darwin.so`FLOAT_square at loops_unary_fp.dispatch.c.src:130:35 [opt]

seberg · 2023-11-23T09:54:09Z

Ahh, squinting at it, vld2q_s32 loads two vectors and de-interleaves them. So the result is ((x[0], x[2]), (x[1], x[3]))[0], giving x[0], x[2]. That is what we want for the strided access here.

But x[3] is potentially out of bounds (unless we peel the loop). Well, something like this, anyway.
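The reasoning above can be modelled in plain Python. This is a rough sketch, not the NumPy kernel: it assumes the stride-2 load reads 2 * lanes consecutive 32-bit elements and keeps every other one, with lanes=4 for a 128-bit register (the two-element picture in the comment above corresponds to lanes=2). The helper names are hypothetical.

```python
LANES = 4  # assumed: four 32-bit lanes per 128-bit NEON register

def deinterleaved_load_indices(start, lanes=LANES):
    """Flat indices touched by a modelled stride-2 load, and the ones kept."""
    touched = list(range(start, start + 2 * lanes))
    kept = touched[0::2]  # the equivalent of taking .val[0]
    return touched, kept

def last_load_overrun(n_rows, column, lanes=LANES):
    """Elements read past the end of an (n_rows, 2) buffer by the final
    full-vector load over the view buf[:, column]."""
    buffer_len = 2 * n_rows
    start = 2 * (n_rows - lanes) + column  # first lane of the last full vector
    touched, _ = deinterleaved_load_indices(start, lanes)
    return max(0, touched[-1] - (buffer_len - 1))

print(last_load_overrun(8, column=0))  # 0
print(last_load_overrun(8, column=1))  # 1
```

Under these assumptions the [:, 0] view stays inside the buffer while the [:, 1] view reads one element past it, which is consistent with only the [:, 1] cases crashing.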

seiko2plus · 2023-11-24T08:35:58Z

Thank you @RoryMB, and @seberg for demonstrating this issue. Assuming alignment of the 32-bit stride for non-contiguous memory access was kind of naive, my bad. #25243 should fix this issue.

RoryMB added the 00 - Bug label Nov 22, 2023

seberg added the component: SIMD Issues in SIMD (fast instruction sets) code or machinery label Nov 23, 2023

seberg added this to the 1.26.3 release milestone Nov 23, 2023

seiko2plus mentioned this issue Nov 24, 2023

BUG: Fix non-contiguous 32-bit memory load when ARM/Neon is enabled #25243

Merged

charris closed this as completed in #25243 Dec 1, 2023

charris mentioned this issue Dec 18, 2023

BUG: Fix non-contiguous memory load when ARM/Neon is enabled #25422

Merged
