BUG: Fix FP overflow error in division when the divisor is scalar #25129

Merged: 2 commits into numpy:main, Nov 15, 2023

seiko2plus
Member

The bug occurred when a SIMD partial load was involved:
the remaining lanes of the dividend vector were filled
with ones, which raised overflow warnings
when the divisor was denormal.

This patch fills the remaining lanes with NaNs rather
than ones to fix this issue.

closes #25097
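
The mechanism described above can be sketched in plain C without any SIMD intrinsics. This is an illustration only, not NumPy's code: the 4-lane width, the `LANES` macro, and the helper `count_spurious_overflows` are hypothetical names made up for this sketch. It pads a partially loaded "vector" with a fill value, divides every lane by a scalar, and counts padded lanes where a finite fill value produced an infinite quotient. A finite operand yielding an infinite result is the condition under which IEEE 754 signals overflow.

```c
#include <float.h>   /* FLT_MIN */
#include <math.h>    /* NAN, isinf, isfinite */
#include <stddef.h>  /* size_t */

#define LANES 4  /* hypothetical vector width standing in for a SIMD register */

/*
 * Emulate a partial load followed by a vector-by-scalar division.
 * Only the first n lanes (n <= LANES) hold real data; the rest are
 * padded with `fill`. Returns how many padded lanes went from a finite
 * value to an infinite quotient, i.e. overflowed.
 */
static int count_spurious_overflows(const float *src, size_t n,
                                    float fill, float divisor)
{
    float lanes[LANES];
    int spurious = 0;
    size_t i;

    /* Partial load: copy the valid elements, pad the remaining lanes. */
    for (i = 0; i < LANES; i++) {
        lanes[i] = (i < n) ? src[i] : fill;
    }

    /* The vector division touches every lane, padding included. */
    for (i = 0; i < LANES; i++) {
        /* volatile forces the quotient to be rounded to float */
        volatile float q = lanes[i] / divisor;
        if (i >= n && isfinite(lanes[i]) && isinf(q)) {
            spurious++;
        }
    }
    return spurious;
}
```

With three valid elements and a subnormal divisor such as `FLT_MIN / 1024.0f`, padding with `1.0f` makes the unused lane overflow, while padding with `NAN` does not, because NaN divided by any value stays NaN and signals no overflow.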

@seiko2plus added the 00 - Bug, 09 - Backport-Candidate, and component: SIMD labels Nov 13, 2023
@mattip
Member

mattip commented Nov 13, 2023

Is there a test we could add for this?

@mattip
Member

mattip commented Nov 14, 2023

The new test is emitting a warning on the armhf run

  The decision is based on the lack of native SIMD support for
  this operation on the armhf architecture, and on the difficulty
  of evaluating the performance and accuracy benefits of the emulated
  SIMD intrinsic versus native scalar division.
@seiko2plus
Member Author

The new test is emitting a warning on the armhf run

I ended up disabling the SIMD single-precision division optimization on armv7 due to the reasons described in the following C comment:

#if @is_div@ && defined(NPY_HAVE_NEON) && !NPY_SIMD_F64
/**
 * The SIMD branch is disabled on armhf(armv7) due to the absence of native SIMD
 * support for single-precision floating-point division. Only scalar division is
 * supported natively, and without hardware for performance and accuracy comparison,
 * it's challenging to evaluate the benefits of emulated SIMD intrinsic versus
 * native scalar division.
 *
 * The `npyv_div_f32` universal intrinsic emulates the division operation using an
 * approximate reciprocal combined with 3 Newton-Raphson iterations for enhanced
 * precision. However, this approach has limitations:
 *
 * - It can cause unexpected floating-point overflows in special cases, such as when
 *   the divisor is subnormal (refer: https://github.com/numpy/numpy/issues/25097).
 *
 * - The precision may vary between the emulated SIMD and scalar division due to
 *   non-uniform branches (non-contiguous) in the code, leading to precision
 *   inconsistencies.
 *
 * - Considering the necessity of multiple Newton-Raphson iterations, the performance
 *   gain may not sufficiently offset these drawbacks.
 */
#elif @VECTOR@
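
To make the reciprocal-based division in that comment concrete, here is a scalar sketch of the same idea: estimate 1/d, refine it with three Newton-Raphson steps of the form x[n+1] = x[n] * (2 - d * x[n]), then multiply by the dividend. It is not the `npyv_div_f32` implementation. The helper names are hypothetical, and `rough_reciprocal` is a crude bit-trick stand-in (assumed here, valid for positive `d` only) for the hardware reciprocal estimate, so its special-case behaviour differs from real NEON instructions.

```c
#include <math.h>    /* isfinite */
#include <stdint.h>  /* uint32_t */
#include <string.h>  /* memcpy */

/*
 * Hypothetical stand-in for a hardware reciprocal estimate: a crude
 * bit-level approximation of 1/d for positive d.
 */
static float rough_reciprocal(float d)
{
    uint32_t bits;
    memcpy(&bits, &d, sizeof bits);
    bits = 0x7EF311C7u - bits;  /* roughly negates the exponent */
    memcpy(&d, &bits, sizeof bits);
    return d;
}

/*
 * Compute a / d as a * (1 / d), refining the reciprocal estimate with
 * three Newton-Raphson steps: x[n+1] = x[n] * (2 - d * x[n]).
 */
static float div_by_reciprocal(float a, float d)
{
    /* volatile forces each refined estimate to be rounded to float */
    volatile float x = rough_reciprocal(d);
    int i;

    for (i = 0; i < 3; i++) {
        x = x * (2.0f - d * x);
    }
    return a * x;
}
```

For ordinary inputs such as `div_by_reciprocal(1.0f, 3.0f)` the result agrees closely with native division. For `a = 1e-10f` and the subnormal `d = 1e-40f`, the exact quotient (about 1e30) fits in a float, but the reciprocal (about 1e40) does not, so the intermediate estimate leaves the finite range and the final product is no longer finite. Native scalar division has no such intermediate value.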

@seberg
Member

seberg commented Nov 15, 2023

Thanks, that comment is a bit of a monster there, but seems like we should put this in to fix the initial issue especially.

@seberg merged commit 0419d0a into numpy:main Nov 15, 2023
@charris removed the 09 - Backport-Candidate label Nov 19, 2023
Labels
00 - Bug, component: SIMD
Development

Successfully merging this pull request may close these issues.

BUG: "overflow encountered in divide" depending on the number of zeros