BUG: fixing bugs in AVX exp/log while handling special value floats #13415
(1) Fixing invalid exception thrown for the new AVX version of exp (2) Special handling of +/-np.nan and +/-np.inf
numpy/core/src/umath/simd.inc.src
@@ -1367,6 +1377,9 @@ static NPY_GCC_OPT_3 NPY_GCC_TARGET_@ISA@ void
        op += num_lanes;
        num_remaining_elements -= num_lanes;
    }

    if (@mask_to_int@(overflow_mask))
        _mm_setcsr(_mm_getcsr() | (0x1 << 3));
we have a function for that: npy_set_floatstatus_overflow
thanks :)
Here is the output for log:
I noticed that running it a second time does not generate the warnings; the MXCSR bits are not reset after each function call. Is that the expected behavior?
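One thing worth ruling out before attributing this to MXCSR: Python's default warning filter reports a given warning only once per source location, so a repeated call from the same line can go quiet even if the floating-point flag is raised every time. Whether that is what happened here is not confirmed in the thread. A minimal sketch, assuming the check was run with default warning filters (the helper `count_log_warnings` is hypothetical, not part of NumPy):

```python
import warnings
import numpy as np

def count_log_warnings(action, repeats=2):
    """Call np.log on an invalid input `repeats` times and count RuntimeWarnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action)
        with np.errstate(invalid="warn"):
            for _ in range(repeats):
                np.log(np.float32(-1.0))  # same source line on every iteration
    return sum(issubclass(w.category, RuntimeWarning) for w in caught)

# "default" reports each (message, category, line) combination once;
# "always" reports every occurrence.
once_per_location = count_log_warnings("default")
every_call = count_log_warnings("always")
```

If `every_call` matches the number of calls while `once_per_location` does not, the missing second warning comes from the Python warning filter rather than from stale MXCSR bits.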
@@ -1149,7 +1149,10 @@ avx2_get_exponent(__m256 x)

    __m256 two_power_100 = _mm256_castsi256_ps(_mm256_set1_epi32(0x71800000));
    __m256 denormal_mask = _mm256_cmp_ps(x, _mm256_set1_ps(FLT_MIN), _CMP_LT_OQ);
    __m256 temp = _mm256_mul_ps(x, two_power_100);
    __m256 normal_mask = _mm256_cmp_ps(x, _mm256_set1_ps(FLT_MIN), _CMP_GE_OQ);
@juliantaylor, AVX512 provides a neat intrinsic that I use to extract mantissa and exponent in the AVX512 version. Do you know a better method than what I have in AVX2? (main challenge is handling denormals).
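For readers following the denormal handling in the hunk above, here is a scalar sketch of the same idea in Python: values below `FLT_MIN` are scaled by 2^100 (the float32 whose bit pattern is `0x71800000`) so that the exponent field becomes meaningful, and the 100 is subtracted afterwards. This is an illustration only, not the NumPy implementation; the helper names are hypothetical, and the exponent convention used here is floor(log2(x)), which may differ from what `avx2_get_exponent` returns.

```python
import math
import struct

FLT_MIN = 2.0 ** -126  # smallest positive normal float32
# Reinterpret the integer constant from the diff as a float32 value.
TWO_POWER_100 = struct.unpack("<f", struct.pack("<I", 0x71800000))[0]

def f32_bits(x):
    """Bit pattern of x after rounding to float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def biased_exponent_field(x):
    """The raw 8-bit exponent field of x as a float32."""
    return (f32_bits(x) >> 23) & 0xFF

def get_exponent(x):
    """floor(log2(x)) for a positive finite float32 value, denormals included."""
    if x < FLT_MIN:
        # Denormals have an exponent field of 0, so scale into the normal
        # range first (the product is exact) and undo the scaling afterwards.
        scaled = x * TWO_POWER_100
        return biased_exponent_field(scaled) - 127 - 100
    return biased_exponent_field(x) - 127
```

For a positive input, `math.frexp(x)[1] - 1` gives the same quantity and is a convenient cross-check.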
@r-devulap try setting
thanks! that fixes it. :)
I think this is in good shape now. I have also added more tests in the umath test module to ensure these don't happen again.
Thanks @r-devulap
for dt in ['f', 'd', 'g']:
    xf = np.array(x, dtype=dt)
    yf = np.array(y, dtype=dt)
    assert_equal(np.exp(yf), xf)
This is prone to failure on my mac locally:
__________________________________________________________________________________ TestSpecialFloats.test_exp_values ___________________________________________________________________________________
[gw0] darwin -- Python 3.6.5 /Users/treddy/miniconda3/envs/numpy_dev_py36/bin/python
self = <numpy.core.tests.test_umath.TestSpecialFloats object at 0x11ef99ba8>
def test_exp_values(self):
x = [np.nan, np.nan, np.inf, 0.]
y = [np.nan, -np.nan, np.inf, -np.inf]
for dt in ['f', 'd', 'g']:
xf = np.array(x, dtype=dt)
yf = np.array(y, dtype=dt)
> assert_equal(np.exp(yf), xf)
E RuntimeWarning: invalid value encountered in exp
dt = 'f'
self = <numpy.core.tests.test_umath.TestSpecialFloats object at 0x11ef99ba8>
x = [nan, nan, inf, 0.0]
xf = array([nan, nan, inf, 0.], dtype=float32)
y = [nan, nan, inf, -inf]
yf = array([ nan, nan, inf, -inf], dtype=float32)
numpy/core/tests/test_umath.py:659: RuntimeWarning
1 failed, 8142 passed, 63 skipped, 13 xfailed, 3 xpassed in 131.72 seconds
Is your macOS setup significantly different from the Azure CI one (clang/OS version)?
gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Target: x86_64-apple-darwin18.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
That means Xcode 10.2.1 according to these tables
sw_vers -productVersion
10.14.4
By contrast, Azure is 10.13 series OS and perhaps slightly different Xcode 10 minor version?
There is apparently now an option to bump Azure Mac CI to macOS-10.14. Not sure we want to deal with that right now, but easy experiment I suppose. It may be the Xcode version; there are docs about switching these things on Azure, depending on availability.
Do we know a fix for this? I am unable to reproduce this on Ubuntu with gcc/clang. One possible solution: since that portion of the test test_exp_values only cares about verifying the output of exp, I could ignore floating point errors by updating the test to run under np.errstate(all='ignore'). Would that be acceptable?
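A minimal sketch of what that proposal might look like, assuming the value check is simply wrapped in `np.errstate(all='ignore')` (the function name `check_exp_special_values` is hypothetical; the inputs and expected outputs are the ones from the failing test):

```python
import numpy as np
from numpy.testing import assert_equal

def check_exp_special_values():
    """Verify exp on NaN/inf inputs while ignoring floating-point warnings."""
    expected = [np.nan, np.nan, np.inf, 0.0]
    inputs = [np.nan, -np.nan, np.inf, -np.inf]
    for dt in ["f", "d", "g"]:
        xf = np.array(expected, dtype=dt)
        yf = np.array(inputs, dtype=dt)
        # Only the returned values are checked; any raised FP flags are ignored.
        with np.errstate(all="ignore"):
            assert_equal(np.exp(yf), xf)

check_exp_special_values()
```

Note that this only silences the symptom inside the test: if the C loop raises a spurious invalid flag on some platform, users with default error settings would still see the warning.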
I would suggest the code explicitly mask the warnings after the call to the new functions via npy_clear_floatstatus_barrier((char*)dimensions), see the other uses in loops.c.src. @tylerjereddy could you test that on your machine? If it helps, we could put it into a #ifdef WHATEVER_CLANG_WE_NEED block
But that would lead to these portions of the tests failing though, right?
assert_raises(FloatingPointError, np.exp, np.float32(100.)) #overflow
assert_raises(FloatingPointError, np.exp, np.float32(1E19)) #overflow
assert_raises(FloatingPointError, np.log, np.float32(-np.inf)) #invalid
assert_raises(FloatingPointError, np.log, np.float32(-1.0)) #invalid
assert_raises(FloatingPointError, np.log, np.float32(0.)) #divide by zero
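These assertions depend on the floating-point flags still being set when NumPy checks them after the loop, which is why clearing the status unconditionally would break them. A small sketch of the behavior they rely on, assuming the tests run under `np.errstate(all='raise')` (the surrounding context is not shown in the thread, and `raises_fp_error` is a hypothetical helper):

```python
import numpy as np

def raises_fp_error(func, value):
    """Return True if func(value) raises FloatingPointError under errstate 'raise'."""
    with np.errstate(all="raise"):
        try:
            func(value)
        except FloatingPointError:
            return True
    return False

overflow = raises_fp_error(np.exp, np.float32(100.0))  # e**100 exceeds float32 range
invalid = raises_fp_error(np.log, np.float32(-1.0))    # log of a negative number
divide = raises_fp_error(np.log, np.float32(0.0))      # log(0) is a divide-by-zero
quiet = raises_fp_error(np.exp, np.float32(1.0))       # ordinary input, no flag expected
```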
If you provide a git diff on the patch you want me to try I can do that; probably even better to try reproducing in Azure MacOS on a fork or something, but in my experience that can take a few hours at least to iterate / get right.
I tried a quick bump of the Azure MacOS version alone to 10.14, which is documented to use the same Xcode version I describe, but it didn't reproduce immediately. Probably needs playing around with some flags, clang version, or something else I haven't thought of yet.
* master: (25 commits)
  BUG: fix unravel_index when dimension is greater than 'intp'
  MANT: refactor unravel_index for code repetition (numpy#13446)
  BLD, TST: implicit func errors
  DOC: document existance of linalg backends
  BLD: streamlined library names in site.cfg sections (numpy#13157)
  MAINT: fixed typo 'wtihout' from numpy/core/shape_base.py
  BUG: fixing bugs in AVX exp/log while handling special value floats (numpy#13415)
  update sequence
  Add analysis check
  BUG: blindly add TypeError to accepted exceptions
  MAINT: fixed several PYTHONOPTIMIZE=2 failures
  BUG: fixed PYTHONOPTIMIZE run
  MAINT: fixed typo 'Mismacth' from numpy.core.setup_common.py
  MAINT: fixed last issues and questions according to numpy#13132
  MAINT: improve efficiency of pad by avoiding use of apply_along_axis
  DOC: dimension sizes are non-negative, not positive
  BUILD, BUG: fix from review, fix bug in git_version
  MAINT: mention 'make dist' in error messsage
  BUILD: allow version-check to pass if GITVER is Unknown (sdist build)
  BUILD: fail documentation build if numpy version does not match
  ...
This fixes bugs for special value float32 handling. Fixes issue for exp filed in #13400. WIP for log, will submit a commit to this PR soon.
(1) Fixing invalid exception thrown for the new AVX version of exp
(2) Special handling of +/-np.nan and +/-np.inf
(3) arraysize for log and exp is of type npy_intp rather than npy_int
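To illustrate why item (3) matters: `npy_int` is a C `int`, while `npy_intp` is pointer-sized, so an element count held in an `npy_int` cannot represent arrays with more than 2^31 - 1 elements on typical platforms. A small sketch using the matching NumPy scalar types `np.intc` and `np.intp` (the helper `fits` and the example count are illustrative assumptions, not taken from the PR):

```python
import numpy as np

int_max = np.iinfo(np.intc).max    # largest value of a C int (npy_int)
intp_max = np.iinfo(np.intp).max   # largest value of npy_intp

def fits(count, dtype):
    """True if an element count is representable in the given integer type."""
    return count <= np.iinfo(dtype).max

large_count = 2 ** 31  # one past the largest 32-bit signed value
print(fits(large_count, np.intc), fits(large_count, np.intp))
```

On a 64-bit build the second check passes while the first does not, which is the case the type change is meant to cover.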
The floating point exceptions now match glibc scalar implementation. Here is the output for exp: