BUG, TEST: Adding validation test suite to validate float32 exp #14048


Merged: 6 commits, Jul 24, 2019

Conversation

r-devulap
Member

@r-devulap r-devulap commented Jul 17, 2019

  1. Added a validation test suite to confirm the accuracy of exp. Running this suite caught a bug in the AVX-based exp function, where denormals were being suppressed. There is a commit to fix that bug as well.
  2. Working on adding validation tests for log, sin and cos as well.

@r-devulap
Member Author

The validation suite for exp consists of all the interesting float32 values, which can be categorized in the following way:

  1. Positive denormals
  2. Negative denormals
  3. Float32 values where the output of expf is known to be a denormal
  4. Floats that cause an overflow
  5. Floats that cause an underflow
  6. +/- Infinity
  7. Top 60 floats that lose accuracy in the range reduction stage due to catastrophic cancellation.
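The categories above can be sketched in NumPy by manipulating the raw float32 bit patterns directly. The snippet below is only an illustration of the idea: the thresholds are approximate and the variable names are mine, not the generator actually used for the suite.

```python
import numpy as np

# 1/2. Positive/negative denormals: zero exponent bits, nonzero mantissa.
pos_denormals = np.arange(1, 11, dtype=np.uint32).view(np.float32)
neg_denormals = (np.arange(1, 11, dtype=np.uint32) | np.uint32(0x80000000)).view(np.float32)

# 3. Inputs whose expf output is denormal: exp(x) < 2**-126, i.e. x < -126*ln(2).
denormal_outputs_in = np.linspace(-87.4, -103.9, 10).astype(np.float32)

# 4/5. Inputs causing overflow (exp(x) > FLT_MAX) or underflow to zero.
overflow_in = np.array([88.8, 89.0, 100.0], dtype=np.float32)
underflow_in = np.array([-104.0, -110.0, -200.0], dtype=np.float32)

# 6. Infinities.
infinities = np.array([np.inf, -np.inf], dtype=np.float32)
```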

@seberg
Member

seberg commented Jul 17, 2019

Just curious, are those test values basically vendored in from somewhere else? In which case, noting that might be nice to make updating them simpler.

@r-devulap
Member Author

The validation test numbers are something I generated myself, based on where the exponential function can potentially lose accuracy and needs to be verified. The criteria I ended up using are listed above.

@charris charris added this to the 1.17.0 release milestone Jul 17, 2019
@r-devulap
Member Author

The test requires reading the file numpy/core/tests/umath-validation-data/umath-validation-set-exp . What would be the safest way to include a path to this file in the test?

@charris
Member

charris commented Jul 18, 2019

Data files should go in numpy/core/tests/data.

@charris
Member

charris commented Jul 18, 2019

Please keep any fixes to current implementations separate, as they will need to be backported.

@r-devulap r-devulap force-pushed the transcendental-accuracy-tests branch from 101a8a6 to 3fc378b Compare July 18, 2019 03:35
@r-devulap r-devulap changed the title BUG, TEST: Adding validation test suite to validate exp. BUG, TEST: Adding validation test suite to validate float32 exp Jul 18, 2019
@r-devulap r-devulap force-pushed the transcendental-accuracy-tests branch from 3fc378b to f316efb Compare July 19, 2019 04:04
@r-devulap
Member Author

Added validation test suites for logf, sinf and cosf as well. Criteria for picking these float32 values for testing sin/cos:

  1. positive and negative denormals
  2. +/-0.0, +/-inf +/-NAN, FLT_MIN, FLT_MAX
  3. Random small-ish float32's (between -100.0f and 100.0f)
  4. Random large-ish float32's (between -1E10f and 1E10f)
  5. Float32's where range reduction results in a large ULP error (due to catastrophic cancellation)
  6. Top 30 Float32 values with largest ULP error (~1.49)
  7. +/- N*PI/4 for N \in [1, 100]
  8. +/- N*PI/2 for N \in [1, 100]
  9. +/- N*PI for N \in [1, 100]
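Criteria 7-9 above can be sketched as follows (illustrative only, not the actual generator used for the suite): multiples of pi stress the range-reduction step, where subtracting a near-equal multiple of pi/2 cancels leading bits.

```python
import numpy as np

# +/- N*PI/4, N*PI/2 and N*PI for N in [1, 100], rounded to float32.
N = np.arange(1, 101, dtype=np.float64)
multiples = np.concatenate([N * np.pi / 4, N * np.pi / 2, N * np.pi])
test_values = np.concatenate([multiples, -multiples]).astype(np.float32)
```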

@seberg seberg left a comment

I cannot say that I can read the SIMD instructions, but I am running a maxulp test locally as well, and it does pass after the fix and not before. Some minor comments, mostly about the tests.

}
else {
    __m256i exponent = _mm256_slli_epi32(_mm256_cvtps_epi32(quadrant), 23);
    poly = _mm256_castsi256_ps(
Member

Would it make sense to move this out, so that it is obvious that the non-denormals use the same code above?

Member Author

I'm sorry, could you please elaborate? Not sure I understand.

Processing denormals slows the function down a bit; the if/else is there so that if no inputs produce a denormal, we avoid all the extra work and go through the else branch.
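The branch being discussed can be mimicked in NumPy terms. This is only a sketch of the logic (hypothetical function name), not the actual AVX implementation: compute a mask of lanes whose quadrant would produce a denormal, and only do the extra work when some lane needs it.

```python
import numpy as np

def exp_quadrants_need_denormal_fixup(x):
    """Return True when any lane's quadrant falls at or below -125,
    i.e. 2**quadrant would be denormal in float32. This mirrors the
    _mm256_movemask_ps(denormal_mask) != 0 check, not the real code."""
    x = np.asarray(x, dtype=np.float32)
    quadrant = np.round(x / np.float32(np.log(2)))  # exponent of 2 in exp(x)
    return bool((quadrant <= -125.0).any())
```

For ordinary inputs like [1.0, -2.0] the check is False and the fast path runs; a value near -90 flips it to True and the denormal fix-up would execute.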

Member

Oh, I just meant the poly line/calculation, basically; it looked to me like you could calculate that first and then just return if there are no denormals. But I am not sure it looks nicer.

__m256 denormal_mask = _mm256_cmp_ps(quadrant, minquadrant, _CMP_LE_OQ);
if (_mm256_movemask_ps(denormal_mask) != 0x0000) {
__m256 quad_diff = _mm256_sub_ps(quadrant, minquadrant); // use negate
quad_diff = _mm256_sub_ps(_mm256_setzero_ps(), quad_diff); // make it +ve
Member

Might add an extra space before the comment, but that is just silly/nitpicky. I have no idea what +ve is, although I cannot say I can read this code easily in any case.

data_dir = path.join(path.dirname(__file__), 'data')
filepath = path.join(data_dir, filename)
data = np.genfromtxt(filepath,
dtype=('|S39','|S39','|S39',np.int),
Member

How about we make the validation file float32 specific? Then convert is just lambda x: int(x, 16), the dtype starts off with np.int16 and you can simply use inval = arr["inval_hex"].view(np.float32). I think that is easier to read (in the context of numpy users, also makes the dtype less strange to understand).
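The suggested float32-specific parsing could look roughly like this (the rows below are illustrative, not taken from the real data file): parse the hex columns as 32-bit integers, then reinterpret the bits as float32 via view().

```python
import numpy as np

# Illustrative rows in the (inval_hex, outval_hex, maxulp) shape.
rows = [("0x3f800000", "0x402df854", 3),   # in = 1.0f, out ~ e
        ("0x00000000", "0x3f800000", 3)]   # in = 0.0f, out = 1.0f
inval_bits = np.array([int(r[0], 16) for r in rows], dtype=np.uint32)
outval_bits = np.array([int(r[1], 16) for r in rows], dtype=np.uint32)
inval = inval_bits.view(np.float32)
outval = outval_bits.view(np.float32)
```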

Member Author

I was hoping to keep it generic for all dtypes, any particular reason to make it float32 specific?

Member

I just preferred the view method somewhat, and that does not work well with multiple types. But it is not a big thing, and we can always restructure things a bit.

Member Author

OK, I will leave it as is for now. We could refactor later if needed.

np.float32,0xc22920bd,0x2100003b,3
np.float32,0xc2902caf,0x0b80011e,3
np.float32,0xc1902cba,0x327fff2f,3
np.float32,0xc2ca6625,0x00000008,3
Member

I am testing the ULP for exp, and I think after rounding the max ulp may actually be 2? I.e. it is floor(2.xx), not ceil(2.xx)?

Member

If the maxulp values are all identical anyway, I am not even sure it is necessary to list them? But it doesn't hurt, I suppose. I think I would like it if you added the comment that you posted here on top of the file(s). It is not like one can easily guess what the hex values are, or whether they are denormals.

Member Author

I am testing the ULP for exp, and I think after rounding the max ulp may actually be 2? I.e. it is floor(2.xx), not ceil(2.xx)?

Hmm, not sure about this. I was accounting for the worst-case scenario. The max ULP measured for exp was 2.52, which means round-to-nearest could push the ULP difference to 3.
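The ULP difference being discussed can be measured by comparing bit patterns; a minimal sketch, assuming finite float32 inputs of the same sign (the helper name is mine, not from the test suite):

```python
import numpy as np

def ulp_diff_float32(a, b):
    """ULP distance between two finite float32 values of the same sign,
    via their integer bit patterns."""
    ai = np.float32(a).view(np.int32).astype(np.int64)
    bi = np.float32(b).view(np.int32).astype(np.int64)
    return int(abs(ai - bi))
```

With a measured worst case of 2.52 ULP, adding up to 0.5 ULP from rounding the reference result to the nearest float32 can indeed put the computed value 3 representable floats away, which supports using ceil here.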

Member Author

If the maxulp values are all identical anyway, I am not even sure it is necessary to list them? But it doesn't hurt, I suppose. I think I would like it if you added the comment that you posted here on top of the file(s). It is not like one can easily guess what the hex values are, or whether they are denormals.

My thinking was along the following lines: the file umath-validation-set-log would contain all the values we want to test the log function with, across all dtypes. So, in the future, if we wanted to add tests for np.float16, we could just add those values to this file with the corresponding ULP error and wouldn't need to edit the test_umath_accuracy.py file. But yes, a README or comments are surely warranted. I will add that.

Member

OK, let's go with that for the moment. I am not sure the other way reads much nicer anyway, and as long as the files do not get huge, speed is not an issue either.

Member Author

sounds good

skip_header=1)
npfunc = getattr(np, filename.split('-')[3])
for datatype in np.unique(data['type']):
data_subset = data[ data['type'] == datatype ]
Member

NIT: PEP8, no extra spaces (but I think this will likely vanish anyway).

Member Author

fixed this.

* 5) 2^(quad-125) can be computed by: 2 << abs(quad-125)
* 6) The final div operation generates the denormal
*/
__m256 minquadrant = _mm256_set1_ps(-125.0f);


@r-devulap r-devulap force-pushed the transcendental-accuracy-tests branch 2 times, most recently from f7fa942 to ebc89d6 Compare July 20, 2019 16:34
@r-devulap
Member Author

There are two tests failing:

  1. sin validation fails on 32-bit windows
  2. log validation fails on ARM64 target

I cannot reproduce these errors on x86-64 platforms. It is highly likely that sin results and the max ULP error vary across platforms. I am not sure how to proceed.

@charris
Member

charris commented Jul 21, 2019

The tests on Windows probably use the Windows library; not sure what the problem is on arm64. Might add an xfail(condition) decorator to the tests. How big are the errors? The "acceptable" errors in practice seem to range from 1.5-3.0 ulp. In general, if the errors aren't large, relaxing the test tolerance is acceptable.

@seberg
Member

seberg commented Jul 22, 2019

Those 32-bit Windows calls: do they even use the new code, or are these basically issues of less precision in the Windows libc?

@r-devulap
Member Author

The float32 log failure is with the aarch64-linux-gnu-gcc compiler, which is a cross compiler for ARM64, so I am pretty sure it isn't using the AVX code. The win-32 failures are for the float32 sin function, and those have to be issues with less precision in the Windows libc (because the AVX code for sin/cos isn't merged into NumPy yet).

@r-devulap
Member Author

I might just use xfail to disable these tests on ARM and 32-bit Win OS, if that is acceptable.

@charris
Member

charris commented Jul 22, 2019

xfail would be fine.


@r-devulap r-devulap force-pushed the transcendental-accuracy-tests branch from 56b6ee4 to 742f3f1 Compare July 23, 2019 03:44
@r-devulap
Member Author

Seems like I was wrong about the win 32-bit. It is failing when compiled with i686-w64-mingw32 :(

@seberg
Member

seberg commented Jul 23, 2019

@charris sorry, but do you happen to know what the right xfail check might be here (or who might)? I can try to look into it, but do not really have an idea right away.

@r-devulap
Member Author

@seberg @charris is it possible to somehow export these macros, HAVE_TARGET_ATTRIBUTE_AVX2 and HAVE_TARGET_ATTRIBUTE_AVX512F, in these tests? These macros are generated by NumPy's build system to determine whether the compiler can generate AVX instructions. If I can read them in the test, then I can use xfail for compilers that can't generate AVX instructions.

@seberg
Member

seberg commented Jul 23, 2019

I suppose you could add a function (or scalar) to the _multiarray_tests.c.src file which returns this information, to expose it to Python. I am not quite sure if we have something like that elsewhere already.

Maybe we could just plan ahead and return a dictionary, which we could in principle fill with more such information as well.

@charris
Member

charris commented Jul 23, 2019

I'd like to export all the macro values at some point, put them in the __config__ attribute perhaps.

Maybe just make the xfail unconditional until we can gather more information. If nothing else, that will let us know what platforms it fails on.

@seberg
Member

seberg commented Jul 23, 2019

Hmmm, OK. I am a bit scared that once it is marked as xfail it will just get lost and never run. I guess marking it too liberally (all Windows?) might work, to not lose it completely at least.

@charris
Member

charris commented Jul 23, 2019

@seberg We used to have a bunch of knownfails for complex branch cuts, maybe still do. I agree it is a bit drastic, but until we have a good way to know at runtime which platform/compiler/libc is failing, I don't see what else we can do, and at least it will give us a pass/fail result.

@r-devulap
Member Author

I'd like to export all the macro values at some point, put them in the __config__ attribute perhaps.

That is a great idea. For now, how about we run the test only on platform.machine == 'x86_64' and sys.platform.startswith('linux') ? That way, it at least runs on some of the machines that have AVX.
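That condition can be sketched as a small helper (hypothetical name; the test itself would wrap it in a pytest skip/xfail marker):

```python
import sys
import platform

def should_run_validation_tests():
    """Only run the transcendental validation tests on x86_64 Linux,
    where the AVX code paths and libm behaviour are known."""
    return (platform.machine() == 'x86_64'
            and sys.platform.startswith('linux'))
```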

@charris
Member

charris commented Jul 24, 2019

That way, it at least runs on some of the machines that have AVX.

xfail always runs, reporting success or failure, but failure doesn't cause the test to fail. What you are proposing would have the same effect, but would be better at reporting errors. I could go either way but restricting the test platforms might be better at highlighting errors.

reason="""
stick to x86_64 and linux platforms.
test seems to fail on some of ARM and power
archictures.
Member

Architectures :) Will fix.

Member Author

Yikes! thank you :)

archictures -> architectures
@charris charris merged commit 96951ed into numpy:master Jul 24, 2019
@charris
Member

charris commented Jul 24, 2019

Thanks @r-devulap .

@charris
Member

charris commented Jul 25, 2019

The tests are failing in the nightly wheel builds, which use an old gcc version courtesy of manylinux1. See https://travis-ci.org/MacPython/numpy-wheels/jobs/563283722. It would probably be helpful to have a message that isolates the error so it can be fixed or worked around. Might need to check gcc version or just use xfail until we can improve things. Might also be an old library.

@r-devulap
Member Author

Will it print stuff for tests with xfail? And is there a docker container I can use to reproduce this failure? I tried compilers as old as gcc-4.8 and even then I can't get these to fail on my system.

@charris
Member

charris commented Jul 25, 2019

manylinux1 is available as a Docker image, not sure where. Things might also depend on the virtual machine that tests the built wheel.

The xfail normally reports the result but doesn't report details. That can be done by passing flags to pytest. Run python3 runtests.py ... -- -rxX, but I haven't tried that. I'm thinking we should add a marker for numeric tests so we can make them optional. Something like @pytest.mark.numeric, but that is for another day. The quick fix is just to make the tests xfail.

@charris
Member

charris commented Jul 25, 2019

Now looks like this:

charris@fc [numpy.git (mark-validation-tests-xfail)]$ python3 runtests.py -t numpy/core/tests/test_umath_accuracy.py -- -rxX 
Building, see build.log...
Build OK
NumPy version 1.18.0.dev0+3251dc2
NumPy relaxed strides checking option: True
X                                                                                                                                [100%]
======================================================= short test summary info ========================================================
XPASS numpy/core/tests/test_umath_accuracy.py::TestAccuracy::test_validate_transcendentals Fails for MacPython/numpy-wheels builds
1 xpassed in 0.03 seconds

@r-devulap
Member Author

sounds good, I will try to find a way to debug the failures.
