BUG, TEST: Adding validation test suite to validate float32 exp #14048


Merged: 6 commits, Jul 24, 2019

Conversation

r-devulap
Member

@r-devulap r-devulap commented Jul 17, 2019

  1. Added a validation test suite to confirm the accuracy of exp. Running this suite caught a bug in the AVX-based exp function, where denormals were being suppressed. There is a commit to fix that bug as well.
  2. Working on adding validation tests for log, sin and cos as well.

@r-devulap
Member Author

The validation suite for exp consists of all the interesting float32 values, which can be categorized in the following way:

  1. Positive denormals
  2. Negative denormals
  3. Float32 values where the output of expf is known to be a denormal
  4. Floats that cause an overflow
  5. Floats that cause an underflow
  6. +/- Infinity
  7. Top 60 floats that lose accuracy in the range reduction stage due to catastrophic cancellation.
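The categories above can be sketched in NumPy by manipulating the raw float32 bit patterns directly. The snippet below is only an illustration of the idea: the thresholds are approximate and the variable names are mine, not the generator actually used for the suite.

```python
import numpy as np

# 1/2. Positive/negative denormals: zero exponent bits, nonzero mantissa.
pos_denormals = np.arange(1, 11, dtype=np.uint32).view(np.float32)
neg_denormals = (np.arange(1, 11, dtype=np.uint32) | np.uint32(0x80000000)).view(np.float32)

# 3. Inputs whose expf output is denormal: exp(x) < 2**-126, i.e. x < -126*ln(2).
denormal_outputs_in = np.linspace(-87.4, -103.9, 10).astype(np.float32)

# 4/5. Inputs causing overflow (exp(x) > FLT_MAX) or underflow to zero.
overflow_in = np.array([88.8, 89.0, 100.0], dtype=np.float32)
underflow_in = np.array([-104.0, -110.0, -200.0], dtype=np.float32)

# 6. Infinities.
infinities = np.array([np.inf, -np.inf], dtype=np.float32)
```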

@seberg
Member

seberg commented Jul 17, 2019

Just curious, are those test values basically vendored in from somewhere else? In which case, noting that might be nice to make updating them simpler.

@r-devulap
Member Author

The validation test numbers are something I generated myself, based on where the exponential function can potentially lose accuracy and needs to be verified. The criteria I ended up using are listed above.

@charris charris added this to the 1.17.0 release milestone Jul 17, 2019
@r-devulap
Member Author

The test requires reading the file numpy/core/tests/umath-validation-data/umath-validation-set-exp . What would be the safest way to include a path to this file in the test?

@charris
Member

charris commented Jul 18, 2019

Data files should go in numpy/core/tests/data.

@charris
Member

charris commented Jul 18, 2019

Please keep any fixes to current implementations separate, as they will need to be backported.

@r-devulap r-devulap force-pushed the transcendental-accuracy-tests branch from 101a8a6 to 3fc378b Compare July 18, 2019 03:35
@r-devulap r-devulap changed the title BUG, TEST: Adding validation test suite to validate exp. BUG, TEST: Adding validation test suite to validate float32 exp Jul 18, 2019
@r-devulap r-devulap force-pushed the transcendental-accuracy-tests branch from 3fc378b to f316efb Compare July 19, 2019 04:04
@r-devulap
Member Author

Added validation test suites for logf, sinf and cosf as well. Criteria for picking these float32 values for testing sin/cos:

  1. positive and negative denormals
  2. +/-0.0, +/-inf +/-NAN, FLT_MIN, FLT_MAX
  3. Random small-ish float32's (between -100.0f and 100.0f)
  4. Random large-ish float32's (between -1E10f and 1E10f)
  5. Float32's where range reduction results in a large ULP error (due to catastrophic cancellation)
  6. Top 30 Float32 values with largest ULP error (~1.49)
  7. +/- N*PI/4 for N \in [1, 100]
  8. +/- N*PI/2 for N \in [1, 100]
  9. +/- N*PI for N \in [1, 100]
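Criteria 7-9 above can be sketched as follows (illustrative only, not the actual generator used for the suite): multiples of pi stress the range-reduction step, where subtracting a near-equal multiple of pi/2 cancels leading bits.

```python
import numpy as np

# +/- N*PI/4, N*PI/2 and N*PI for N in [1, 100], rounded to float32.
N = np.arange(1, 101, dtype=np.float64)
multiples = np.concatenate([N * np.pi / 4, N * np.pi / 2, N * np.pi])
test_values = np.concatenate([multiples, -multiples]).astype(np.float32)
```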

@seberg seberg left a comment

I cannot say that I can read the SIMD instructions, but I am running a maxulp test locally as well, and it does pass after the fix and not before. Some minor comments, mostly about the tests.

}
else {
    __m256i exponent = _mm256_slli_epi32(_mm256_cvtps_epi32(quadrant), 23);
    poly = _mm256_castsi256_ps(
Member

Would it make sense to move this out, so that it is obvious that the non-denormals use the same code above?

Member Author

I'm sorry, could you please elaborate? Not sure I understand.

Processing denormals slows the function down a bit; the if/else is there so that if no inputs produce a denormal, we avoid all the extra work and go through the else branch.
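The branch being discussed can be mimicked in NumPy terms. This is only a sketch of the logic (hypothetical function name), not the actual AVX implementation: compute a mask of lanes whose quadrant would produce a denormal, and only do the extra work when some lane needs it.

```python
import numpy as np

def exp_quadrants_need_denormal_fixup(x):
    """Return True when any lane's quadrant falls at or below -125,
    i.e. 2**quadrant would be denormal in float32. This mirrors the
    _mm256_movemask_ps(denormal_mask) != 0 check, not the real code."""
    x = np.asarray(x, dtype=np.float32)
    quadrant = np.round(x / np.float32(np.log(2)))  # exponent of 2 in exp(x)
    return bool((quadrant <= -125.0).any())
```

For ordinary inputs like [1.0, -2.0] the check is False and the fast path runs; a value near -90 flips it to True and the denormal fix-up would execute.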

Member

Oh, I just meant the poly line/calculation, basically; it looked to me like you could calculate that first and then just return if there are no denormals. But I am not sure it looks nicer.

__m256 denormal_mask = _mm256_cmp_ps(quadrant, minquadrant, _CMP_LE_OQ);
if (_mm256_movemask_ps(denormal_mask) != 0x0000) {
__m256 quad_diff = _mm256_sub_ps(quadrant, minquadrant); // use negate
quad_diff = _mm256_sub_ps(_mm256_setzero_ps(), quad_diff); // make it +ve
Member

Might add an extra space before the comment, but that is just silly/nitpicky. I have no idea what +ve is, although I cannot say I can read this code easily in any case.

data_dir = path.join(path.dirname(__file__), 'data')
filepath = path.join(data_dir, filename)
data = np.genfromtxt(filepath,
dtype=('|S39','|S39','|S39',np.int),
Member

How about we make the validation file float32 specific? Then convert is just lambda x: int(x, 16), the dtype starts off with np.int16 and you can simply use inval = arr["inval_hex"].view(np.float32). I think that is easier to read (in the context of numpy users, also makes the dtype less strange to understand).
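The suggested float32-specific parsing could look roughly like this (the rows below are illustrative, not taken from the real data file): parse the hex columns as 32-bit integers, then reinterpret the bits as float32 via view().

```python
import numpy as np

# Illustrative rows in the (inval_hex, outval_hex, maxulp) shape.
rows = [("0x3f800000", "0x402df854", 3),   # in = 1.0f, out ~ e
        ("0x00000000", "0x3f800000", 3)]   # in = 0.0f, out = 1.0f
inval_bits = np.array([int(r[0], 16) for r in rows], dtype=np.uint32)
outval_bits = np.array([int(r[1], 16) for r in rows], dtype=np.uint32)
inval = inval_bits.view(np.float32)
outval = outval_bits.view(np.float32)
```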

Member Author

I was hoping to keep it generic for all dtypes, any particular reason to make it float32 specific?

Member

I just preferred the view method somewhat, and that does not work well with multiple types. But it is not a big thing, and we can always restructure things a bit.

Member Author

OK, I will leave it as is for now. We could refactor later if needed.

np.float32,0xc22920bd,0x2100003b,3
np.float32,0xc2902caf,0x0b80011e,3
np.float32,0xc1902cba,0x327fff2f,3
np.float32,0xc2ca6625,0x00000008,3
Member

I am testing the ULP for exp, and I think after rounding the max ulp may actually be 2? I.e. it is floor(2.xx), not ceil(2.xx)?

Member

If the maxulp values are all identical anyway, I am not even sure it is necessary to list them? But it doesn't hurt, I suppose. I think I would like it if you added the comment that you posted here on top of the file(s). It is not like one can easily guess what the hex values are, or whether they are denormals.

Member Author

I am testing the ULP for exp, and I think after rounding the max ulp may actually be 2? I.e. it is floor(2.xx), not ceil(2.xx)?

Hmm, not sure about this. I was accounting for the worst-case scenario. The max ULP measured for exp was 2.52, which means round-to-nearest could push the ULP difference to 3.
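The ULP difference being discussed can be measured by comparing bit patterns; a minimal sketch, assuming finite float32 inputs of the same sign (the helper name is mine, not from the test suite):

```python
import numpy as np

def ulp_diff_float32(a, b):
    """ULP distance between two finite float32 values of the same sign,
    via their integer bit patterns."""
    ai = np.float32(a).view(np.int32).astype(np.int64)
    bi = np.float32(b).view(np.int32).astype(np.int64)
    return int(abs(ai - bi))
```

With a measured worst case of 2.52 ULP, adding up to 0.5 ULP from rounding the reference result to the nearest float32 can indeed put the computed value 3 representable floats away, which supports using ceil here.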

Member Author

If the maxulp values are all identical anyway, I am not even sure it is necessary to list them? But it doesn't hurt, I suppose. I think I would like it if you added the comment that you posted here on top of the file(s). It is not like one can easily guess what the hex values are, or whether they are denormals.

My thinking was along the following lines: the file umath-validation-set-log would contain all the values we want to test the log function with, across all dtypes. So, in the future, if we wanted to add tests for np.float16, we could just add those values to this file with the corresponding ULP error and wouldn't need to edit the test_umath_accuracy.py file. But yes, a README or comments are surely warranted. I will add that.

Member

OK, let's go with that for the moment. I am not sure the other way reads much nicer anyway, and as long as the files do not get huge, speed is not an issue either.

Member Author

sounds good

skip_header=1)
npfunc = getattr(np, filename.split('-')[3])
for datatype in np.unique(data['type']):
data_subset = data[ data['type'] == datatype ]
Member

NIT: PEP8, no extra spaces (but I think this will likely vanish anyway).

Member Author

fixed this.

* 5) 2^(quad-125) can be computed by: 2 << abs(quad-125)
* 6) The final div operation generates the denormal
*/
__m256 minquadrant = _mm256_set1_ps(-125.0f);


@r-devulap r-devulap force-pushed the transcendental-accuracy-tests branch 2 times, most recently from f7fa942 to ebc89d6 Compare July 20, 2019 16:34
@r-devulap
Member Author

There are two tests failing:

  1. sin validation fails on 32-bit windows
  2. log validation fails on ARM64 target

I cannot reproduce these errors on x86-64 platforms. It is highly likely that sin results and the max ULP error vary across platforms. I am not sure how to proceed.

@charris
Member

charris commented Jul 21, 2019

The tests on Windows probably use the Windows library; not sure what the problem is on arm64. Might add an xfail(condition) decorator to the tests. How big are the errors? The "acceptable" errors in practice seem to range from 1.5-3.0 ulp. In general, if the errors aren't large, relaxing the test tolerance is acceptable.

@seberg
Member

seberg commented Jul 22, 2019

Those 32-bit Windows calls: do they even use the new code, or are these basically issues of less precision in the Windows libc?

@r-devulap
Member Author

The float32 log failure is with the aarch64-linux-gnu-gcc compiler, which is a cross compiler for ARM64, so I am pretty sure it isn't using the AVX code. The win-32 failures are for the float32 sin function, and those have to be issues with less precision in the Windows libc (because the AVX code for sin/cos isn't merged into NumPy yet).

@r-devulap
Member Author

I might just use xfail to disable these tests on ARM and 32-bit Win OS, if that is acceptable.

@charris
Member

charris commented Jul 22, 2019

xfail would be fine.


@r-devulap r-devulap force-pushed the transcendental-accuracy-tests branch from 56b6ee4 to 742f3f1 Compare July 23, 2019 03:44
@r-devulap
Member Author

Seems like I was wrong about the win 32-bit. It is failing when compiled with i686-w64-mingw32 :(

@seberg
Member

seberg commented Jul 23, 2019

@charris sorry, but do you happen to know what the right xfail check might be here (or who might)? I can try to look into it, but do not really have an idea right away.

@r-devulap
Member Author

@seberg @charris is it possible to somehow export these macros, HAVE_TARGET_ATTRIBUTE_AVX2 and HAVE_TARGET_ATTRIBUTE_AVX512F, in these tests? These macros are generated by NumPy's build system to determine whether the compiler can generate AVX instructions. If I can read them in the test, then I can use xfail for compilers that can't generate AVX instructions.

@seberg
Member

seberg commented Jul 23, 2019

I suppose you could add a function (or scalar) to the _multiarray_tests.c.src file which returns this information, to expose it to Python. I am not quite sure if we have something like that elsewhere already.

Maybe we could just plan ahead and return a dictionary, which we could in principle fill with more such information as well.

@charris
Member

charris commented Jul 23, 2019

I'd like to export all the macro values at some point, put them in the __config__ attribute perhaps.

Maybe just make the xfail unconditional until we can gather more information. If nothing else, that will let us know what platforms it fails on.

@seberg
Member

seberg commented Jul 23, 2019

Hmmm, OK. I am a bit scared that once it is marked as xfail it will just get lost and never run. I guess marking it too liberally (all Windows?) might work, to not lose it completely at least.

@charris
Member

charris commented Jul 23, 2019

@seberg We used to have a bunch of knownfails for complex branch cuts, maybe still do. I agree it is a bit drastic, but until we have a good way to know at runtime which platform/compiler/libc is failing, I don't see what else we can do, and at least it will give us a pass/fail result.

@r-devulap
Member Author

I'd like to export all the macro values at some point, put them in the __config__ attribute perhaps.

That is a great idea. For now, how about we run the test only on platform.machine == 'x86_64' and sys.platform.startswith('linux') ? That way, it at least runs on some of the machines that have AVX.
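That condition can be sketched as a small helper (hypothetical name; the test itself would wrap it in a pytest skip/xfail marker):

```python
import sys
import platform

def should_run_validation_tests():
    """Only run the transcendental validation tests on x86_64 Linux,
    where the AVX code paths and libm behaviour are known."""
    return (platform.machine() == 'x86_64'
            and sys.platform.startswith('linux'))
```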

@charris
Member

charris commented Jul 24, 2019

That way, it at least runs on some of the machines that have AVX.

xfail always runs, reporting success or failure, but failure doesn't cause the test to fail. What you are proposing would have the same effect, but would be better at reporting errors. I could go either way but restricting the test platforms might be better at highlighting errors.

reason="""
stick to x86_64 and linux platforms.
test seems to fail on some of ARM and power
archictures.
Member

Architectures :) Will fix.

Member Author

Yikes! thank you :)

archictures -> architectures
@charris charris merged commit 96951ed into numpy:master Jul 24, 2019
@charris
Member

charris commented Jul 24, 2019

Thanks @r-devulap .

@charris
Member

charris commented Jul 25, 2019

The tests are failing in the nightly wheel builds, which use an old gcc version courtesy of manylinux1. See https://travis-ci.org/MacPython/numpy-wheels/jobs/563283722. It would probably be helpful to have a message that isolates the error so it can be fixed or worked around. Might need to check gcc version or just use xfail until we can improve things. Might also be an old library.

@r-devulap
Member Author

Will it print stuff for tests with xfail? And is there a docker container I can use to reproduce this failure? I tried compilers as old as gcc-4.8 and even then I can't get these to fail on my system.

@charris
Member

charris commented Jul 25, 2019

manylinux1 is available as a Docker image, not sure where. Things might also depend on the virtual machine that tests the built wheel.

The xfail normally reports the result but doesn't report details. That can be done by passing flags to pytest. Run python3 runtests.py ... -- -rxX, but I haven't tried that. I'm thinking we should add a marker for numeric tests so we can make them optional. Something like @pytest.mark.numeric, but that is for another day. The quick fix is just to make the tests xfail.

@charris
Member

charris commented Jul 25, 2019

Now looks like this:

charris@fc [numpy.git (mark-validation-tests-xfail)]$ python3 runtests.py -t numpy/core/tests/test_umath_accuracy.py -- -rxX 
Building, see build.log...
Build OK
NumPy version 1.18.0.dev0+3251dc2
NumPy relaxed strides checking option: True
X                                                                                                                                [100%]
======================================================= short test summary info ========================================================
XPASS numpy/core/tests/test_umath_accuracy.py::TestAccuracy::test_validate_transcendentals Fails for MacPython/numpy-wheels builds
1 xpassed in 0.03 seconds

@r-devulap
Member Author

sounds good, I will try to find a way to debug the failures.
