BUG, TEST: Adding validation test suite to validate float32 exp #14048
Conversation
The validation suite for exp consists of all the interesting float32 values, which can be categorized in the following way:
Just curious, are those test values basically vendored in from somewhere else? If so, noting the source might make updating them simpler.
The validation test numbers are something I generated myself, based on where the exponential function can potentially lose accuracy and needs to be verified. The criteria I ended up using are listed above.
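For context, a minimal sketch of how such a value set might be assembled (illustrative only; the boundary constants below are standard float32 facts, but the actual criteria are the ones listed in the PR description):

```python
import numpy as np

# float32 exp overflows above ~88.72 (ln of FLT_MAX), produces
# subnormal results below ~-87.34 (ln of 2**-126), and underflows
# to zero below ~-103.28 (ln of 2**-149).
special = np.array([0.0, -0.0, np.inf, -np.inf, np.nan], dtype=np.float32)
near_overflow = np.linspace(88.0, 88.8, 50, dtype=np.float32)
denormal_out = np.linspace(-87.4, -103.2, 50, dtype=np.float32)
ordinary = np.float32(np.random.uniform(-10.0, 10.0, 100))

candidates = np.concatenate([special, near_overflow, denormal_out, ordinary])
```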
The test requires reading this file.
Data files should go in |
Please keep any fixes to current implementations separate, as they will need to be backported. |
Force-pushed: 101a8a6 → 3fc378b
Force-pushed: 3fc378b → f316efb
Added validation test suites for logf, sinf and cosf as well. Criteria for picking these float32 values for testing sin/cos:
I cannot say that I can read the SIMD instructions, but I am running a maxulp test locally as well, and it does pass after the fix and not before. Some minor comments, mostly about the tests.
```c
}
else {
    __m256i exponent = _mm256_slli_epi32(_mm256_cvtps_epi32(quadrant), 23);
    poly = _mm256_castsi256_ps(
```
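For reference, the shift by 23 places an integer into the float32 exponent field; a small Python sketch of why that produces powers of two (the bias handling in the actual kernel may differ from this simplification):

```python
import numpy as np

# A float32 stores its biased exponent in bits 23..30, so writing
# (127 + k) there with a zero mantissa yields exactly 2.0**k.
for k in range(-10, 11):
    bits = np.uint32((127 + k) << 23)
    assert bits.view(np.float32) == np.float32(2.0) ** k
```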
Would it make sense to move this out, so that it is obvious that the non-denormals use the same code above?
I'm sorry, could you please elaborate? I'm not sure I understand. Processing denormals slows down the function a bit; the if-else is there so that if there are no inputs that cause a denormal, we skip all the extra work and go through the else condition.
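A scalar illustration of the case under discussion (plain NumPy, not the SIMD path):

```python
import numpy as np

# Once exp's result drops below the smallest normal float32
# (2**-126 ~ 1.18e-38), the output is a denormal, which is what
# triggers the slower branch in the SIMD kernel.
x = np.float32(-88.0)
y = np.exp(x)                          # ~6.05e-39, a subnormal float32
assert 0 < y < np.finfo(np.float32).tiny
```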
Oh, I just meant the `poly` line/calculation, basically; it looked to me like you could calculate that first and then just return if there are no denormals. But I'm not sure it looks nicer.
numpy/core/src/umath/simd.inc.src (outdated)
```c
__m256 denormal_mask = _mm256_cmp_ps(quadrant, minquadrant, _CMP_LE_OQ);
if (_mm256_movemask_ps(denormal_mask) != 0x0000) {
    __m256 quad_diff = _mm256_sub_ps(quadrant, minquadrant); // use negate
    quad_diff = _mm256_sub_ps(_mm256_setzero_ps(), quad_diff); // make it +ve
```
Might add an extra space before the comment, but that is just silly/nitpicky. I have no idea what "+ve" is, although I cannot say I can read this code easily in any case.
```python
data_dir = path.join(path.dirname(__file__), 'data')
filepath = path.join(data_dir, filename)
data = np.genfromtxt(filepath,
                     dtype=('|S39','|S39','|S39',np.int),
```
How about we make the validation file float32-specific? Then the converter is just `lambda x: int(x, 16)`, the dtype starts off with `np.int16`, and you can simply use `inval = arr["inval_hex"].view(np.float32)`. I think that is easier to read (in the context of numpy users; it also makes the dtype less strange to understand).
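A sketch of the suggested approach (a 32-bit unsigned dtype is used here so the reinterpretation onto float32 lines up; the exact dtype choice is an assumption):

```python
import numpy as np

# Parse the hex columns as integers, then reinterpret the bits as
# float32 with view() instead of converting through decimal text.
hex_vals = ["0xc22920bd", "0x2100003b"]
bits = np.array([int(s, 16) for s in hex_vals], dtype=np.uint32)
vals = bits.view(np.float32)           # same 32 bits, read as IEEE-754
```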
I was hoping to keep it generic for all dtypes; any particular reason to make it float32-specific?
I just preferred the view method somewhat, and that does not work well with multiple types. But it is not a big thing, and we can always restructure things a bit.
OK, I will leave it as is for now. We could refactor later if needed.
```text
np.float32,0xc22920bd,0x2100003b,3
np.float32,0xc2902caf,0x0b80011e,3
np.float32,0xc1902cba,0x327fff2f,3
np.float32,0xc2ca6625,0x00000008,3
```
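A hedged sketch of reading this dtype-tagged layout generically (the PR's actual reader differs in details such as field widths; `load_cases` is an illustrative name):

```python
import numpy as np

def load_cases(filepath):
    # Column 0 names the dtype, columns 1-2 hold hex bit patterns for
    # the input and expected output, column 3 is the allowed max ULP.
    data = np.genfromtxt(filepath, delimiter=',', skip_header=1,
                         dtype=('|S11', '|S18', '|S18', int),
                         names=('type', 'inval', 'outval', 'ulperr'))
    for dt_name in np.unique(data['type']):
        subset = data[data['type'] == dt_name]
        dtype = getattr(np, dt_name.decode().split('.')[-1])  # np.float32
        yield dtype, subset
```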
I am testing the ULP for exp, and I think that after rounding the max ulp may actually be 2? I.e. it is `floor(2.xx)`, not `ceil(2.xx)`?
If the maxulp values are all identical anyway, I am not even sure it is necessary to list them? But it doesn't hurt, I suppose. I think I would like it if you added the comment that you posted here to the top of the file(s). It is not like one can easily guess what the hex values are, or whether they are denormals.
> I am testing the ULP for exp, and I think that after rounding the max ulp may actually be 2? I.e. it is `floor(2.xx)`, not `ceil(2.xx)`?
Hmm, not sure about this. I was accounting for the worst-case scenario. The max ULP measured for exp was 2.52, which means rounding to nearest could push the ULP difference to 3.
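For reference, a common way to measure the integer ULP distance for float32 (a sketch, valid for finite values of the same sign), which shows why a true error of 2.52 ULP can surface as a bit-pattern difference of 3 after rounding to nearest:

```python
import numpy as np

def ulp_diff_f32(a, b):
    # For finite, same-sign float32 values, the ULP distance is the
    # difference of the bit patterns interpreted as integers.
    ai = np.float32(a).view(np.int32).astype(np.int64)
    bi = np.float32(b).view(np.int32).astype(np.int64)
    return abs(ai - bi)
```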
> If the maxulp values are all identical anyway, I am not even sure it is necessary to list them? But it doesn't hurt, I suppose. I think I would like it if you added the comment that you posted here to the top of the file(s). It is not like one can easily guess what the hex values are, or whether they are denormals.
My thinking was along the following lines: the file `umath-validation-set-log` would contain all the values we would want to test the log function with, across all dtypes. So, in the future, if we wanted to add tests for np.float16, we could just add those values to this file with the corresponding ULP error and wouldn't need to edit the `test_umath_accuracy.py` file. But yes, a README or comments is surely warranted; I will add that.
OK, let's go with that for the moment. I am not sure the other way reads much nicer anyway, and as long as the files do not get huge, speed is not an issue either.
sounds good
```python
                     skip_header=1)
npfunc = getattr(np, filename.split('-')[3])
for datatype in np.unique(data['type']):
    data_subset = data[ data['type'] == datatype ]
```
NIT: PEP8, no extra spaces (but I think this will likely vanish anyway).
fixed this.
```c
 * 5) 2^(quad-125) can be computed by: 2 << abs(quad-125)
 * 6) The final div operation generates the denormal
 */
__m256 minquadrant = _mm256_set1_ps(-125.0f);
```
Force-pushed: f7fa942 → ebc89d6
There are two tests failing:
I cannot reproduce these errors on x86-64 platforms. It is highly likely that the sin results vary across platforms and the max ULP error varies with them. I am not sure how to proceed.
The tests on Windows probably use the Windows library; not sure what the problem is on arm64. Might add an
Those 32-bit Windows calls, do they even use the new code, or are these basically issues/less precision in the Windows libc?
The float32 log failure is with the aarch64-linux-gnu-gcc compiler, which is a cross compiler for ARM64, so I'm pretty sure this isn't using the AVX code. The win32 failures are for the float32 sin function, and that has to be an issue of less precision in the Windows libc (because the AVX code for sin/cos isn't merged into NumPy yet).
I might just use xfail to disable these tests on ARM and 32-bit Windows, if that is acceptable.
xfail would be fine.
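A hedged sketch of what such a marker could look like (the names and the platform predicate are illustrative, not the PR's final code):

```python
import platform
import sys
import pytest

# Expect failure anywhere outside the platforms where the ULP
# bounds were actually validated.
on_validated_platform = (platform.machine() == 'x86_64'
                         and sys.platform.startswith('linux'))

@pytest.mark.xfail(not on_validated_platform,
                   reason="known accuracy issues on ARM and 32-bit Windows")
def test_validate_exp_float32():
    ...
```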
Force-pushed: 56b6ee4 → 742f3f1
Seems like I was wrong about 32-bit Windows; it is failing when compiled with i686-w64-mingw32 :(
@charris sorry, but do you happen to know what the right
@seberg @charris is it possible to somehow export the macros HAVE_TARGET_ATTRIBUTE_AVX2 and HAVE_TARGET_ATTRIBUTE_AVX512F to these tests? These macros are generated by NumPy's build system to record whether the compiler can generate AVX instructions. If I can read them in the test, then I can use xfail for compilers that can't generate AVX instructions at all.
I suppose you could add a function (or scalar) exposing them. Maybe we could just plan ahead and return a dictionary, which can in principle be filled with more such information as well.
I'd like to export all the macro values at some point and put them somewhere accessible. Maybe just make the xfail unconditional until we can gather more information. If nothing else, that will let us know what platforms it fails on.
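Purely hypothetical illustration of the idea being discussed; no such NumPy API existed at the time, and all names here are invented:

```python
# Suppose the build exported its feature macros as a dictionary:
build_flags = {
    "HAVE_TARGET_ATTRIBUTE_AVX2": True,
    "HAVE_TARGET_ATTRIBUTE_AVX512F": False,
}

# A test could then xfail/skip when the compiler lacked AVX support.
have_avx = build_flags.get("HAVE_TARGET_ATTRIBUTE_AVX2", False)
```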
Hmmm, ok. I am a bit scared that once it is marked as
@seberg We used to have a bunch of knownfails for complex branch cuts, and maybe we still do. I agree it is a bit drastic, but until we have a good way to know at runtime which platform/compiler/libc is failing, I don't see what else we can do, and at least it will give us a pass/fail result.
That is a great idea. For now, how about we run the test only on
xfail always runs, reporting success or failure, but a failure doesn't cause the test suite to fail. What you are proposing would have the same effect, but would be better at reporting errors. I could go either way, but restricting the test platforms might be better at highlighting errors.
reason=""" | ||
stick to x86_64 and linux platforms. | ||
test seems to fail on some of ARM and power | ||
archictures. |
Architectures :) Will fix.
Yikes! thank you :)
archictures -> architectures
Thanks @r-devulap.
The tests are failing in the nightly wheel builds, which use an old gcc version courtesy of manylinux1. See https://travis-ci.org/MacPython/numpy-wheels/jobs/563283722. It would probably be helpful to have a message that isolates the error so it can be fixed or worked around. We might need to check the gcc version, or just use xfail until we can improve things. It might also be an old library.
Will it print stuff for tests with xfail? And is there a docker container I can use to reproduce this failure? I tried compilers as old as gcc-4.8 and even then I can't get these to fail on my system.
Now looks like this:
Sounds good, I will try to find a way to debug the failures.