BUG: exp and log with non-contiguous float32 inputs #13512
Comments
Introduced in #13134, I assume. I do somewhat wonder if these optimizations are really worth it vs letting the compiler do its best instead. |
I do not know enough about compiler magic, but apparently we need to expand the tests when we add simd stuff... testing non-contiguous arrays (also with different offsets?) is far more important with simd, and apparently we are not that good yet (at least not for all ufuncs). Such a test should have existed. |
@r-devulap pinging you in case you did not see the issue. |
Taking a look. If you can help me figure out the test cases that we would like to cover, I can work on adding those to the test suite. |
@r-devulap thanks, it seems Matti is already starting to work on it, but maybe you have time to check in. As far as I understand, these SIMD operations need to take care in how they load data, and if the beginning offset is not aligned to the registers, they need special handling for the first part of the loop? So slicing into arrays to change the offset and stride would be the thing to do? (Just the things that can fail with SIMD but are no issue, or much less complicated, for plain old for loops.) |
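A minimal sketch of the kind of test suggested above, assuming the goal is simply to compare a ufunc applied to sliced views (varying start offset and stride) against the same ufunc applied to a contiguous copy. The helper name `check_strided`, the value range, and the tolerances are illustrative assumptions, not taken from the NumPy test suite.

```python
import numpy as np

def check_strided(ufunc, dtype=np.float32, n=64):
    """Compare ufunc results on offset/strided views with contiguous copies.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(0)
    # Values kept away from 1.0 so that log() results are not close to zero.
    base = rng.uniform(1.5, 4.0, size=n).astype(dtype)
    for offset in range(4):                 # vary the starting element
        for step in (1, 2, 3, 5, -1, -3):   # vary the stride, including negative
            view = base[offset::step]
            expected = ufunc(np.ascontiguousarray(view))
            actual = ufunc(view)
            # Loose tolerance: SIMD and scalar paths may differ by a few ULPs.
            np.testing.assert_allclose(actual, expected, rtol=1e-5, atol=1e-6)

for f in (np.exp, np.log):
    check_strided(f)
```

If a loop ignores the strides of its input, the strided views would be read as if they were contiguous, and the comparison would be expected to fail for any `step` other than 1.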
I was actually looking at how sqrt handles this, and it seems the SIMD implementation is used only when working with contiguous inputs. The function |
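For illustration, the contiguity condition mentioned above can be expressed from Python by comparing an array's stride with its item size. This is only a sketch of the idea; `is_unit_stride` is a hypothetical helper and not the check NumPy performs internally.

```python
import numpy as np

def is_unit_stride(arr):
    """True when consecutive elements of a 1-D array are adjacent in memory."""
    return arr.ndim == 1 and arr.strides[0] == arr.itemsize

a = np.arange(16, dtype=np.float32)  # contiguous: stride 4 bytes, itemsize 4
b = a[::2]                           # every other element: stride 8 bytes

print(is_unit_stride(a))  # True
print(is_unit_stride(b))  # False
```

A loop that dispatches to a SIMD path only when such a condition holds would fall back to the plain scalar loop for inputs like `b`.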
Fixed in #13520 |
It seems to me that the AVX routines of np.exp and np.log disregard the strides attribute of inputs.
Reproducing code example:
output:
Numpy/Python version information: