BUG: exp and log with non-contiguous float32 inputs #13512


Closed
toslunar opened this issue May 9, 2019 · 7 comments

@toslunar
Contributor

toslunar commented May 9, 2019

It seems to me that the AVX routines of np.exp and np.log disregard the strides attribute of inputs.

Reproducing code example:

import numpy as np
print(np.exp(np.arange(4, dtype=np.float64)[::2]))  # correct
print(np.exp(np.arange(4, dtype=np.float32)[::2]))  # wrong
print(np.exp(np.arange(4, dtype=np.float16)[::2]))  # correct

output:

[1.        7.3890561]
[1.       2.718282]
[1.   7.39]

Numpy/Python version information:

1.17.0.dev0+634d66d 3.7.1 (default, Nov  6 2018, 21:02:07)
[Clang 10.0.0 (clang-1000.10.44.4)]
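The reproducer above can be turned into a small self-check that compares a ufunc applied to a strided view with the same ufunc applied to a contiguous copy of that view. This is only a sketch: the helper name `strided_matches_contiguous` and the tolerances are illustrative assumptions, not part of NumPy or of the eventual fix.

```python
import numpy as np

def strided_matches_contiguous(ufunc, dtype, n=16, step=2):
    """Hypothetical helper: compare ufunc(view) with ufunc(contiguous copy)."""
    base = np.arange(1, n + 1, dtype=dtype) / n   # small positive values
    view = base[::step]                           # stride is step * itemsize
    copy = np.ascontiguousarray(view)             # same values, unit stride
    # Loose tolerance: different code paths may differ by a few ULP,
    # whereas ignoring the stride reads entirely different elements.
    return bool(np.allclose(ufunc(view), ufunc(copy), rtol=1e-3, atol=1e-6))

for dtype in (np.float16, np.float32, np.float64):
    for ufunc in (np.exp, np.log):
        print(ufunc.__name__, np.dtype(dtype).name,
              strided_matches_contiguous(ufunc, dtype))
```

On a build affected by this issue, the float32 rows would be expected to report a mismatch, because the strided view and its contiguous copy hold the same values but are read differently.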
@eric-wieser
Member

Introduced in #13134, I assume. I do somewhat wonder if these optimizations are really worth it vs letting the compiler do its best instead.

@eric-wieser eric-wieser added this to the 1.17.0 release milestone May 9, 2019
@seberg
Member

seberg commented May 9, 2019

I do not know enough about compiler magic, but apparently we need to expand the tests when we add SIMD code. Testing non-contiguous arrays (also with different offsets?) is far more important with SIMD, and apparently we are not that good at it yet (at least not for all ufuncs). Such a test should have existed.
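A test of the kind described here might look like the following sketch. It walks a unary ufunc over views with several starting offsets and strides and compares each result against a float64 reference. The function name, the chosen offsets and steps, and the tolerance of 16 machine epsilons are assumptions made for illustration; this is not the test that was added to NumPy.

```python
import itertools
import numpy as np

def check_unary_ufunc_strided(ufunc, dtype, n=64):
    """Hypothetical test helper: returns the number of views checked."""
    base = np.linspace(1.5, 4.0, n).astype(dtype)
    rtol = 16 * float(np.finfo(dtype).eps)        # assumed tolerance
    checked = 0
    for offset, step in itertools.product(range(4), (1, 2, 3, 4, -1, -2)):
        view = base[offset:][::step]              # vary start and stride
        expected = ufunc(view.astype(np.float64)) # contiguous float64 reference
        np.testing.assert_allclose(
            ufunc(view), expected, rtol=rtol,
            err_msg=f"{ufunc.__name__} {np.dtype(dtype).name} "
                    f"offset={offset} step={step}",
        )
        checked += 1
    return checked

for ufunc in (np.exp, np.log):
    for dtype in (np.float16, np.float32, np.float64):
        check_unary_ufunc_strided(ufunc, dtype)
```

Comparing against float64 rather than against another float32 call keeps the reference independent of whichever float32 loop is under test.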

@seberg
Member

seberg commented May 9, 2019

@r-devulap pinging you in case you did not see the issue.

@r-devulap
Member

Taking a look. If you can help me figure out the test cases that we would like to cover, I can work on adding those to the test suite.

@seberg
Member

seberg commented May 9, 2019

@r-devulap thanks, it seems Matti is already starting to work on it, but maybe you have time to check in. As far as I understand, these SIMD operations need to take care with how they load data, and if the beginning offset is not aligned to the registers they need special handling for the first part of the loop? So slicing into arrays to change the offset and stride would be the thing to do? (These are just things that can fail with SIMD but are no issue, or much less complicated, for plain old for loops.)
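To make the point about offsets and strides concrete, the sketch below shows how slicing a float32 array changes the view's starting address, its stride, and its contiguity flag. The helper `describe` is a hypothetical name introduced here; it only reads standard NumPy attributes (`__array_interface__`, `strides`, `flags`).

```python
import numpy as np

base = np.arange(16, dtype=np.float32)
base_addr = base.__array_interface__["data"][0]   # address of base[0]

def describe(view):
    """Hypothetical helper: offset (bytes), stride (bytes), contiguity."""
    addr = view.__array_interface__["data"][0]    # address of view[0]
    return {
        "byte_offset": addr - base_addr,
        "stride": view.strides[0],
        "contiguous": bool(view.flags["C_CONTIGUOUS"]),
    }

print(describe(base[1:]))    # shifted start, unit stride:  offset 4, stride 4
print(describe(base[::2]))   # same start, doubled stride:  offset 0, stride 8
print(describe(base[1::2]))  # shifted start and stride:    offset 4, stride 8
print(describe(base[::-1]))  # starts at the last element:  offset 60, stride -4
```

Only the first of these views is contiguous, yet its start is shifted by one element, so it may no longer sit on the same register-width boundary as `base`. The other three are the non-unit-stride cases that a plain loop handles without any special treatment.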

@r-devulap
Member

I was actually looking at how sqrt handles this, and it seems the SIMD implementation is used only when working with contiguous inputs. The function run_unary_simd_sqrt_FLOAT does this by checking IS_BLOCKABLE_UNARY and only then proceeding to the SIMD implementation. In all other cases sqrt is computed in a scalar fashion. I think this is the right way to go; otherwise we can easily end up with too many special cases, which can hurt performance.
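The dispatch pattern described here can be sketched in Python as follows. This is not NumPy's C code and does not reproduce the actual conditions tested by IS_BLOCKABLE_UNARY; `is_unit_stride` and `exp_with_dispatch` are hypothetical names, and `np.exp` on the contiguous input merely stands in for a SIMD kernel.

```python
import math
import numpy as np

def is_unit_stride(arr):
    """Assumed simplification of the check: 1-D with stride == itemsize."""
    return arr.ndim == 1 and arr.strides[0] == arr.itemsize

def exp_with_dispatch(x):
    """Return (result, path) where path names the branch that was taken."""
    x = np.asarray(x, dtype=np.float32)
    out = np.empty(x.shape, dtype=np.float32)
    if is_unit_stride(x):
        out[:] = np.exp(x)                # stand-in for the SIMD fast path
        return out, "fast"
    # Fallback: element-wise loop that honours the stride through indexing.
    # math.exp raises OverflowError for large inputs, so keep inputs small.
    for i in range(x.shape[0]):
        out[i] = math.exp(float(x[i]))
    return out, "scalar"

values = np.arange(4, dtype=np.float32)
print(exp_with_dispatch(values)[1])       # contiguous input
print(exp_with_dispatch(values[::2])[1])  # strided input
```

The design choice is the one argued for above: keep the fast path narrow and simple, and let every other layout fall through to code that is correct by construction.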

@mattip
Member

mattip commented May 17, 2019

Fixed in #13520

@mattip mattip closed this as completed May 17, 2019