BUG: exp and log with non-contiguous float32 inputs #13512

toslunar · 2019-05-09T06:13:53Z

It seems to me that the AVX routines of np.exp and np.log disregard the strides attribute of inputs.

Reproducing code example:

import numpy as np
print(np.exp(np.arange(4, dtype=np.float64)[::2]))  # correct
print(np.exp(np.arange(4, dtype=np.float32)[::2]))  # wrong
print(np.exp(np.arange(4, dtype=np.float16)[::2]))  # correct

output:

[1.        7.3890561]
[1.       2.718282]
[1.   7.39]
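
The wrong float32 result is what you would get by reading the first two elements of the underlying buffer, i.e. stepping through memory one element at a time instead of two. Below is a minimal sketch to check that reading. The variable names are illustrative only, and the `reported` values are copied from the output above, since a fixed NumPy no longer reproduces them.

```python
import numpy as np

base = np.arange(4, dtype=np.float32)   # buffer holds 0, 1, 2, 3
view = base[::2]                        # logical elements are 0 and 2

# Values printed in the report for the float32 case.
reported = np.array([1.0, 2.718282], dtype=np.float32)

# What a loop that ignores the stride would read: the first two buffer slots.
stride_ignored = np.exp(base[:2])

# What the strided view should produce, computed on a contiguous copy.
expected = np.exp(np.ascontiguousarray(view))

print(view.strides, base.strides)            # byte steps of the view and the buffer
print(np.allclose(reported, stride_ignored)) # does the report match a stride-1 read?
print(np.allclose(reported, expected))       # does the report match the right answer?
```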

Numpy/Python version information:

1.17.0.dev0+634d66d 3.7.1 (default, Nov  6 2018, 21:02:07)
[Clang 10.0.0 (clang-1000.10.44.4)]

eric-wieser · 2019-05-09T06:45:08Z

Introduced in #13134, I assume. I do somewhat wonder if these optimizations are really worth it vs letting the compiler do its best instead.

seberg · 2019-05-09T07:15:45Z

I do not know enough about compiler magic, but apparently we need to expand the tests when we add SIMD code. Testing non-contiguous arrays (also with different offsets?) is far more important with SIMD, and apparently we are not that good at it yet (at least not for all ufuncs). Such a test should have existed.

seberg · 2019-05-09T07:17:22Z

@r-devulap pinging you in case you did not see the issue.

r-devulap · 2019-05-09T15:25:06Z

Taking a look. If you can help me figure out the test cases that we would like to cover, I can work on adding those to the test suite.

seberg · 2019-05-09T18:23:24Z

@r-devulap thanks, it seems Matti is already starting to work on it, but maybe you have time to check in. As far as I understand, these SIMD operations need to take care with how they load data, and if the beginning offset is not aligned to the registers they need special handling for the first part of the loop? So slicing into arrays to change the offset and stride would be the thing to do? (Just things that can fail with SIMD but are no issue, or much less complicated, for plain old for loops.)
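
A rough sketch of the kind of test described above: slice a base array with several start offsets and steps, then compare the ufunc result on each view against the result on a contiguous copy of the same values. The helper name `check_unary_strided`, the value range, and the tolerances are assumptions made for illustration. This is not the test that was eventually added to NumPy.

```python
import numpy as np

def check_unary_strided(ufunc, dtype, size=67):
    """Return the (offset, step) pairs where a strided view disagrees
    with a contiguous copy of the same values."""
    base = np.linspace(0.5, 4.0, size).astype(dtype)
    eps = np.finfo(dtype).eps
    failures = []
    for offset in range(4):                  # shift the start of the view
        for step in (1, 2, 3, 5, -1, -2):    # contiguous, strided, reversed
            view = base[offset::step]
            reference = ufunc(np.ascontiguousarray(view))
            result = ufunc(view)
            # Loose tolerance: vectorized and scalar loops may differ by a few ULP.
            if not np.allclose(result, reference, rtol=16 * eps, atol=eps):
                failures.append((offset, step))
    return failures

for func in (np.exp, np.log):
    for dt in (np.float16, np.float32, np.float64):
        print(func.__name__, np.dtype(dt).name, check_unary_strided(func, dt))
```

An empty list means every view agreed with its contiguous copy. On a build affected by this issue, the float32 rows would be expected to list the steps other than 1.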

r-devulap · 2019-05-09T18:31:20Z

I was actually looking at how sqrt handles this, and it seems the SIMD implementation is used only when working with contiguous inputs. The function run_unary_simd_sqrt_FLOAT does this by using the check IS_BLOCKABLE_UNARY and only then proceeds to the SIMD implementation. In all other cases sqrt is computed in a scalar fashion. I think this is the right way to go, otherwise we can easily get into too many special cases, which can hurt performance.
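
The dispatch described above can be modelled in a few lines of Python. This is only a sketch of the idea: `is_contiguous_unary` and `exp_float32` are hypothetical names, the real IS_BLOCKABLE_UNARY check in the C source may test more than contiguity (alignment, for instance), and `np.exp` on a contiguous array merely stands in for the vectorized kernel.

```python
import math
import numpy as np

def is_contiguous_unary(src, dst):
    # Hypothetical stand-in: both arrays advance by exactly one element per step.
    return src.strides[0] == src.itemsize and dst.strides[0] == dst.itemsize

def exp_float32(src):
    """Model of the dispatch for a 1-D float32 input.
    Returns the result and the name of the path taken."""
    dst = np.empty(src.shape, dtype=np.float32)
    if is_contiguous_unary(src, dst):
        dst[:] = np.exp(src)          # stands in for the vectorized kernel
        return dst, "simd"
    # Fallback: indexing through the view honours its stride.
    for i in range(src.shape[0]):
        dst[i] = math.exp(src[i])
    return dst, "scalar"

base = np.arange(4, dtype=np.float32)
print(exp_float32(base))        # contiguous input takes the fast path
print(exp_float32(base[::2]))   # strided input falls back to the scalar loop
```

The design choice is to keep the vectorized kernel simple and correct for the common contiguous case, and to accept the slower scalar loop for everything else.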

mattip · 2019-05-17T03:24:22Z

Fixed in #13520

eric-wieser added this to the 1.17.0 release milestone May 9, 2019

seberg added the "00 - Bug" and "06 - Regression" labels May 9, 2019

mattip mentioned this issue May 9, 2019

WIP, BUG: exp, log AVX loops do not use steps #13517

Closed

mattip closed this as completed May 17, 2019
