Our iterator avoids buffers whenever possible, because accessing data with strides is usually faster than copying it to a buffer for a contiguous loop. This also holds for our vectorized basic math functions (+-*/).
Since we merged vectorized exp and log, we have vectorized ufuncs that are heavily CPU bound instead of memory bound.
For these functions it can be very worthwhile to use buffered iterators, as only then can the vectorized code be used.
Another advantage is that we guarantee the same results regardless of the strides of the input data. (This property might be lost in current master, but it could also be restored without a buffered iterator by using the vector code on scalar data, though that might be a bit slower.)
import numpy as np
d = np.random.rand(1000, 50).astype(np.float32)
print("2d strided -> buffered iterator")
%timeit np.exp(d[::2])
d = np.random.rand(1000 * 50).astype(np.float32)
print("1d strided unbuffered iterator")
%timeit np.exp(d[::2])
2d strided -> buffered iterator
10000 loops, best of 3: 77.3 µs per loop
1d strided unbuffered iterator
1000 loops, best of 3: 1.02 ms per loop
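As a user-side illustration of the idea behind the benchmark above, the strided input can be copied into a contiguous array by hand before calling the ufunc. This is only a sketch of the "copy to a buffer, then run the contiguous loop" approach, not what the iterator does internally. The names `strided`, `contig`, `fast` and `ref` are made up for the example, and whether the copy pays off will depend on the NumPy build and the CPU.

```python
import numpy as np

d = np.random.rand(1000 * 50).astype(np.float32)
strided = d[::2]  # 1d view with a stride of two elements

# Explicit copy into a contiguous array, standing in for the iterator's buffer.
contig = np.ascontiguousarray(strided)

fast = np.exp(contig)   # contiguous input
ref = np.exp(strided)   # strided input, for comparison

# The two code paths may differ in the last bits, so compare with a tolerance.
print(np.allclose(fast, ref, rtol=1e-5))
```

In IPython, `%timeit np.exp(np.ascontiguousarray(d[::2]))` can be compared against `%timeit np.exp(d[::2])` to see whether the copy is worth it on a given machine.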
By the way, vectorizing the functions for strided data adds complexity and would likely be slower than our buffer approach, so I would not recommend pursuing it.
which should allow forcing NPY_ITER_CONTIG for the op_flags, I think, although I am not sure. The original purpose was probably something else, so I am not certain that it actually fits here. There may be tricky things with reduce, accumulate, and reduceat compared to the normal loop, as well as issues surrounding the READWRITE flag, which may be needed for reductions.
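For readers who want to experiment with the contiguity flag from Python, `np.nditer` accepts a `"contig"` operand flag, which as far as I understand is the Python-level counterpart of NPY_ITER_CONTIG. The sketch below is not the ufunc machinery and says nothing about reductions; `exp_via_contig_chunks` is a hypothetical helper name, and the buffer size is an arbitrary choice.

```python
import numpy as np

def exp_via_contig_chunks(x, buffersize=8192):
    """Apply np.exp chunk by chunk through a buffered nditer (illustrative only)."""
    out = np.empty(x.shape, dtype=x.dtype)
    it = np.nditer(
        [x, out],
        flags=["external_loop", "buffered"],
        # "contig" asks the iterator to hand out contiguous chunks,
        # buffering the operand when its memory layout is strided.
        op_flags=[["readonly", "contig"], ["writeonly", "contig"]],
        buffersize=buffersize,
    )
    with it:
        for src, dst in it:
            # src and dst are 1d chunks; results are written back on flush.
            dst[...] = np.exp(src)
    return out

x = np.random.rand(1000 * 50).astype(np.float32)[::2]  # strided 1d input
y = exp_via_contig_chunks(x)
print(np.allclose(y, np.exp(x), rtol=1e-5))
```

The Python-level loop adds overhead of its own, so this is meant for exploring the flag's behaviour, not for benchmarking against the C loops.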
seberg changed the title from "always use buffered iterator for vectorized math" to "ENH: always use buffered iterator for vectorized math" on May 14, 2019