Description
This is a mix of an issue and a feature request. I have noticed that calling .bit_length() is significantly slower in PyPy than any other basic operation on an integer. For example, this benchmark
#z = 0
z = 2**62
def main():
    y = 0
    for x in range(z, z + 10**9):
        y += x.bit_length()
    print(y)
main()
takes 16 s to run locally when z=2**62, and 9 s when z=0. That is about 30 times slower than any other basic operation such as y += x % 10 or y += min(x, 15).
By implementing my own bit length function, I can make it run in about 0.9 s for z=2**62:
# table[i] holds the bit length of i for 0 <= i < 2**16
table = [0] * 2**16
for i in range(1, len(table)):
    table[i] = 1 + table[i//2]
def bitlength(x):
    # Assumes 0 <= x < 2**64
    count = 0
    if x >= 2**32:
        x >>= 32
        count += 32
    if x >= 2**16:
        x >>= 16
        count += 16
    return count + table[x]
z = 2**62
def main():
    y = 0
    for x in range(z, z + 10**9):
        y += bitlength(x)
    print(y)
main()
Would it be possible to speed up the built-in bit_length()? If I understand PyPy's source code correctly, it uses a very naive and slow algorithm to compute the bit length:
pypy/pypy/objspace/std/intobject.py
Lines 512 to 522 in 16d42fc
@jit.elidable
def _bit_length(val):
    bits = 0
    if val < 0:
        # warning, "-val" overflows here
        val = -((val + 1) >> 1)
        bits = 1
    while val:
        bits += 1
        val >>= 1
    return bits
With this loop, val.bit_length() takes O(log(val)) time, one iteration per bit. Ideally one would use something like __builtin_clzl so that it runs in constant time. But even something as simple as what CPython does would be a significant improvement.
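To make the suggestion concrete, here is a minimal sketch of a table-driven variant that consumes 6 bits per loop iteration instead of 1, in the spirit of the small-lookup-table approach. The names `_SMALL_BIT_LENGTH` and `bit_length_table` are hypothetical and are not part of PyPy; the negative-value handling is copied from the `_bit_length` snippet above. It is plain Python for illustration, not RPython, so treat it as an assumption about what a faster version could look like rather than a drop-in patch.

```python
# Hypothetical sketch, not PyPy's actual code.
# _SMALL_BIT_LENGTH[i] is the bit length of i for 0 <= i < 32.
_SMALL_BIT_LENGTH = [0, 1, 2, 2] + [3] * 4 + [4] * 8 + [5] * 16

def bit_length_table(val):
    bits = 0
    if val < 0:
        # Same trick as the original snippet: avoids computing "-val",
        # which would overflow for the most negative machine integer.
        val = -((val + 1) >> 1)
        bits = 1
    # Any val >= 32 has at least 6 bits, so drop 6 bits per iteration.
    while val >= 32:
        bits += 6
        val >>= 6
    # The remaining value is below 32 and is resolved by the table.
    return bits + _SMALL_BIT_LENGTH[val]
```

For a 63-bit value this runs about 10 loop iterations instead of 63. A count-leading-zeros instruction would still be faster, but this keeps the code portable.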
Edit: While on the topic, it seems that bit_count suffers from the exact same issue.
pypy/pypy/objspace/std/intobject.py
Lines 525 to 535 in 16d42fc
@jit.elidable
def _bit_count(val):
    if val == -sys.maxint - 1:
        return 1
    elif val < 0:
        val = -val
    count = 0
    while val:
        count += val & 1
        val >>= 1
    return count
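For bit_count, one loop-free alternative is the classic parallel ("SWAR") population count, sketched below for values whose absolute value fits in 64 bits. The name `bit_count_swar` is hypothetical, and this is plain Python with an explicit 64-bit mask; an RPython version would presumably need unsigned machine arithmetic instead, which is an assumption on my part and not something I have tested in PyPy.

```python
# Hypothetical sketch, not PyPy's actual code.
# Assumes abs(val) < 2**64.
_MASK64 = 0xFFFFFFFFFFFFFFFF

def bit_count_swar(val):
    x = abs(val)  # bit_count counts the set bits of the absolute value
    # Sum adjacent bits into 2-bit fields.
    x = x - ((x >> 1) & 0x5555555555555555)
    # Sum adjacent 2-bit fields into 4-bit fields.
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)
    # Sum adjacent 4-bit fields into bytes.
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F
    # Add all bytes together; the total ends up in the top byte.
    return ((x * 0x0101010101010101) & _MASK64) >> 56
```

This takes a fixed number of operations regardless of the value, compared with one loop iteration per bit in the snippet above.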