Bug report
Bug description:
Originally reported by @ThomasBr0 in #132762
The _Py_bit_length() and _BitScanReverse64() code paths are incorrect and don't always return the smallest log2 key size to fit the desired number of items. The bug is benign in the sense that the dictionaries are sometimes too big, but they're never too small.
For example calculate_log2_keysize(7) = 4 but calculate_log2_keysize(8) = 3.
|
/* Find the smallest dk_size >= minsize. */ |
|
static inline uint8_t |
|
calculate_log2_keysize(Py_ssize_t minsize) |
|
{ |
|
#if SIZEOF_LONG == SIZEOF_SIZE_T |
|
minsize = (minsize | PyDict_MINSIZE) - 1; |
|
return _Py_bit_length(minsize | (PyDict_MINSIZE-1)); |
|
#elif defined(_MSC_VER) |
|
// On 64bit Windows, sizeof(long) == 4. |
|
minsize = (minsize | PyDict_MINSIZE) - 1; |
|
unsigned long msb; |
|
_BitScanReverse64(&msb, (uint64_t)minsize); |
|
return (uint8_t)(msb + 1); |
|
#else |
|
uint8_t log2_size; |
|
for (log2_size = PyDict_LOG_MINSIZE; |
|
(((Py_ssize_t)1) << log2_size) < minsize; |
|
log2_size++) |
|
; |
|
return log2_size; |
|
#endif |
|
} |
cc @methane @markshannon
CPython versions tested on:
CPython main branch
Operating systems tested on:
No response
Linked PRs
Bug report
Bug description:
Originally reported by @ThomasBr0 in #132762
The
_Py_bit_length()and_BitScanReverse64()code paths are incorrect and don't always return the smallest log2 key size to fit the desired number of items. The bug is benign in the sense that the dictionaries are sometimes too big, but they're never too small.For example
calculate_log2_keysize(7) = 4butcalculate_log2_keysize(8) = 3.cpython/Objects/dictobject.c
Lines 545 to 566 in 9546eee
cc @methane @markshannon
CPython versions tested on:
CPython main branch
Operating systems tested on:
No response
Linked PRs