Commit 15d4929

Implement an old idea of Christian Tismer's: use polynomial division
instead of multiplication to generate the probe sequence. The idea is recorded in Python-Dev for Dec 2000, but that version is prone to rare infinite loops. The value is in getting *all* the bits of the hash code to participate; and, e.g., this speeds up querying every key in a dict with keys [i << 16 for i in range(20000)] by a factor of 500. Should be equally valuable in any bad case where the high-order hash bits were getting ignored. Also wrote up some of the motivations behind Python's ever-more-subtle hash table strategy.
1 parent dac238b commit 15d4929

2 files changed

Lines changed: 80 additions & 18 deletions

Misc/NEWS

Lines changed: 8 additions & 0 deletions
@@ -116,6 +116,14 @@ Core
   to crash if the element comparison routines for the dict keys and/or
   values mutated the dicts. Making the code bulletproof slowed it down.
 
+- Collisions in dicts now use polynomial division instead of multiplication
+  to generate the probe sequence, following an idea of Christian Tismer's.
+  This allows all bits of the hash code to come into play. It should have
+  little or no effect on speed in ordinary cases, but can help dramatically
+  in bad cases. For example, looking up every key in a dict d with
+  d.keys() = [i << 16 for i in range(20000)] is approximately 500x faster
+  now.
+
 Library
 
 - calendar.py uses month and day names based on the current locale.
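
For reference, a rough timing sketch of the bad case this entry describes. The harness below is not part of the commit, and current CPython dicts use a later probing scheme, so it only shows present-day behavior rather than reproducing the 2001 numbers:

    import time

    keys = [i << 16 for i in range(20000)]   # every key shares its low-order hash bits
    d = {k: None for k in keys}

    t0 = time.perf_counter()
    for k in keys:
        d[k]                                 # look up every key once
    print("20000 lookups: %.4f s" % (time.perf_counter() - t0))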

Objects/dictobject.c

Lines changed: 72 additions & 18 deletions
@@ -31,6 +31,58 @@ k a second time. Theory can be used to find such polys efficiently, but the
 operational defn. of "works" is sufficient to find them in reasonable time
 via brute force program (hint: any poly that has an even number of 1 bits
 cannot work; ditto any poly with low bit 0; exploit those).
+
+Some major subtleties: Most hash schemes depend on having a "good" hash
+function, in the sense of simulating randomness. Python doesn't: some of
+its hash functions are trivial, such as hash(i) == i for ints i (excepting
+i == -1, because -1 is the "error occurred" return value from tp_hash).
+
+This isn't necessarily bad! To the contrary, that our hash tables are powers
+of 2 in size, and that we take the low-order bits as the initial table index,
+means that there are no collisions at all for dicts indexed by a contiguous
+range of ints. This is "better than random" behavior, and that's very
+desirable.
+
+On the other hand, when collisions occur, the tendency to fill contiguous
+slices of the hash table makes a good collision resolution strategy crucial;
+e.g., linear probing is right out.
+
+Reimer Behrends contributed the idea of using a polynomial-based approach,
+using repeated multiplication by x in GF(2**n) where a polynomial is chosen
+such that x is a primitive root. This visits every table location exactly
+once, and the sequence of locations probed is highly non-linear.
+
+The same is also largely true of quadratic probing for power-of-2 tables, of
+the specific
+
+    (i + comb(1, 2)) mod size
+    (i + comb(2, 2)) mod size
+    (i + comb(3, 2)) mod size
+    (i + comb(4, 2)) mod size
+    ...
+    (i + comb(j, 2)) mod size
+
+flavor. The polynomial approach "scrambles" the probe indices better, but
+more importantly allows to get *some* additional bits of the hash code into
+play via computing the initial increment, thus giving a weak form of double
+hashing. Quadratic probing cannot be extended that way (the first probe
+offset must be 1, the second 3, the third 6, etc).
+
+Christian Tismer later contributed the idea of using polynomial division
+instead of multiplication. The problem is that the multiplicative method
+can't get *all* the bits of the hash code into play without expensive
+computations that slow down the initial index and/or initial increment
+computation. For a set of keys like [i << 16 for i in range(20000)], under
+the multiplicative method the initial index and increment were the same for
+all keys, so every key followed exactly the same probe sequence, and so
+this degenerated into a (very slow) linear search. The division method uses
+all the bits of the hash code naturally in the increment, although it *may*
+visit locations more than once until such time as all the high bits of the
+increment have been shifted away. It's also impossible to tell in advance
+whether incr is congruent to 0 modulo poly, so each iteration of the loop has
+to guard against incr becoming 0. These are minor costs, as we usually don't
+get into the probe loop, and when we do we usually get out on its first
+iteration.
 */
 
 static long polys[] = {
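
To make the two update rules concrete, here is a small Python sketch (not part of the commit). It models only the collision loop, i.e. everything after the first probe at hash & mask, on an 8-slot table. It assumes poly = 0b1011 (x**3 + x + 1, a primitive polynomial over GF(2)) and nonnegative hashes; the actual polys[] entries are not shown in this diff, so that value is purely illustrative.

    size = 8
    mask = size - 1
    poly = 0b1011                     # x**3 + x + 1; assumed value, for illustration

    def probes_old(h, n):
        # Old rule: multiply incr by x in GF(2**3) each step.
        i = h & mask
        incr = (h ^ (h >> 3)) & mask
        if not incr:
            incr = mask
        seq = []
        for _ in range(n):
            seq.append((i + incr) & mask)
            incr <<= 1
            if incr > mask:
                incr ^= poly          # clears the bit shifted past the table size
        return seq

    def probes_new(h, n):
        # New rule: divide incr by x each step, so high-order hash bits shift into play.
        i = h & mask
        incr = h ^ (h >> 3)           # all the hash bits, no masking
        seq = []
        for _ in range(n):
            if not incr:
                incr = 1              # guard: incr must never stay 0
            seq.append((i + incr) & mask)
            if incr & 1:
                incr ^= poly          # clears the lowest bit
            incr >>= 1
        return seq

    # Two hashes that agree in their low bits: the old rule yields the identical
    # probe sequence for both, while the new rule separates them once the high
    # bits have shifted down.
    print(probes_old(1 << 6, 10), probes_old(2 << 6, 10))
    print(probes_new(1 << 6, 10), probes_new(2 << 6, 10))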
@@ -204,7 +256,7 @@ static dictentry *
 lookdict(dictobject *mp, PyObject *key, register long hash)
 {
         register int i;
-        register unsigned incr;
+        register unsigned int incr;
         register dictentry *freeslot;
         register unsigned int mask = mp->ma_size-1;
         dictentry *ep0 = mp->ma_table;
@@ -244,13 +296,14 @@ lookdict(dictobject *mp, PyObject *key, register long hash)
         }
         /* Derive incr from hash, just to make it more arbitrary. Note that
            incr must not be 0, or we will get into an infinite loop.*/
-        incr = (hash ^ ((unsigned long)hash >> 3)) & mask;
-        if (!incr)
-                incr = mask;
+        incr = hash ^ ((unsigned long)hash >> 3);
+
         /* In the loop, me_key == dummy is by far (factor of 100s) the
            least likely outcome, so test for that last. */
         for (;;) {
-                ep = &ep0[(i+incr)&mask];
+                if (!incr)
+                        incr = 1; /* and incr will never be 0 again */
+                ep = &ep0[(i + incr) & mask];
                 if (ep->me_key == NULL) {
                         if (restore_error)
                                 PyErr_Restore(err_type, err_value, err_tb);
@@ -282,10 +335,10 @@ lookdict(dictobject *mp, PyObject *key, register long hash)
                 }
                 else if (ep->me_key == dummy && freeslot == NULL)
                         freeslot = ep;
-                /* Cycle through GF(2^n)-{0} */
-                incr <<= 1;
-                if (incr > mask)
-                        incr ^= mp->ma_poly; /* clears the highest bit */
+                /* Cycle through GF(2**n). */
+                if (incr & 1)
+                        incr ^= mp->ma_poly; /* clears the lowest bit */
+                incr >>= 1;
         }
 }
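
Two properties the rewritten loop relies on are easy to check in a few lines of Python (again illustrative, with the same assumed 8-slot table and poly as the sketch above): once incr has been reduced below the table size, repeated division by x cycles through every nonzero increment, so every slot is eventually probed; and an incr that happens to be divisible by the polynomial collapses to 0, which is why the loop re-tests incr on every iteration.

    size, mask, poly = 8, 7, 0b1011   # poly assumed, as before

    def step(incr):
        # One pass through the new loop's update: fold poly in when the low bit
        # is set, then shift right (divide by x over GF(2)).
        if incr & 1:
            incr ^= poly
        return incr >> 1

    # (1) Starting from any nonzero incr below the table size, size-1 steps
    #     visit every nonzero increment exactly once.
    seen, incr = set(), 1
    for _ in range(size - 1):
        seen.add(incr)
        incr = step(incr)
    print(sorted(seen))               # [1, 2, 3, 4, 5, 6, 7]

    # (2) An incr divisible by poly in GF(2) collapses to 0, hence the
    #     "if (!incr) incr = 1" guard inside the loop.
    print(step(poly))                 # 0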

@@ -303,7 +356,7 @@ static dictentry *
 lookdict_string(dictobject *mp, PyObject *key, register long hash)
 {
         register int i;
-        register unsigned incr;
+        register unsigned int incr;
         register dictentry *freeslot;
         register unsigned int mask = mp->ma_size-1;
         dictentry *ep0 = mp->ma_table;
@@ -334,13 +387,14 @@ lookdict_string(dictobject *mp, PyObject *key, register long hash)
         }
         /* Derive incr from hash, just to make it more arbitrary. Note that
            incr must not be 0, or we will get into an infinite loop.*/
-        incr = (hash ^ ((unsigned long)hash >> 3)) & mask;
-        if (!incr)
-                incr = mask;
+        incr = hash ^ ((unsigned long)hash >> 3);
+
         /* In the loop, me_key == dummy is by far (factor of 100s) the
            least likely outcome, so test for that last. */
         for (;;) {
-                ep = &ep0[(i+incr)&mask];
+                if (!incr)
+                        incr = 1; /* and incr will never be 0 again */
+                ep = &ep0[(i + incr) & mask];
                 if (ep->me_key == NULL)
                         return freeslot == NULL ? ep : freeslot;
                 if (ep->me_key == key
@@ -350,10 +404,10 @@ lookdict_string(dictobject *mp, PyObject *key, register long hash)
                         return ep;
                 if (ep->me_key == dummy && freeslot == NULL)
                         freeslot = ep;
-                /* Cycle through GF(2^n)-{0} */
-                incr <<= 1;
-                if (incr > mask)
-                        incr ^= mp->ma_poly; /* clears the highest bit */
+                /* Cycle through GF(2**n). */
+                if (incr & 1)
+                        incr ^= mp->ma_poly; /* clears the lowest bit */
+                incr >>= 1;
         }
 }
