Commit 41bd022

SF bug #942952: Weakness in tuple hash
(Basic approach and test concept by Tim Peters.)

* Improved the hash to reduce collisions.
* Added the torture test to the test suite.
1 parent 504239f commit 41bd022

3 files changed

Lines changed: 25 additions & 2 deletions

Lib/test/test_tuple.py

Lines changed: 19 additions & 0 deletions
@@ -41,6 +41,25 @@ def f():
                 yield i
         self.assertEqual(list(tuple(f())), range(1000))
 
+    def test_hash(self):
+        # See SF bug 942952: Weakness in tuple hash
+        # The hash should:
+        #      be non-commutative
+        #      should spread-out closely spaced values
+        #      should not exhibit cancellation in tuples like (x,(x,y))
+        #      should be distinct from element hashes: hash(x)!=hash((x,))
+        # This test exercises those cases.
+        # For a pure random hash and N=50, the expected number of collisions
+        #      is 7.3. Here we allow twice that number.
+        #      Any worse and the hash function is sorely suspect.
+
+        N=50
+        base = range(N)
+        xp = [(i, j) for i in base for j in base]
+        inps = base + [(i, j) for i in base for j in xp] + \
+                     [(i, j) for i in xp for j in base] + xp + zip(base)
+        collisions = len(inps) - len(set(map(hash, inps)))
+        self.assert_(collisions <= 15)
 
 def test_main():
     test_support.run_unittest(TupleTest)
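
The size of the test's input set and the collision count expected from an ideal hash can be checked with a short sketch. The helper names `build_inputs` and `expected_collisions` are hypothetical and not part of the commit. The input construction is a Python 3 rendering of the Python 2 code in the test, and the estimate assumes a 32-bit hash and a simple pair-counting (birthday) approximation, which may differ slightly from whatever calculation produced the 7.3 quoted in the comment.

```python
def build_inputs(n=50):
    """Python 3 rendering of the inputs built in test_hash."""
    base = list(range(n))
    xp = [(i, j) for i in base for j in base]
    return (base
            + [(i, j) for i in base for j in xp]
            + [(i, j) for i in xp for j in base]
            + xp
            + list(zip(base)))


def expected_collisions(count, hash_bits=32):
    """Pair-counting estimate for `count` uniformly random hashes.

    Assumption: the hash is hash_bits wide (32 bits here, matching a
    32-bit C long); every pair collides with probability 2**-hash_bits.
    """
    pairs = count * (count - 1) / 2
    return pairs / 2 ** hash_bits


if __name__ == "__main__":
    inps = build_inputs()
    print(len(inps))                                 # 252600
    print(round(expected_collisions(len(inps)), 1))  # 7.4
```

Under these assumptions the estimate comes out at roughly 7.4, close to the figure in the test comment, so the threshold of 15 is about twice what an ideal 32-bit hash would be expected to produce.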

Misc/NEWS

Lines changed: 3 additions & 0 deletions
@@ -12,6 +12,9 @@ What's New in Python 2.4 alpha 1?
 Core and builtins
 -----------------
 
+- Improved the tuple hashing algorithm to give fewer collisions in
+  common cases. Fixes bug #942952.
+
 - Implemented generator expressions (PEP 289). Coded by Jiwon Seo.
 
 - Enabled the profiling of C extension functions (and builtins) - check

Objects/tupleobject.c

Lines changed: 3 additions & 2 deletions
@@ -262,15 +262,16 @@ tuplehash(PyTupleObject *v)
     register long x, y;
     register int len = v->ob_size;
     register PyObject **p;
+    long mult = 1000003L;
     x = 0x345678L;
     p = v->ob_item;
     while (--len >= 0) {
         y = PyObject_Hash(*p++);
         if (y == -1)
             return -1;
-        x = (1000003*x) ^ y;
+        x = (x ^ y) * mult;
+        mult += 69068L + len + len;
     }
-    x ^= v->ob_size;
     if (x == -1)
         x = -2;
     return x;
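
The effect of this change can be explored outside C with a rough Python model of the two loops. This is a sketch under stated assumptions, not CPython's implementation: it assumes a 32-bit `long` that wraps on overflow, assumes small integers hash to themselves, and uses illustrative names (`old_tuplehash`, `new_tuplehash`, `count_collisions`) that do not appear in the commit.

```python
MASK = 0xFFFFFFFF  # assumption: C long is 32 bits wide


def to_signed(x):
    """Wrap x to a signed 32-bit value, as a wrapping 32-bit long would."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def elem_hash(obj, tuple_hash):
    """Assumed element hash: ints hash to themselves, tuples recurse."""
    if isinstance(obj, tuple):
        return tuple_hash(obj)
    return -2 if obj == -1 else obj


def old_tuplehash(t):
    """Model of the loop before the commit."""
    x = 0x345678
    for item in t:
        y = elem_hash(item, old_tuplehash)
        x = to_signed((1000003 * x) ^ y)
    x = to_signed(x ^ len(t))
    return -2 if x == -1 else x


def new_tuplehash(t):
    """Model of the loop after the commit."""
    x = 0x345678
    mult = 1000003
    remaining = len(t)
    for item in t:
        remaining -= 1  # mirrors `while (--len >= 0)`
        y = elem_hash(item, new_tuplehash)
        x = to_signed((x ^ y) * mult)
        mult = to_signed(mult + 69068 + remaining + remaining)
    return -2 if x == -1 else x


def count_collisions(tuple_hash, n=50):
    """Collision count over the same input shape as test_hash."""
    base = list(range(n))
    xp = [(i, j) for i in base for j in base]
    inps = (base
            + [(i, j) for i in base for j in xp]
            + [(i, j) for i in xp for j in base]
            + xp
            + list(zip(base)))
    hashes = {elem_hash(obj, tuple_hash) for obj in inps}
    return len(inps) - len(hashes)


if __name__ == "__main__":
    print("old:", count_collisions(old_tuplehash))
    print("new:", count_collisions(new_tuplehash))
```

In this model the old formula makes the hash of `(i, (j, k))` symmetric in `i` and `j`, because the final multiply-then-XOR step leaves the inner tuple's hash unmixed and the two length XORs cancel. For `(x, (x, y))` the multiplied terms cancel as well, leaving just `y`. Multiplying after the XOR and varying `mult` per position removes that structure.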
