Commit ce85acf

Merge: #20647: Update dictobject.c comments to account for randomized string hashes.
2 parents 20bd3b0 + 537ad7a commit ce85acf

1 file changed

Lines changed: 5 additions & 8 deletions

Objects/dictobject.c

@@ -88,20 +88,17 @@ it's USABLE_FRACTION (currently two-thirds) full.
 /*
 Major subtleties ahead: Most hash schemes depend on having a "good" hash
 function, in the sense of simulating randomness. Python doesn't: its most
-important hash functions (for strings and ints) are very regular in common
+important hash functions (for ints) are very regular in common
 cases:

->>> map(hash, (0, 1, 2, 3))
+>>>[hash(i) for i in range(4)]
 [0, 1, 2, 3]
->>> map(hash, ("namea", "nameb", "namec", "named"))
-[-1658398457, -1658398460, -1658398459, -1658398462]
->>>

 This isn't necessarily bad! To the contrary, in a table of size 2**i, taking
 the low-order i bits as the initial table index is extremely fast, and there
-are no collisions at all for dicts indexed by a contiguous range of ints.
-The same is approximately true when keys are "consecutive" strings. So this
-gives better-than-random behavior in common cases, and that's very desirable.
+are no collisions at all for dicts indexed by a contiguous range of ints. So
+this gives better-than-random behavior in common cases, and that's very
+desirable.

 OTOH, when collisions occur, the tendency to fill contiguous slices of the
 hash table makes a good collision resolution strategy crucial. Taking only
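
The claim in the comment that a contiguous range of ints produces no collisions can be checked with a short Python sketch. It models only the initial index (the low-order i bits of the hash), not CPython's actual C lookup or its collision resolution. The helper name `initial_slots` is hypothetical and is not taken from dictobject.c.

```python
def initial_slots(keys, i):
    """Return the initial table index of each key for a table of size 2**i.

    Hypothetical helper for illustration only: it mimics "take the
    low-order i bits of the hash" and nothing else.
    """
    mask = (1 << i) - 1  # a table of size 2**i keeps the low-order i bits
    return [hash(k) & mask for k in keys]


# A contiguous range of small ints maps to distinct slots: no collisions.
contiguous = initial_slots(range(8), 3)
print(contiguous)
print(len(set(contiguous)) == len(contiguous))

# Keys that differ only above the low-order i bits all land in one slot,
# which is why the collision resolution strategy matters.
print(initial_slots([0, 8, 16, 24], 3))
```

String keys are left out of the sketch on purpose: as the merge title notes, string hashes are randomized, so their slot assignments are not reproducible from one interpreter run to the next.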
