@@ -94,7 +94,7 @@ Tunable Dictionary Parameters
 * Growth rate upon hitting maximum load.  Currently set to *2.
   Raising this to *4 results in half the number of resizes,
   less effort to resize, better sparseness for some (but not
-  all dict sizes), and potentially double memory consumption
+  all dict sizes), and potentially doubles memory consumption
   depending on the size of the dictionary.  Setting to *4
   eliminates every other resize step.

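The effect of the growth rate on resize counts can be illustrated with a toy model (this is a sketch, not CPython's actual resize logic; the initial size, load limit, and growth factor here are illustrative assumptions):

```python
# Toy model of table growth: starting from 8 slots, count how many
# resizes are needed to hold n entries when the table grows by *2
# versus *4 each time the 2/3 load limit is reached.

def count_resizes(n, growth, initial=8, max_load=2/3):
    size, resizes = initial, 0
    while n > size * max_load:
        size *= growth
        resizes += 1
    return resizes

print(count_resizes(100_000, 2))  # 15 resizes with *2 growth
print(count_resizes(100_000, 4))  # 8 resizes with *4 growth
```

As the notes say, the faster growth rate roughly halves the number of resize steps, at the cost of a potentially much larger final table.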
@@ -112,6 +112,8 @@ iteration and key listing.  Those methods loop over every potential
 entry.  Doubling the size of dictionary results in twice as many
 non-overlapping memory accesses for keys(), items(), values(),
 __iter__(), iterkeys(), iteritems(), itervalues(), and update().
+Also, every dictionary iterates at least twice, once for the memset()
+when it is created and once by dealloc().
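In the table layout these notes describe, iteration must visit every potential entry, not just the occupied ones. A minimal sketch of that cost model (a toy list-of-slots, not the real C structure):

```python
# Toy illustration of why iteration cost tracks table size rather
# than len(d): a keys() scan must visit every slot, occupied or not.

NULL = object()          # marker for an empty slot

def toy_keys(table):
    """Scan all slots, returning the occupied ones and the slot count."""
    slots_visited = 0
    found = []
    for slot in table:
        slots_visited += 1
        if slot is not NULL:
            found.append(slot)
    return found, slots_visited

table = [NULL] * 64
table[3], table[17], table[42] = "a", "b", "c"
found, visited = toy_keys(table)
print(found)    # ['a', 'b', 'c']
print(visited)  # 64 -- every slot scanned to find just 3 keys
```

Doubling the table size doubles `visited` even when `len(found)` stays the same, which is why sparseness trades iteration time for lookup time.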


 Results of Cache Locality Experiments
@@ -191,6 +193,8 @@ sizes and access patterns, the user may be able to provide useful hints.
    is not at a premium, the user may benefit from setting the maximum load
    ratio at 5% or 10% instead of the usual 66.7%.  This will sharply
    curtail the number of collisions but will increase iteration time.
+   The builtin namespace is a prime example of a dictionary that can
+   benefit from being highly sparse.

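The collision benefit of a low load ratio can be simulated with a toy open-addressing table (linear probing is used here for simplicity; CPython uses a different, perturbed probe sequence, and the table sizes below are illustrative):

```python
# Insert the same keys into a table at roughly 65% load and at a much
# sparser load, and compare the total number of collision probes.

def probes_to_insert(keys, size):
    table = [None] * size
    probes = 0
    for k in keys:
        i = hash(k) % size
        while table[i] is not None:   # collision: probe the next slot
            probes += 1
            i = (i + 1) % size
        table[i] = k
    return probes

keys = [f"key{n}" for n in range(1000)]
dense = probes_to_insert(keys, 1536)    # ~65% load
sparse = probes_to_insert(keys, 16384)  # ~6% load
print(dense, sparse)  # the sparse table needs far fewer probes
```

The sparse table resolves most insertions on the first probe, at the cost of a much larger scan during iteration, matching the trade-off described above.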
 2) Dictionary creation time can be shortened in cases where the ultimate
    size of the dictionary is known in advance.  The dictionary can be
@@ -199,7 +203,7 @@ sizes and access patterns, the user may be able to provide useful hints.
    more quickly because the first half of the keys will be inserted into
    a more sparse environment than before.  The preconditions for this
    strategy arise whenever a dictionary is created from a key or item
-   sequence and the number of *unique* keys is known.
+   sequence and the number of *unique* keys is known.

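The presizing idea can be sketched with the same kind of toy growth model (assumed initial size, growth factor, and load limit; not CPython's actual constants): when the number of unique keys is known up front, the final table size can be chosen immediately and every intermediate resize, with its rehash of all keys already inserted, disappears.

```python
# Compare incremental growth against presizing for a known key count.

def build(keys, initial=8, growth=2, max_load=2/3):
    """Count resizes when the table grows incrementally during inserts."""
    size, resizes = initial, 0
    for n, _ in enumerate(keys, start=1):
        if n > size * max_load:      # over the load limit: resize
            size *= growth
            resizes += 1
    return size, resizes

def presized(nkeys, initial=8, growth=2, max_load=2/3):
    """Pick the final table size up front; no resizes ever occur."""
    size = initial
    while nkeys > size * max_load:
        size *= growth
    return size

keys = range(10_000)
size, resizes = build(keys)
print(resizes)                    # 11 intermediate resizes...
print(presized(len(keys)), size)  # ...versus none when presized: 16384 16384
```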
 3) If the key space is large and the access pattern is known to be random,
    then search strategies exploiting cache locality can be fruitful.
@@ -228,11 +232,12 @@ The dictionary can be immediately rebuilt (eliminating dummy entries),
 resized (to an appropriate level of sparseness), and the keys can be
 jostled (to minimize collisions).  The lookdict() routine can then
 eliminate the test for dummy entries (saving about 1/4 of the time
-spend in the collision resolution loop).
+spent in the collision resolution loop).

 An additional possibility is to insert links into the empty spaces
 so that dictionary iteration can proceed in len(d) steps instead of
-(mp->mask + 1) steps.
+(mp->mask + 1) steps.  Alternatively, a separate tuple of keys can be
+kept just for iteration.
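At the Python level, both ideas have rough equivalents (a sketch of the technique, not what CPython does internally): rebuilding key by key into a fresh dictionary discards any dummy entries left behind by deletions, and a tuple of keys taken at that moment gives a compact, stable sequence to iterate over.

```python
# Rebuild a heavily-mutated dict and snapshot its keys for iteration.

d = {n: n * n for n in range(1000)}
for n in range(0, 1000, 2):          # heavy deletion leaves dead entries
    del d[n]

d = {k: v for k, v in d.items()}     # rebuild into a fresh, compact table
iter_keys = tuple(d)                 # key snapshot kept just for iteration

print(len(d), len(iter_keys))        # 500 500
```

The snapshot is only safe while the dictionary is in its steady state; any later insertion or deletion makes the tuple stale.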


 Caching Lookups