Commit e509b2a

Add notes on use cases with paired accesses to the same key.
1 parent e8b0f04 commit e509b2a

1 file changed

Lines changed: 30 additions & 5 deletions

Objects/dictnotes.txt

Original file line number | Diff line number | Diff line change
@@ -28,13 +28,25 @@ Uniquification
2828
Dictionaries of any size. Bulk of work is in creation.
2929
Repeated writes to a smaller set of keys.
3030
Single read of each key.
31+
Some use cases have two consecutive accesses to the same key.
3132

3233
* Removing duplicates from a sequence.
3334
dict.fromkeys(seqn).keys()
35+
3436
* Counting elements in a sequence.
35-
for e in seqn: d[e]=d.get(e,0) + 1
36-
* Accumulating items in a dictionary of lists.
37-
for k, v in itemseqn: d.setdefault(k, []).append(v)
37+
for e in seqn:
38+
d[e] = d.get(e,0) + 1
39+
40+
* Accumulating references in a dictionary of lists:
41+
42+
for pagenumber, page in enumerate(pages):
43+
for word in page:
44+
d.setdefault(word, []).append(pagenumber)
45+
46+
Note, the second example is a use case characterized by a get and set
47+
to the same key. There are similar use cases with a __contains__
48+
followed by a get, set, or del to the same key. Part of the
49+
justification for d.setdefault is combining the two lookups into one.
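
The difference between paired accesses and a single combined lookup can be
observed from Python by counting how often a key is hashed. The sketch below
is only illustrative: CountingKey, hashes_used, and the three operation
functions are hypothetical names introduced here, and the counts it produces
describe a recent CPython 3 rather than a guarantee of the language.

```python
class CountingKey:
    """Hashable key that records how often a dictionary hashes it."""
    hash_calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


def hashes_used(operation):
    """Run operation(d, key) on a fresh dict; return the __hash__ call count."""
    d, key = {}, CountingKey("word")
    CountingKey.hash_calls = 0
    operation(d, key)
    return CountingKey.hash_calls


def get_then_set(d, key):
    d[key] = d.get(key, 0) + 1          # a get followed by a set


def contains_then_set(d, key):
    if key not in d:                    # a __contains__ followed by a set
        d[key] = []


def single_setdefault(d, key):
    d.setdefault(key, []).append(1)     # one combined lookup


paired = hashes_used(get_then_set)
tested = hashes_used(contains_then_set)
combined = hashes_used(single_setdefault)
print(paired, tested, combined)
```

Each hash call stands in for one search of the table, so the paired forms
are expected to report one more call than the setdefault form.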
3850

3951
Membership Testing
4052
Dictionaries of any size. Created once and then rarely changes.
@@ -44,7 +56,7 @@ Membership Testing
4456
such as with the % formatting operator.
4557

4658
Dynamic Mappings
47-
Characterized by deletions interspersed with adds and replacments.
59+
Characterized by deletions interspersed with adds and replacements.
4860
Performance benefits greatly from the re-use of dummy entries.
4961

5062

@@ -141,6 +153,9 @@ distribution), then there will be more benefit for large dictionaries
141153
because any given key is no more likely than another to already be
142154
in cache.
143155

156+
* In use cases with paired accesses to the same key, the second access
157+
is always in cache and gets no benefit from efforts to further improve
158+
cache locality.
144159

145160
Optimizing the Search of Small Dictionaries
146161
-------------------------------------------
@@ -184,7 +199,7 @@ sizes and access patterns, the user may be able to provide useful hints.
184199
more quickly because the first half of the keys will be inserted into
185200
a more sparse environment than before. The preconditions for this
186201
strategy arise whenever a dictionary is created from a key or item
187-
sequence of known length.
202+
sequence and the number of unique keys is known.
188203

189204
3) If the key space is large and the access pattern is known to be random,
190205
then search strategies exploiting cache locality can be fruitful.
@@ -218,3 +233,13 @@ spend in the collision resolution loop).
218233
An additional possibility is to insert links into the empty spaces
219234
so that dictionary iteration can proceed in len(d) steps instead of
220235
(mp->mask + 1) steps.
236+
237+
238+
Caching Lookups
239+
---------------
240+
The idea is to exploit key access patterns by anticipating future lookups
241+
based on previous lookups.
242+
243+
The simplest incarnation is to save the most recently accessed entry.
244+
This gives optimal performance for use cases where every get is followed
245+
by a set or del to the same key.
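
A rough Python model of the most-recently-accessed-entry idea is sketched
below. It is an assumption-laden illustration, not the C implementation:
LastEntryCache and its attributes are hypothetical names, an ordinary dict
stands in for the hash table, a one-element list stands in for a table
entry, and deletion is left out to keep the sketch short.

```python
_UNSET = object()  # sentinel so that None remains usable as a key


class LastEntryCache:
    """Mapping sketch that remembers the entry touched by the last access."""

    def __init__(self):
        self._table = {}          # key -> one-element list acting as the entry
        self._last_key = _UNSET
        self._last_entry = None
        self.lookups = 0          # searches of the underlying table

    def _remember(self, key, entry):
        self._last_key, self._last_entry = key, entry

    def get(self, key, default=None):
        if key is self._last_key:             # identity test, no hashing
            return self._last_entry[0]
        self.lookups += 1
        entry = self._table.get(key)
        if entry is None:                     # a miss leaves the cache alone
            return default
        self._remember(key, entry)
        return entry[0]

    def set(self, key, value):
        if key is self._last_key:             # second access of a pair
            self._last_entry[0] = value
            return
        self.lookups += 1
        entry = self._table.setdefault(key, [None])
        entry[0] = value
        self._remember(key, entry)


counts = LastEntryCache()
seqn = ["a", "b", "a", "c", "a", "b"]
for e in seqn:
    counts.set(e, counts.get(e, 0) + 1)       # get then set on the same key
print(counts.lookups)
```

For this six-element sequence the loop searches the table 9 times rather
than the 12 that two uncached accesses per element would need. The saving
comes from keys already present, where the set reuses the entry found by
the preceding get.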
