MutableMapping.update performance improvement #135575

Closed as not planned
@randolf-scholz

Description

Feature or enhancement

Proposal:

There is an easy performance win available for the MutableMapping.update method: in the Mapping case, iterate over other.items() instead of iterating over the keys and looking up other[key] for each one. In microbenchmarks that update from a dict, this is 7%-83% faster.
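To make the saving concrete, here is a small sketch that counts how often the source mapping is indexed under each strategy. `CountingDict` is a hypothetical helper written only for this illustration; it is not part of the proposal or of the standard library.

```python
from collections.abc import MutableMapping


class CountingDict(dict):
    """Hypothetical dict subclass that counts __getitem__ calls."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = 0

    def __getitem__(self, key):
        self.lookups += 1
        return super().__getitem__(key)


source = CountingDict(a=1, b=2, c=3)

# Current behaviour: iterate the keys, then index the source once per key.
target_keys = {}
MutableMapping.update(target_keys, source)
lookups_via_keys = source.lookups

# Proposed behaviour: iterate (key, value) pairs directly.
source.lookups = 0
target_items = {}
for key, value in source.items():
    target_items[key] = value
lookups_via_items = source.lookups

print(lookups_via_keys, lookups_via_items)
```

Both targets end up with the same contents, but the key-based loop indexes the source once per key while the items-based loop does not index it at all. The same sketch also hints at a behavioural difference worth weighing: a dict subclass that overrides `__getitem__` without overriding `items()` would be read differently by the two loops.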

def update(self, other=(), /, **kwds):
    ''' D.update([E, ]**F) -> None. Update D from mapping/iterable E and F.
        If E present and has a .keys() method, does: for k in E.keys(): D[k] = E[k]
        If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v
        In either case, this is followed by: for k, v in F.items(): D[k] = v
    '''
    if isinstance(other, Mapping):
        for key in other:
            self[key] = other[key]
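For comparison, the proposed change could look roughly like the sketch below. Only the `Mapping` branch differs from the excerpt above. The `.keys()` and iterable-of-pairs branches are written from the method's docstring and are assumed to stay unchanged. `update_with_items` and `DictBacked` are hypothetical names used only for this illustration.

```python
from collections.abc import Mapping, MutableMapping


def update_with_items(self, other=(), /, **kwds):
    """Variant of MutableMapping.update that reads Mapping inputs via .items()."""
    if isinstance(other, Mapping):
        # Proposed change: one pass over (key, value) pairs instead of
        # iterating keys and calling other[key] for each one.
        for key, value in other.items():
            self[key] = value
    elif hasattr(other, "keys"):
        # Assumed unchanged: objects that only expose .keys() and indexing.
        for key in other.keys():
            self[key] = other[key]
    else:
        # Assumed unchanged: an iterable of (key, value) pairs.
        for key, value in other:
            self[key] = value
    for key, value in kwds.items():
        self[key] = value


class DictBacked(MutableMapping):
    """Hypothetical minimal MutableMapping used only to exercise the sketch."""

    update = update_with_items

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


store = DictBacked()
store.update({"a": 1}, b=2)   # Mapping branch plus keyword arguments
store.update([("c", 3)])      # iterable-of-pairs branch
print(dict(store))
```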

performance demo
from collections.abc import MutableMapping, Mapping
import time


def update_new(left, right):
    if isinstance(right, Mapping):
        for k, v in right.items():
            left[k] = v
        return
    raise RuntimeError


update_old = MutableMapping.update

d_str_small = {f"{i:010}": i for i in range(10)}
d_str_large = {f"{i:010}": i for i in range(1_000_000)}
d_int_small = {i: i for i in range(10)}
d_int_large = {i: i for i in range(1_000_000)}


def count_loops(fn, timeout=5):
    r"""Run `fn` until timeout is reached, counting the number of calls."""
    start = time.time()
    count = 0
    while time.time() - start < timeout:
        fn()
        count += 1
    return count


cases = [
    ("str_small, str_small", d_str_small, d_str_small),
    ("str_small, str_large", d_str_small, d_str_large),
    ("str_large, str_small", d_str_large, d_str_small),
    ("str_large, str_small", d_str_large, d_str_small),
    ("int_small, int_small", d_int_small, d_int_small),
    ("int_small, int_large", d_int_small, d_int_large),
    ("int_large, int_small", d_int_large, d_int_small),
    ("int_large, int_small", d_int_large, d_int_small),
]

results = []
for case, left, right in cases:
    n_new = count_loops(lambda: update_new(left, right))
    n_old = count_loops(lambda: update_old(left, right))
    improvement = (n_new - n_old) / n_old
    result = {
        "Case": case,
        "new": n_new,
        "old": n_old,
        "Improvement (%)": f"{100 * improvement:.1f}",
    }
    results.append(result)
    print(f"{case=} {n_new=:10d}  {n_old=:10d}  {improvement=: 4.1%}", flush=True)

results on my machine

case='str_small, str_small' n_new=  14283733  n_old=  13351987  improvement= 7.0%
case='str_small, str_large' n_new=        78  n_old=        43  improvement= 81.4%
case='str_large, str_small' n_new=        80  n_old=        44  improvement= 81.8%
case='str_large, str_small' n_new=        79  n_old=        43  improvement= 83.7%
case='int_small, int_small' n_new=  14793573  n_old=  12938031  improvement= 14.3%
case='int_small, int_large' n_new=       269  n_old=       207  improvement= 30.0%
case='int_large, int_small' n_new=       271  n_old=       202  improvement= 34.2%
case='int_large, int_small' n_new=       273  n_old=       203  improvement= 34.5%
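One thing to keep in mind when reading these numbers: `update` mutates its left operand in place, so once `d_str_small` has been updated from `d_str_large` it holds 1,000,000 keys for every later case. This may be why the "large, small" rows report loop counts close to the "small, large" rows. The sketch below is one possible way to rerun the comparison with a fresh copy of the left operand for each measurement. `best_time` is a hypothetical helper, and the sizes are reduced for illustration.

```python
import timeit
from collections.abc import Mapping, MutableMapping


def update_new(left, right):
    # Same proposed loop as in the demo, limited to the Mapping case.
    if isinstance(right, Mapping):
        for k, v in right.items():
            left[k] = v
        return
    raise RuntimeError


update_old = MutableMapping.update


def best_time(fn, left, right, repeat=5):
    """Best wall time of fn(copy_of_left, right) over `repeat` runs."""
    timer = timeit.Timer(
        "fn(target, right)",
        # The setup runs before each timed run, so the copy is not measured
        # and the original `left` is never mutated.
        setup="target = left.copy()",
        globals={"fn": fn, "left": left, "right": right},
    )
    return min(timer.repeat(repeat=repeat, number=1))


small = {i: i for i in range(10)}
large = {i: i for i in range(100_000)}

for name, left, right in [("small, large", small, large),
                          ("large, small", large, small)]:
    t_new = best_time(update_new, left, right)
    t_old = best_time(update_old, left, right)
    print(f"{name}: new={t_new:.6f}s  old={t_old:.6f}s")
```

With `number=1`, the measurements for tiny inputs are dominated by timer and call overhead, so this sketch is mainly useful for the large-source case. No timings are quoted here because they depend on the machine and Python version.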

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Labels: performance, type-feature
