Populate the initial per-interpreter interned_strings dict with runtime-global singleton strings #103571

Closed as not planned
@Christopher-Chianelli

Description

Feature or enhancement

Re-use the runtime-global singleton strings inside the interned_strings dict to reduce duplication of
singleton strings (improving performance where such strings are used).

Pitch

bpo-46430 (#30683) had an interesting side effect: the expression
x = 'a'; x[0] is x no longer evaluates to True. This is because
there are now two different cached versions of 'a':

  • One that was cached when code in frozen modules was compiled
    (and is stored in the interned_dict)
  • One that is stored as a runtime-global object that is used
    during function calls (and is stored in _Py_SINGLETON(strings))

However, some characters do not have this behaviour (for example,
'g', 'u', and 'z'). I suspect this is because these characters are
not used in co_consts of frozen modules.
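A minimal sketch for checking which characters are affected on a given build. It assumes that evaluating a freshly compiled literal yields the object stored in co_consts, while indexing yields the runtime-global singleton. The helper names literal_survives_indexing and probe are hypothetical, and the result depends on the CPython version, so no expected output is given:

```python
import string


def literal_survives_indexing(ch: str) -> bool:
    # Compile a fresh one-character literal so that x is the constant
    # object produced by the compiler rather than the loop variable.
    x = eval(repr(ch))
    # Indexing returns CPython's cached one-character string; compare identity.
    return x[0] is x


def probe(chars: str = string.ascii_lowercase) -> dict:
    # Map each character to whether its literal and its indexed form
    # are the same object on this interpreter.
    return {ch: literal_survives_indexing(ch) for ch in chars}
```

Calling probe() and listing the keys whose value is False shows which characters currently have two distinct cached objects.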

The interned_dict is per interpreter, and is initialized by
init_interned_dict(PyInterpreterState *). Currently, it is
initialized to an empty dict, which allows code in frozen modules
to use its own (different, per-interpreter) singleton strings
instead of the runtime-global ones.
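The difference between an empty and a pre-populated table can be illustrated with a toy model in Python. This is only a sketch of the idea, not CPython's C implementation: InternTable and Obj are hypothetical names, and a str subclass stands in for separately allocated string objects that compare equal.

```python
class Obj(str):
    """Stand-in for a separately allocated string object (hypothetical)."""


class InternTable:
    """Toy model of a per-interpreter interned-strings dict."""

    def __init__(self, seed=()):
        # Optionally pre-populate the table with existing objects.
        self._table = {s: s for s in seed}

    def intern(self, s):
        # Return the object already stored for an equal string,
        # or store and return s if there is none.
        return self._table.setdefault(s, s)


singleton_a = Obj("a")  # plays the role of the runtime-global singleton
frozen_a = Obj("a")     # plays the role of a constant from a frozen module

# Empty table: the frozen-module copy becomes the interned object.
empty = InternTable()
assert empty.intern(frozen_a) is frozen_a
assert empty.intern(frozen_a) is not singleton_a

# Seeded table: the singleton is reused and no second copy is kept.
seeded = InternTable(seed=[singleton_a])
assert seeded.intern(frozen_a) is singleton_a
```

In the seeded case every later lookup of an equal string resolves to the same object, which is the behaviour this issue proposes for the real interned_strings dict.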

Using the synthetic test case:

import timeit

def test():
    total = 0
    for ch in 'abc':
        if ch in {'a', 'c'}:
            total += 1
    return total

# The statement to time must be passed as a string (or a callable).
timeit.timeit('test()', globals=globals())

I get a ~5.43% improvement when the interned_strings dict reuses the runtime-global singleton strings.

Labels

interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), type-feature (A feature request or enhancement)
