Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Optimize ABC caches #383

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Feb 10, 2017
Merged

Conversation

ilevkivskyi
Copy link
Member

Extensive use of isinstance with ABCs causes significant increase in memory consumption by generics, see https://mail.python.org/pipermail/python-dev/2017-January/147194.html

Here I make _abc_cache and friends shortcut to the type's __extra__ or _gorg (using descriptors where necessary). This allows to reduce memory consumption up to 4x on a micro-benchmark simulating conditions described in the python-dev thread. (There is of course a speed penalty, but it seems to be minor).

@methane Please take a look.

@ilevkivskyi
Copy link
Member Author

@methane I also inlined internal helpers to speed-up isinstance() up to 2x.

@methane
Copy link
Member

methane commented Feb 2, 2017

Because abc is full of magic, I can't say LGTM.
But basic idea seems very right. Thanks to fix it.
I'll check startup time and memory consumption.

BTW, this issue is new from 3.5.3 and 3.6.0?
Does Python 3.5.2 have less ABCs when using lot's of typing.List[X] for hundreds of classes?

self._abc_cache = self.__extra__._abc_cache
elif self.__origin__ is not None:
self._abc_registry = self.__origin__._abc_registry
self._abc_cache = self.__origin__._abc_cache
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about sharing self._abc_negative_cache too?

Copy link
Member Author

@ilevkivskyi ilevkivskyi Feb 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._abc_negative_cache is also shared, but it could be overwritten by abc.py on negative cache invalidation, this is why I made it a descriptor 7 lines below. (Positive cache is never invalidated, so that the sharing could be realised as a simple assignment on instantiation).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tracemalloc shows this.
I think removing _abc_negative_cache at __init__ or __new__ is worth enough.

125.34KiB / count=573
  File "/Users/inada-n/local/py37/lib/python3.7/_weakrefset.py", line 48
    self._iterating = set()
  File "/Users/inada-n/local/py37/lib/python3.7/abc.py", line 147
    cls._abc_negative_cache = WeakSet()
  File "/Users/inada-n/local/py37/lib/python3.7/typing.py", line 125
    return super().__new__(cls, name, bases, namespace)

123.44KiB / count=1193
  File "/Users/inada-n/local/py37/lib/python3.7/_weakrefset.py", line 38
    def _remove(item, selfref=ref(self)):
  File "/Users/inada-n/local/py37/lib/python3.7/abc.py", line 147
    cls._abc_negative_cache = WeakSet()
  File "/Users/inada-n/local/py37/lib/python3.7/typing.py", line 125
    return super().__new__(cls, name, bases, namespace)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not just remove them, since some other places expect their presence. However, I found a bug in _abc_negative_cache setters, and I made most assignments to _abc_negative_cache a no-op. Could you please check the memory/speed situation once more with the newest commit in my branch?

@ilevkivskyi
Copy link
Member Author

@methane

BTW, this issue is new from 3.5.3 and 3.6.0?

I think it was always like this. Actually, it was maybe even worse. There were some improvements in 3.5.3 and 3.6.0 due to generic type caches (these are unrelated caches that actually help :-))

@methane
Copy link
Member

methane commented Feb 2, 2017

Memory consumption goes wrong.

996.90KiB / count=3450
  File "/Users/inada-n/local/py37/lib/python3.7/abc.py", line 133
    cls = super().__new__(mcls, name, bases, namespace)
  File "/Users/inada-n/local/py37/lib/python3.7/typing.py", line 125
    return super().__new__(cls, name, bases, namespace)
  File "/Users/inada-n/local/py37/lib/python3.7/typing.py", line 963
    self = super().__new__(cls, name, bases, namespace, _root=True)
  File "/Users/inada-n/local/py37/lib/python3.7/typing.py", line 1126
    orig_bases=self.__orig_bases__)
  File "/Users/inada-n/local/py37/lib/python3.7/typing.py", line 642
    return cached(*args, **kwds)

@methane
Copy link
Member

methane commented Feb 2, 2017

I can reproduce above with original typing.py.

It wasn't in https://gist.github.com/methane/3c34f11fb677365a7e92afe73aca24e7
So it may caused by some internal mechanizm.

("", line 488 usually includes memory for
intern dict. Some internal cache affects tracemalloc.)

@ilevkivskyi
Copy link
Member Author

The last trace you posted (with 996.90KiB / count=3450) seems to be unrelated to _abc_caches. This is normal creation of generic classes. It could be a problem with tracemalloc. If necessary, you could clean-up generic type caches:

for c in typing._cleanups:
    c()

I don't know how to clean-up other internal caches.

Could you please compare the master version and my patch both in same conditions "from cold start"?


If you are interested, I am running a "micro-benchmark":

import sys
from typing import *
from collections.abc import Iterable

classes = []
for i in range(1000):
    classes.append(type('C'+str(i), (), {}))
    dummy = List[classes[i]]

for i in range(1000):
    dummy = isinstance(classes[i](), Iterable)

sys._debugmallocstats()

This gives me total memory consumption 5.5MB with my patch vs 48MB for typing/master.

@methane
Copy link
Member

methane commented Feb 2, 2017 via email

@ilevkivskyi
Copy link
Member Author

There are no regressions.

OK, thanks! Do you think the PR is satisfactory?
We could think about other possible optimizations later.

@methane
Copy link
Member

methane commented Feb 2, 2017

It looks nice to me.
But I'm not wizard of metaclass. I'm not good code reviewer of this module.

@ilevkivskyi
Copy link
Member Author

@methane
Thank you, it was very helpful to see the benchmark on a large code base.

@gvanrossum
Copy link
Member

I continue to lack the energy to review/approve this (I can handle only moderately complex reviews ATM). Given that the 3.6.1 candidate is expected in about two weeks (see PEP 494) maybe the best way forward is to merge this (and the coverage PR that depends on it) no, merge everything upstream, and hope for the best? The memory stats look encouraging for sure!

@ilevkivskyi
Copy link
Member Author

OK, let's do this. After merging, I will make additional checks (I do refleak checks from time to time, and run full CPython test suite after importing typing).

@ilevkivskyi ilevkivskyi merged commit 58c86ad into python:master Feb 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants