Globally cache single TexManager instances. #13113

anntzer · 2019-01-05T22:44:41Z

This allows sharing its caches across renderer instances.

(If it was up to me this class would be replaced by module-level
functions and a module-level cache, but heh.)

PR Summary

PR Checklist

Has Pytest style unit tests
Code is Flake 8 compliant
New features are documented, with examples if plot related
Documentation is sphinx and numpydoc compliant
Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

timhoffm · 2019-01-06T11:05:42Z

TexManager carries some state, e.g. font_family. Do we get into inconsistencies when changing e.g. rcParams['font.family'] and then reusing the existing instance?

anntzer · 2019-01-06T12:25:57Z

TexManager.get_font_config has logic to reinitialize itself when rcParams change, so we're safe.
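
For readers unfamiliar with the pattern, here is a minimal sketch of "reinitialize when the configuration changes". It is not matplotlib's code: `settings` stands in for `rcParams`, and `ConfigAwareManager` and its attributes are hypothetical names chosen for illustration.

```python
# Hypothetical stand-in for rcParams; not matplotlib's real object.
settings = {"font.family": "serif"}


class ConfigAwareManager:
    """Toy manager that rebuilds derived state when the settings change."""

    def __init__(self):
        self._seen = None        # snapshot of the settings used last time
        self.rebuild_count = 0   # how often derived state was refreshed
        self.font_family = None

    def get_font_config(self):
        snapshot = tuple(sorted(settings.items()))
        if snapshot != self._seen:
            # Settings changed since the last call: refresh derived state.
            self._seen = snapshot
            self.font_family = settings["font.family"]
            self.rebuild_count += 1
        return self.font_family


manager = ConfigAwareManager()
manager.get_font_config()
manager.get_font_config()               # unchanged settings: no rebuild
settings["font.family"] = "sans-serif"
manager.get_font_config()               # changed settings: rebuild
```

Under this scheme a long-lived instance stays consistent with the settings as long as every consumer goes through the checking method first.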

timhoffm · 2019-01-06T13:46:25Z

Hm, lots of magic.

This would be ok then, but hard to understand. What's the gain of having just a single instance and is it worth the complexity?

anntzer · 2019-01-06T13:53:22Z

texmanager caches its results (see rgba_arrayd, grey_arrayd) but right now this caching is only used for a single renderer; the next renderer call (e.g. if one saves to pdf or svg) needs to redo all that computation. Always returning the same instance allows reusing that cache.

I initially thought that textpath did the same, but apparently it doesn't, so I guess we don't need to do that; undoing that part.

lib/matplotlib/textpath.py

tacaswell · 2019-01-06T23:03:08Z

> Hm, lots of magic.

There are two hard problems in programming: naming things, cache invalidation, and off-by-one bugs.

jklymak · 2019-01-15T17:26:01Z

This seems to do what you say. How did you test that it does what you say and that there are no adverse consequences?

anntzer · 2019-01-15T21:31:28Z

Previously each canvas would create its own texmanager instance when it tried to invoke usetex, but now they all use the same instance (well, for that part you can check that the lru_cache pattern is used elsewhere in the codebase for the same purpose).
The fact that the tests still pass (and the tests test usetex more than once...) shows that there are no adverse effects.
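
A minimal sketch of the `lru_cache` pattern mentioned above, assuming it is applied to `__new__` so that every constructor call returns one shared object. `SharedManager` is a hypothetical toy class, not the actual `TexManager` code.

```python
import functools


class SharedManager:
    """Toy class whose constructor always hands back one shared instance."""

    @functools.lru_cache(maxsize=None)  # keyed on cls, so one instance per class
    def __new__(cls):
        return object.__new__(cls)


first = SharedManager()
second = SharedManager()   # cache hit: same object as `first`
```

With this in place, callers can construct the class wherever they need it instead of storing their own reference.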

timhoffm · 2019-01-16T05:37:43Z

So, the only advantage is saving computation time when two renderers render the same figure (e.g. first to screen and then to PNG)?

If that's the case, I'm -0.5 on this. The gain seems limited and the code logic gets more complex (also, I'm still not 100% sure that reusing the cache cannot lead to trouble). OTOH the code of this change is simple enough, so if others feel it's worth doing, I'm fine with that.

anntzer · 2019-01-16T11:17:10Z

This would also allow not trying to cache TexManager instances at other places; e.g. right now contour handling code caches TexManager instances to figure out the size of tex strings (https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/contour.py#L269); with this PR we could just recreate TexManager on the fly when needed (as we'd always get the same instance).
In other words, the (minor, IMO) additional complexity here is paid back by simplifications elsewhere.

efiring · 2019-01-17T03:17:13Z

Logically, this should be a singleton, as far as I can see. I see only 2 caches. One is the on-disk long-term-persistent cache of pngs etc., which is already global, not instance-specific. The other is _rc_cache, which is just a tiny dictionary. Therefore I don't see how making this a singleton is actually saving much more than the time it takes to make the instance. Am I missing something? I'm inclined to merge it solely on the logical grounds that it should be a singleton; having more than one instance looks pointless. (And I agree that having it as a class is probably also unnecessary, but it looks harmless.)

anntzer · 2019-01-17T11:16:50Z

Oh, actually I missed the fact that rgba_arrayd and grey_arrayd are class-level caches so are shared across instances, so my argument is not really strong :p
I guess the main point is that we should not be bothering with e.g. the get_texmanager() methods on the backends and TextPath then...
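
To illustrate why class-level caches weaken the argument, here is a small sketch of a cache stored on the class rather than on each instance. `CachingManager` and its `cache` attribute are hypothetical names; only the sharing behaviour is the point.

```python
class CachingManager:
    """Toy class with a cache stored on the class, not on each instance."""

    cache = {}  # class attribute: one dict shared by every instance

    def render(self, text):
        if text not in self.cache:
            # Stand-in for an expensive computation.
            self.cache[text] = text.upper()
        return self.cache[text]


a = CachingManager()
b = CachingManager()
a.render("x^2")   # fills the shared dict; `b` sees the entry too
```

Because the dict lives on the class, separate instances already reuse each other's results, so a singleton adds little on that front.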

jklymak · 2019-01-17T21:30:19Z

I'll hold off on merging pending further comment from @efiring

efiring · 2019-01-18T18:58:31Z

Maybe we can discuss this briefly on Monday. I'm wondering whether it would make more sense to just make a single instance in the texmanager module, and use that directly elsewhere. Perhaps the argument against that is the instantiation cost it incurs at startup, regardless of whether the TexManager instance is ever actually needed. In addition, it looks to me like it would fail with a RuntimeError on Google App Engine. It looks like someone started to put in a fix for that, but it wasn't completed. So most likely I am just misunderstanding.

Another question: in this PR, when the cached instance is returned by `__new__`, `__init__` is still called, isn't it? If so, we lose a little performance if we try to eliminate the get_texmanager methods. This would not be the case if we just made a single instance in the texmanager module, and always used that directly.

To avoid the `__init__` overhead, wouldn't one need to move all of the `__init__` code into `__new__`, and make `__init__` a no-op?
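
The behaviour in question can be checked with a small sketch: when `__new__` returns an instance of the class, Python runs `__init__` on it for every constructor call, whether or not the object was cached. `Cached` is a hypothetical toy class.

```python
class Cached:
    _instance = None
    init_calls = 0

    def __new__(cls):
        if cls._instance is None:
            cls._instance = object.__new__(cls)
        return cls._instance

    def __init__(self):
        # Runs on every Cached() call, even when __new__ returned the cached object.
        type(self).init_calls += 1


Cached()
Cached()   # same object, but __init__ runs a second time
```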

anntzer · 2019-01-18T22:04:25Z

> Maybe we can discuss this briefly on Monday. I'm wondering whether it would make more sense to just make a single instance in the texmanager module, and use that directly elsewhere. Perhaps the argument against that is the instantiation cost it incurs at startup, regardless of whether the TexManager instance is ever actually needed. In addition, it looks to me like it would fail with a RuntimeError on Google App Engine. It looks like someone started to put in a fix for that, but it wasn't completed. So most likely I am just misunderstanding.

GAE is basically not supported, see e.g. references in #8939. Will remove "pretense" of supporting GAE in texmanager in a separate PR (after this one is done).

> Another question: in this PR, when the cached instance is returned by `__new__`, `__init__` is still called, isn't it? If so, we lose a little performance if we try to eliminate the get_texmanager methods. This would not be the case if we just made a single instance in the texmanager module, and always used that directly.
> To avoid the `__init__` overhead, wouldn't one need to move all of the `__init__` code into `__new__`, and make `__init__` a no-op?

Yes, my mistake, good catch. Fixed it.

Edit: except of course that TexManager does funky stuff like calling __init__ again from methods... fixed.
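
One way such a fix could look, as a sketch under assumptions rather than the merged code: the setup moves into a helper that `__new__` calls once, no `__init__` is defined, and methods that used to call `__init__` call the helper instead. `Manager`, `_reinit`, `refresh`, and `setup_calls` are hypothetical names here.

```python
import functools


class Manager:
    setup_calls = 0

    @functools.lru_cache(maxsize=None)  # always return the same instance
    def __new__(cls):
        self = object.__new__(cls)
        self._reinit()   # one-time setup happens here, not in __init__
        return self

    def _reinit(self):
        # Hypothetical helper holding what used to be the __init__ body.
        type(self).setup_calls += 1
        self.state = "fresh"

    def refresh(self):
        # Methods that previously called self.__init__() call the helper instead.
        self._reinit()


m1 = Manager()
m2 = Manager()   # cached: no setup work is repeated
```

Since the class defines no `__init__`, repeated construction falls through to `object.__init__`, which does nothing, so a cache hit costs only the lookup.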

efiring

Just a minor change or clarification; otherwise, it looks fine.

lib/matplotlib/texmanager.py

anntzer force-pushed the singleton-texmanager-texttopath branch from 8cb06a8 to 4e98088 Compare January 6, 2019 13:54

anntzer changed the title ~~Globally cache single TexManager and TextToPath instances.~~ Globally cache single TexManager instances. Jan 6, 2019

QuLogic reviewed Jan 6, 2019

lib/matplotlib/textpath.py (outdated, resolved)

tacaswell added this to the v3.1 milestone Jan 6, 2019

anntzer force-pushed the singleton-texmanager-texttopath branch 2 times, most recently from d388dba to 2fa695f Compare January 7, 2019 00:01

timhoffm approved these changes Jan 16, 2019

jklymak approved these changes Jan 17, 2019

anntzer force-pushed the singleton-texmanager-texttopath branch from 2fa695f to e8677b6 Compare January 18, 2019 22:03

anntzer force-pushed the singleton-texmanager-texttopath branch from e8677b6 to 63b94ef Compare January 18, 2019 22:24

efiring requested changes Jan 19, 2019

lib/matplotlib/texmanager.py (outdated, resolved)

Globally cache a single TexManager instance.

5d82059

This allows sharing its caches across renderer instances. (If it was up to me this class would be replaced by module-level functions and a module-level cache, but heh.)

anntzer force-pushed the singleton-texmanager-texttopath branch from 63b94ef to 5d82059 Compare January 19, 2019 01:19

efiring approved these changes Jan 19, 2019

jklymak merged commit 3061c05 into matplotlib:master Jan 19, 2019

anntzer deleted the singleton-texmanager-texttopath branch January 19, 2019 10:23
