FIX: move the font lock higher up the call and class tree #26302

tacaswell · 2023-07-13T16:55:57Z

PR summary

We have for a long time (~2012) had an RLock on RendererAgg that was used in FigureCanvasAgg.draw to protect the shared cache of ft2font objects (the underlying C++ is very stateful and not thread safe). This lock was also implicitly protecting the mathtext cache when using the Agg backend.

However, given the recent improvements to the layout code, there are now ways to call Figure.draw(renderer) without acquiring this lock, which leads to exceptions when using mathtext and a layout manager to save independent figures on different threads.

This bug also exists for the other backends, which use both the mathtext parser and the ft2font cache when rendering text as paths (in the vector backends).

The fix is to:

- pull the lock up to RendererBase so all renderer instances share a single lock
- acquire the lock in Figure.draw, which is always the top entry point to rendering a Figure
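
The two steps above can be sketched with stand-in classes. This is a minimal illustration only, not matplotlib's code: `ToyRendererBase`, `ToyRendererAgg`, `ToyRendererSVG`, `ToyFigure`, and `_draw_children` are hypothetical names. It shows that a lock defined once on a base class is the same object for every subclass instance, and that an RLock can be re-acquired by nested code on the same thread.

```python
import threading


class ToyRendererBase:
    # Hypothetical stand-in for a renderer base class. The lock is a class
    # attribute, so every subclass and instance resolves `lock` to one object.
    lock = threading.RLock()


class ToyRendererAgg(ToyRendererBase):
    pass


class ToyRendererSVG(ToyRendererBase):
    pass


class ToyFigure:
    def draw(self, renderer):
        # Top-level entry point: hold the shared lock for the whole draw so
        # that shared, non-thread-safe caches are used by one thread at a time.
        with renderer.lock:
            return self._draw_children(renderer)

    def _draw_children(self, renderer):
        # Nested acquisition on the same thread is fine because the lock is
        # re-entrant (RLock), so this does not deadlock.
        with renderer.lock:
            return type(renderer).__name__


shared = ToyRendererAgg().lock is ToyRendererSVG().lock
drawn = ToyFigure().draw(ToyRendererSVG())
```

Because the lock lives on the base class, a subclass only opts out if it deliberately shadows the attribute.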

Closes #26289

Not sure how to test this. I guess we could add a test like:

from matplotlib.figure import Figure

import io
import threading


def test_crash2():
    for i in range(100):
        fig = Figure(tight_layout=True)
        # fig = Figure()

        ax = fig.add_subplot(1, 1, 1)

        ax.text(0, 0, r"test $\pm$ 1.2")  # This crashes
        # ax.text(0, 0, 'test 1.2') # This does not

        buf = io.BytesIO()
        fig.savefig(buf, format="svg")


if True:
    threads = [threading.Thread(target=test_crash2) for _ in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

but it seems a bit of a dice roll: you need enough iterations to feel confident you never lost the race, but few enough that the test does not take too long.
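
One further wrinkle for any such test: an exception raised inside a `threading.Thread` target is not re-raised in the thread that calls `join()`, so a test runner would not see the crash unless the workers record it. A minimal sketch of one way to collect such failures follows; `run_in_threads` and `always_fails` are hypothetical helpers written for this illustration and are not part of the PR.

```python
import threading


def run_in_threads(target, n_threads=10):
    """Run ``target`` on several threads and return the exceptions raised."""
    errors = []

    def worker():
        try:
            target()
        except Exception as exc:
            # Record the failure; otherwise it stays inside the worker thread.
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


def always_fails():
    # Stand-in for a rendering call that loses the race.
    raise RuntimeError("simulated rendering failure")


failures = run_in_threads(always_fails, n_threads=4)
successes = run_in_threads(lambda: None, n_threads=4)
```

A test could then assert that the returned list is empty. This does not make the race any more deterministic; it only ensures a lost race is reported.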

If this is an acceptable change, I think we should also add a section to the docs clarifying what level of threading we think we support and what we do not (one thread per figure should work in my opinion).

PR checklist

"closes #0000" is in the body of the PR description to link the related issue
new and changed code is tested
[n/a] Plotting related features are demonstrated in an example
[n/a] New Features and API Changes are noted with a directive and release note
Documentation complies with general and docstring guidelines

anntzer · 2023-07-13T21:40:44Z

lib/matplotlib/figure.py

-        # draw the figure bounding box, perhaps none for white figure
-        if not self.get_visible():
-            return
+        with getattr(renderer, "lock", nullcontext()):


As argued elsewhere, I think we should either just document that third-party renderers must inherit RendererBase (and thus skip the getattr here), or at least add a check to warn that renderers with no lock attribute are deprecated. (... If I understand the point of the getattr correctly.)

That is the point.

I'm thinking about the best place to put that check.
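
For reference, the `getattr` fallback from the diff combined with the suggested warning might look roughly like the sketch below. Everything here is an assumption made for illustration: `rendering_lock`, `LockedRenderer`, and `LegacyRenderer` are hypothetical names, and the warning text is invented.

```python
import threading
import warnings
from contextlib import nullcontext


class LockedRenderer:
    # Hypothetical renderer that carries a lock.
    lock = threading.RLock()


class LegacyRenderer:
    """Hypothetical third-party renderer that defines no lock."""


def rendering_lock(renderer):
    """Return the renderer's lock, or a no-op context manager with a warning."""
    lock = getattr(renderer, "lock", None)
    if lock is None:
        warnings.warn(
            f"{type(renderer).__name__} has no 'lock' attribute; "
            "drawing proceeds without locking",
            DeprecationWarning, stacklevel=2)
        return nullcontext()
    return lock


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    with rendering_lock(LockedRenderer()):
        pass  # drawing would happen here, under the lock
    with rendering_lock(LegacyRenderer()):
        pass  # drawing would happen here, unprotected
warned = [str(w.message) for w in caught]
```

Only the renderer without a `lock` attribute triggers the warning; the other is used as the context manager directly.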

I mean, the other option is we stick the lock on Figure or on some module.

The other analogy I know of for a global-ish lock like this is the global lock in h5py to protect calling into libhdf5: https://github.com/h5py/h5py/blob/6b5af4c6495bf865fee5f036122187c21fcb17d4/h5py/_objects.pyx#L40-L46 .

Expecting the instance to carry the lock like this is "nice" in that things seem well encapsulated, but it leaves us open to some backend opting out (or using a different lock) and bringing back a super subtle version of this bug ("this only happens when I save a mix of svg and png in a multi-threaded environment....").

I've talked myself into making this a private class-level attribute on Figure.

We can make it public later if we need to.

Sure, on the figure seems fine, too.

tacaswell · 2023-07-14T02:38:11Z

I also added a test for one of the changed lines that was not previously covered.

There are obviously other ways to get at the caches (I suspect if you ask a Text object how big it is outside of draw / draw_without_rendering you can still get yourself in trouble), but that was already the case and this at least gets the "standard" paths covered.

tacaswell · 2023-07-14T16:15:04Z

All of the failures are codecov uploads failing.

greglucas · 2023-07-14T22:38:32Z

lib/matplotlib/figure.py

+            if not self.get_visible():
+                return


Leave this outside the lock to save acquiring the lock and releasing right away?
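
A minimal sketch of that ordering, combined with the private class-level lock on the figure discussed earlier in the review. The names `ToyFigure`, `_render_lock`, and `draw_count` are hypothetical; the thread does not say what the attribute is actually called.

```python
import threading


class ToyFigure:
    # Private class-level lock (name assumed): one object shared by every
    # figure instance, independent of which renderer is used.
    _render_lock = threading.RLock()

    def __init__(self, visible=True):
        self._visible = visible
        self.draw_count = 0

    def get_visible(self):
        return self._visible

    def draw(self, renderer=None):
        # Cheap check first: an invisible figure returns without ever
        # acquiring the lock.
        if not self.get_visible():
            return
        with self._render_lock:
            self.draw_count += 1  # stand-in for the real drawing work


hidden, shown = ToyFigure(visible=False), ToyFigure()
hidden.draw()
shown.draw()
```

The visibility check reads only per-figure state, so in this sketch it needs no protection from the shared lock.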

lib/matplotlib/figure.py

We have for a long time (~2012) had an RLock on `RendererAgg` that was used in `FigureCanvasAgg.draw` to protect the shared cache of ft2font objects (the underlying C++ is very stateful and not thread safe). This lock was also implicitly protecting the mathtext cache when using the Agg backend.

However, given the recent improvements to the layout code, there are now ways to call `Figure.draw(renderer)` without acquiring this lock, which leads to exceptions when using mathtext and a layout manager to save independent figures on different threads.

This bug also exists for the other backends, which use both the mathtext parser and the ft2font cache when rendering text as paths (in the vector backends).

The fix is to:

- pull the lock up to `Figure` so all renderer instances effectively share a single lock
- acquire the lock in `Figure.draw`, which is always the top entry point to rendering a Figure

Closes matplotlib#26289

Co-authored-by: Greg Lucas <[email protected]>

tacaswell added this to the v3.8.0 milestone Jul 13, 2023

anntzer reviewed Jul 13, 2023

tacaswell force-pushed the fix/tightlayout_threading branch from b7ec8e9 to 2c1ecd8 Compare July 14, 2023 02:35

greglucas approved these changes Jul 14, 2023

tacaswell and others added 2 commits July 14, 2023 19:20

TST: add coverage for case when the figure is not visible

339dcb1

tacaswell force-pushed the fix/tightlayout_threading branch from 4dbff5e to 339dcb1 Compare July 14, 2023 23:20

ksunden approved these changes Jul 15, 2023

ksunden merged commit 7499015 into matplotlib:main Jul 15, 2023
