[Bug]: Avoid generating font cache on import time #28485
Comments
I stand by my comment from 2016 (#7592 (comment)): we need this font cache to make (almost) any reasonable plot. If we find an existing cache on import, we use it rather than regenerating it. Ensuring that we know where to find fonts has always been the behavior on all platforms; what has changed on macOS is that we now ask the system to tell us what fonts it has (rather than trying to explore the disk) via #27230. Given that this is failing, I think the issue is that nix is being too restrictive about what other programs are available and its packaging of Matplotlib needs to be updated to include access to I am going to close this because I do not think we should take any action: in most cases this is a one-time-ever operation, and delaying the expense only helps in the case where a) you are in a context where the cache does not persist, b) we are imported but not actually used. The case of a non-persistent cache can be handled by pre-generating it in the build process, e.g. for containers. In this case, delaying the generation would mask a show-stopping issue until later, and I would rather it be caught on import. |
This may be a common case for test pipelines, where an environment is built from scratch. Typically, other packages will import us globally, but more often than not you don't test plotting functionality. Just a guess - would need confirmation before considering any action. |
I am a Nixpkgs maintainer, actually. Indeed, as mentioned in the previous post, it is getting increasingly common to run tests and analysis in such pipelines. Now, whenever a package imports matplotlib eagerly, it breaks. This is the main issue here, and it could easily be avoided by letting matplotlib listen to some variable. |
I do not see how eagerly or not matters here; from this report I would expect any import of Matplotlib (intentional or otherwise) to fail under the nix packaging. It may be that the case being described is one where the import is extraneous, but this still looks fundamentally like a nix packaging issue.
That may be true, but we have no way to tell, when we are being imported, whether we are actually going to be used in the process, and I do not think we should make any compromises that impact users who are actually using us. Maybe we should go all-in on lazy imports (which will spread out the import times), but if we do that it should be library-wide, not just for fonts. For fontconfig fonts we use a try-except that fails gracefully if the subprocess fails; we probably should be doing the same with matplotlib/lib/matplotlib/font_manager.py Lines 252 to 271 in f5067c1
|
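For readers unfamiliar with the pattern mentioned above (a try-except that fails gracefully when a subprocess cannot be run), here is a minimal sketch. It is not Matplotlib's code: `list_command_output` is a hypothetical helper and the command names are placeholders, chosen only to show an unavailable program turning into an empty result instead of an import-time exception.

```python
import subprocess
import sys


def list_command_output(command):
    """Return the command's output lines, or [] if it cannot be run.

    Hypothetical helper for illustration; not part of Matplotlib.
    """
    try:
        out = subprocess.check_output(command, stderr=subprocess.DEVNULL)
    except (OSError, subprocess.CalledProcessError):
        # OSError covers a missing or non-executable program (e.g. in a
        # sandbox); CalledProcessError covers a non-zero exit status.
        return []
    return out.decode(errors="replace").splitlines()


# A program that is not available yields an empty list, not an exception.
missing = list_command_output(["no-such-font-tool-example"])

# A program that runs normally has its output returned line by line.
listed = list_command_output([sys.executable, "-c", "print('a.ttf')"])
```

The trade-off raised in this thread still applies: swallowing the failure keeps the import working, but it can also hide a packaging problem until fonts are actually needed.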
As an alternative to lazy imports, we could also load the fontManager / font cache lazily. In a quick test, assigning matplotlib/lib/matplotlib/font_manager.py Lines 1585 to 1587 in f5067c1
This means we currently load the fontManager at import time but don't use it. We could thus refactor FontManager to do the work only on first use instead of on creation. |
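To make the "do the work only on first use" idea concrete, here is a small sketch of lazy initialization. `LazyFontIndex` and its `scan` callable are hypothetical names for illustration; this is not how `FontManager` is implemented, only one common way such a refactor could look.

```python
import functools


class LazyFontIndex:
    """Hypothetical stand-in for a font manager; not Matplotlib's FontManager."""

    def __init__(self, scan):
        # Creating the object is cheap: the expensive scan is only stored.
        self._scan = scan

    @functools.cached_property
    def fonts(self):
        # Runs on first access only; the result is cached on the instance.
        return self._scan()

    def find(self, name):
        # The first lookup triggers the scan; later lookups reuse the cache.
        return self.fonts.get(name)


scan_calls = []


def fake_scan():
    """Placeholder for an expensive font discovery step."""
    scan_calls.append("scan")
    return {"Example Sans": "/fonts/ExampleSans.ttf"}


index = LazyFontIndex(fake_scan)   # nothing scanned yet
path = index.find("Example Sans")  # scan happens here
index.find("Missing Font")         # no second scan
```

With this shape, a process that imports the module but never looks up a font never pays for the scan, which is the case discussed in this thread.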
Our test harness broke on Mac in the last week because of all these |
#28498 was opened (possibly without knowing about this issue?) and will keep Matplotlib from failing on import.
We finally got mpl 3.9.0 built on conda-forge yesterday (https://anaconda.org/conda-forge/matplotlib-base/files). I am a bit more concerned about this happening with conda packaging. @lindsayad Do you modify PATH as part of your CI (per #28498 (comment))?
I remain skeptical of the cost/benefit ratio of being lazy about building the font manager. The only thing we are explicitly lazy about now is importing GUI toolkits (which we do because that can be a mutually exclusive and irreversible action). Once #28498 is merged, this only slightly shifts around when the cost of generating the cache is paid: it is a cost you pay only the first time you import mpl on a given system (assuming configdir is writable), and it is possible to pre-generate the cache (e.g. in containers). That said, I won't block a PR to do this. |
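Pre-generating the cache in a build step could look roughly like the sketch below. It assumes, as described in this thread, that importing `matplotlib.font_manager` builds the cache, and it uses the `MPLCONFIGDIR` environment variable to choose where Matplotlib writes its configuration and cache files. `pregenerate_font_cache` and its injectable `run` parameter are hypothetical names added for illustration and testing.

```python
import os
import subprocess
import sys


def pregenerate_font_cache(config_dir, python=sys.executable, run=subprocess.run):
    """Import matplotlib.font_manager in a child process to build the cache.

    Hypothetical helper; `run` is injectable so the call can be inspected
    without Matplotlib being installed.
    """
    # Point Matplotlib at a directory that persists into the final image.
    env = dict(os.environ, MPLCONFIGDIR=str(config_dir))
    # Assumption from the thread: this import triggers cache generation.
    cmd = [python, "-c", "import matplotlib.font_manager"]
    return run(cmd, env=env, check=False).returncode
```

In a container build, the equivalent would be a single build step that runs the same import with `MPLCONFIGDIR` set to a directory kept in the image, so that the service process later finds an existing cache.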
We don't. However, all of our mac CI runs are executed within the mac sandbox to comply with some zero-trust requirements. This includes not bind-mounting just about everything in With this, I think #28498 is a sufficient solution for our problem. We may end up making |
The main effect of the issue reported here has been resolved with #28498 (by another Nixpkgs maintainer btw). Thanks for the discussion here. |
I would also like the ability to turn off this cache generation, ideally through an env var. My situation is that I'm running an API service with some ML models which use Yes, I can pre-populate the cache at service build time. But it would be nice if everyone in similar scenarios didn't have to deal with this prepopulation step. The env var could have a big warning "USE AT OWN RISK, THIS WILL BREAK MATPLOTLIB" or something. |
@dvgica Can you comment on #28488 (comment) ? |
Bug summary
Matplotlib generates a font cache as a side effect at import time. This is needed for plotting functionality to work. Unfortunately, it also means that any (transitive) dependency that eagerly imports matplotlib will now cause the cache to be generated, even when the user never uses matplotlib.
I opened an issue about this 8 years ago (#7592). At the time, the issue concerned Linux. Now it seems the behaviour has changed on macOS, and it happens there as well.
Code for reproduction
Actual outcome
Expected outcome
Preferably no side-effect during import-time. Or, as a work-around, some option such as an environment variable to disable the side-effect.
Additional information
Obviously, if packages did not import matplotlib eagerly, this would not be an issue. However, as an end user I cannot control this. Anywhere in the dependency tree, a package could import a library that is not strictly required. But that does not mean libraries should perform side effects at import time. Having an environment variable to disable the cache building would already be great, as it would provide a way out other than trying to patch a (transitive) dependency.
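As an illustration of the requested opt-out, a guard of the following shape would be enough. The variable name `EXAMPLE_SKIP_FONT_CACHE` and the function `should_build_cache` are hypothetical and exist only for this sketch; Matplotlib does not provide such a setting as far as this issue describes.

```python
import os


def should_build_cache(environ=None):
    """Return False when the (hypothetical) opt-out variable is set.

    EXAMPLE_SKIP_FONT_CACHE is an invented name for illustration only.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("EXAMPLE_SKIP_FONT_CACHE", "")
    # Treat a few common "truthy" spellings as a request to skip.
    return value.strip().lower() not in {"1", "true", "yes"}
```

A library would consult such a guard before doing the import-time work, leaving the default behaviour unchanged for everyone who does not set the variable.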
Operating system
OSX
Matplotlib Version
3.9.0
Matplotlib Backend
No response
Python version
3.12
Jupyter version
No response
Installation
None