font_manager.py takes multiple seconds to import #4756
Comments
It's also worrying that just importing matplotlib.pyplot causes a read of a pickle file from the filesystem. This would make me nervous about using matplotlib in a security-sensitive environment.
That is the font cache, which is there to save even more time by not having to search the system for fonts.
He is right, though... if someone were to swap out that font cache pickle...
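To make the concern concrete, here is a minimal sketch of why loading a pickle is different from parsing plain data. The `Payload` class is hypothetical and the callable it names (`len`) is deliberately harmless; the point is only that the file, not the reader, decides which callable runs during `pickle.loads`.

```python
import json
import pickle


class Payload:
    """Hypothetical object whose pickle names a callable to run on load."""

    def __reduce__(self):
        # pickle records "call len('abc') to rebuild this object".
        # len is harmless here; whoever writes the file picks the callable.
        return (len, ("abc",))


blob = pickle.dumps(Payload())
restored = pickle.loads(blob)  # runs len("abc") instead of building a Payload

# json.loads only ever produces dicts, lists, strings, numbers, booleans, None.
parsed = json.loads('{"fname": "DejaVuSans.ttf", "weight": 400}')

print(type(restored).__name__, restored)  # int 3
print(type(parsed).__name__)              # dict
```

The unpickled value is the result of the call, so anyone who can replace the cache file can choose what gets executed at import time.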
Would it be possible to use a less general serialization format (e.g. JSON) for font caching?
I am also confused by that, I thought loading from a pickle side-stepped normal And yes, we probably should move to a (what I would have called a more general) serialization which stashes a bit more of the stuff that is being parsed out of the AFM files which will be a win on all accounts.
I meant less general in the sense that matplotlib has to provide more information about how to actually serialize the objects in question, rather than relying on pickle to serialize arbitrary python objects.
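As a rough illustration of what "providing more information" could look like, the sketch below spells out the encoding and decoding of one record type by hand. `FontEntry`, its field names, `FontEncoder`, and `font_decoder` are all hypothetical names chosen for this example, not matplotlib's actual classes or API.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class FontEntry:
    """Hypothetical cache record; the field names are assumptions."""
    fname: str
    name: str
    style: str = "normal"
    weight: int = 400


class FontEncoder(json.JSONEncoder):
    def default(self, o):
        # Only FontEntry is given an explicit JSON form; anything else
        # falls through to the base class, which raises TypeError.
        if isinstance(o, FontEntry):
            return {"__class__": "FontEntry", **asdict(o)}
        return super().default(o)


def font_decoder(d):
    # Rebuild only the one type we know about; other dicts pass through.
    if d.get("__class__") == "FontEntry":
        fields = {k: v for k, v in d.items() if k != "__class__"}
        return FontEntry(**fields)
    return d


entries = [FontEntry("/fonts/DejaVuSans.ttf", "DejaVu Sans")]
text = json.dumps(entries, cls=FontEncoder)
restored = json.loads(text, object_hook=font_decoder)
```

The decoder can only ever construct `FontEntry` objects, which is the sense in which this format is less general than pickle.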
If a malicious user has access to the filesystem, then importing anything is a security concern...
You can also do:
There may well be a strong argument for a custom serialisation though - particularly if there is a standard form for font caching that we can make use of.
@pelson I agree with this assessment for users who are running Python on their local machine and administering their own environments. I'd add, however, that one important distinction between writing to In particular, if you're deploying a server on the public internet that uses python, or if you're administering a shared machine with untrusted users, it might be reasonable security practice to enforce that your site-packages are read-only for users who are actually executing code, precisely for the reason you describe.
Will mpl try to write the font cache to system-level site-packages? mpl is deployed to Google App Engine, which runs in a very locked-down environment. I wonder how they deal with this. I think the two things that need to be done here:
Given the other issues we seem to have with the font cache not going smoothly across updates, this should probably be reasonably high priority for the next point release.
No. Caches never go in Python source directories.
They just don't use a cache, but regenerate the font directory each time. This is less of an issue on GAE because very few fonts other than the built-in matplotlib ones are available. See #1824.
The issue there is that the font cache is user environment-specific. We can't really provide a read-only one unless we limit the set of fonts to the ones that ship with matplotlib.
It's a really simple data structure -- JSON would work just fine for this purpose. (The matplotlib font cache predates JSON itself, let alone its inclusion in the Python stdlib, so that wasn't an "easy" option at the time). It will probably still require some versioning, but JSON at least avoids most of the security concerns (outside of exploiting the occasional bugs in FreeType opening malformed font files).
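A minimal sketch of a versioned JSON cache along these lines might look like the following. The names `CACHE_VERSION`, `save_cache`, and `load_cache`, and the `{"version": ..., "fonts": ...}` layout, are assumptions made for illustration and not matplotlib's actual implementation.

```python
import json
import os
import tempfile

CACHE_VERSION = 1  # hypothetical; bump whenever the cache layout changes


def save_cache(path, fonts):
    """Write the font list with a version tag, replacing the file atomically."""
    payload = {"version": CACHE_VERSION, "fonts": fonts}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
    os.replace(tmp, path)


def load_cache(path):
    """Return the cached font list, or None if the cache must be rebuilt."""
    try:
        with open(path, encoding="utf-8") as fh:
            payload = json.load(fh)
    except (OSError, ValueError):
        return None  # missing, unreadable, or not valid JSON
    if not isinstance(payload, dict) or payload.get("version") != CACHE_VERSION:
        return None  # written by a different layout version
    return payload.get("fonts")
```

A caller would treat a `None` result as "rescan the system fonts and call `save_cache` again", so a stale or corrupted file costs one regeneration rather than an error.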
We are now warning about the regeneration and using a JSON cache. I think this can be closed.
This is a brutal amount of latency when trying to provide a smooth user experience from an application that uses matplotlib internally.