Simplify and unify character tracking in pdf and ps backends. #15320
missing-references.json is still large. Probably this needs a rebase?
rebased
@@ -16,11 +16,43 @@ def _cached_get_afm_from_fname(fname):
        return AFM(fh)


class CharacterTracker:
    def __init__(self):
I would love to see a bit more documentation on the class itself and its public methods.
done
Modulo a minor formatting issue in the docstring.
Instead of trying to resolve font paths to absolute files and key off by inode(!), just track fonts using whatever names they use, and simplify used_characters to be a straight mapping of filenames to character ids (making the attribute private, with a backcompat shim, at the same time).
The previous approach would avoid embedding the same file twice if it is given under two different filenames (hardlinks to the same file...), but it would fail if the user passes a relative path, chdir()s to another directory, and passes a different font with the same filename, because of the lru_cache(). Neither scenario seems likely to happen in practice, and in any case we can cover most of it by making the font paths absolute before passing them to FreeType (which is going to open the file anyway, so the cost of making them absolute doesn't matter).
I had a closer look at this implementation. I've tested the code in the scenario from #15629 (linked fonts with different file names, e.g. arial.ttf -> Arial.ttf). There is still one place where the real fonts are used: font_manager.get_font(). In this case, all the characters are missing:
Superseded by #15686.
missing_references.json needs to be regenerated due to the missing reference to the private CharacterTracker class; to avoid changes in order in missing_references.json creating needless diffs, this goes on top of #15321.