
QuLogic
Member

@QuLogic QuLogic commented Jul 19, 2025

PR summary

With libraqm, string layout produces glyph indices, not character codes, and font features may even produce different glyphs for the same character code (e.g., by picking a different Stylistic Set). Thus we cannot rely on character codes as unique items within a font, and must move toward glyph indices everywhere.

The only thing I don't quite like is that PDF uses character codes for its lookup, and I have to map glyph indices back through an inverse charmap. I think I may have to send everything through CharacterTracker and produce my own limited charmap, but still need to test out what's required. Better stuff for this is done in #30512.
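
A minimal sketch of the inverse-charmap problem described above, using a plain dict in place of a real font charmap. The character codes, glyph indices, and the `invert_charmap` helper are all made up for illustration and are not Matplotlib API:

```python
# Hypothetical charmap: character code -> glyph index (values are made up).
charmap = {
    0x41: 36,    # 'A'
    0x391: 36,   # Greek capital alpha, assumed to share the glyph used for 'A'
    0x61: 68,    # 'a'
}

# Glyph 500 stands in for a stylistic-set alternate of 'a': it is reachable
# only through a font feature, so no character code maps to it.
laid_out_glyphs = [36, 68, 500]


def invert_charmap(charmap):
    """Map each glyph index to the sorted character codes that reach it."""
    inverse = {}
    for code, glyph in charmap.items():
        inverse.setdefault(glyph, []).append(code)
    return {glyph: sorted(codes) for glyph, codes in inverse.items()}


inverse = invert_charmap(charmap)

# Glyphs with no entry cannot be recovered from character codes alone.
unmapped = [g for g in laid_out_glyphs if g not in inverse]
```

In this sketch the inverse is neither unique (glyph 36 has two codes) nor total (glyph 500 has none), which is why mapping back through the charmap is awkward.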

This is based on #30143.

PR checklist

@QuLogic
Member Author

QuLogic commented Sep 4, 2025

I've decided to restore the character code in the return values from mathtext, because I've found some use for it in PDF output.

@@ -2274,7 +2268,7 @@ def draw_tex(self, gc, x, y, s, prop, angle, *, mtext=None):
 seq += [['font', pdfname, dvifont.size]]
 oldfont = dvifont
 seq += [['text', x1, y1, [bytes([glyph])], x1+width]]
-self.file._character_tracker.track(dvifont, chr(glyph))
+self.file._character_tracker.track_glyph(dvifont, glyph)
Contributor

@anntzer anntzer Sep 5, 2025


I think you need to use text.index here? (with for text in page.text: x1, y1, dvifont, glyph, width = text; ...) (#29868)
I would even stop unpacking and just use text.x, text.y, etc.
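
A rough sketch of the attribute-access style suggested above. The `Text` class here is a hypothetical stand-in for the per-glyph records on a DVI page: its field names follow the unpacking shown in the comment, and `index` is assumed to hold the glyph index. The real class may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Text:
    """Hypothetical stand-in for one positioned glyph on a DVI page."""
    x: int
    y: int
    font: str
    glyph: int   # character code as stored in the DVI file
    width: int
    index: int   # assumed: glyph index within the font


page_text = [
    Text(x=0, y=0, font="cmr10", glyph=65, width=7, index=36),
    Text(x=7, y=0, font="cmr10", glyph=66, width=7, index=37),
]

tracked = []
for text in page_text:
    # Attribute access avoids positional unpacking, so adding a field to
    # the record later does not break this loop.
    tracked.append((text.font, text.index))
```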

Member Author


I think you might mean #29829 here?

Member Author

@QuLogic QuLogic Sep 13, 2025


Hmm, it looks like switching to text.index would require a bit more work, as the T1 font subsetter is working with characters too. I guess dbd689f would be the best place for that.

Contributor


Indeed, you are correct. However, this makes things a bit tricky to follow: track_glyph effectively takes a glyph index as its second argument for a non-DVI font, but a charcode for a DVI font, or more specifically a Type 1 font (because the Type 1 subsetter works with characters, as you mention). Is that correct? I guess that's OK as a temporary state because, as you mention, dbd689f will resolve the discrepancy, but it probably warrants a comment (which can later be dropped in dbd689f) to avoid puzzling the reader.
(Also, keeping this discrepancy would be problematic in the long term, as lua/xelatex support will mean that this loop will also sometimes emit glyphs from TTF fonts, but I believe this will again be made clearer by dbd689f.)
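
To illustrate the kind of comment being asked for, here is a simplified sketch. `SketchTracker` is a hypothetical class rather than Matplotlib's actual tracker, and the font names and values are made up:

```python
from collections import defaultdict


class SketchTracker:
    """Hypothetical, simplified tracker of the glyphs used per font."""

    def __init__(self):
        self.used = defaultdict(set)

    def track_glyph(self, font_name, glyph):
        # NOTE (temporary state): for Type 1 fonts coming from DVI, `glyph`
        # is a character code, because the Type 1 subsetter works with
        # characters. For all other fonts it is a glyph index.
        self.used[font_name].add(glyph)


tracker = SketchTracker()
tracker.track_glyph("cmr10.pfb", 65)        # charcode (Type 1 font)
tracker.track_glyph("DejaVuSans.ttf", 36)   # glyph index (TTF font)
```

The two calls look identical but carry different kinds of values, which is the discrepancy the comment would flag for the reader.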

@QuLogic QuLogic force-pushed the vector-glyphs branch 2 times, most recently from 41a5b7d to df7fa98 Compare September 13, 2025 10:53