Font 42 kerning #20615

Merged
merged 2 commits into from
Jul 14, 2021

45 changes: 30 additions & 15 deletions lib/matplotlib/backends/backend_pdf.py
@@ -2236,6 +2236,20 @@ def encode_string(self, s, fonttype):
             return s.encode('cp1252', 'replace')
         return s.encode('utf-16be', 'replace')

+    @staticmethod
+    def _font_supports_char(fonttype, char):
+        """
+        Returns True if the font is able to provide the char in a PDF
+
+        For a Type 3 font, this method returns True only for single-byte
+        chars. For Type 42 fonts this method always returns True.
Member

Should this be explicit in the code and should we raise for other fonttypes?

Contributor

We already have rcParam checks that disallow other fonttypes.

Member

Yes, I've learned to be cautious. If somebody implements a new fonttype, they may overlook this function. So, forcing future font types to make an explicit decision here is a plus (unless you say that the default should generally be True and the Type-3 handling is a rare exception).

Member Author

At the moment, the function is only called in case of a Type 3 or 42, but I agree it's probably safer to raise a NotImplementedError (?) if the font type is neither.

Member

You could imagine an object-oriented solution where the different types of fonts are different classes, and this would be a method on the font. But that would likely be unnecessary overengineering. Perhaps the NotImplementedError is sufficient defence against future accidents.

+        """
+        if fonttype == 3:
+            return ord(char) <= 255
+        if fonttype == 42:
+            return True
+        raise NotImplementedError()
+
     def draw_text(self, gc, x, y, s, prop, angle, ismath=False, mtext=None):
         # docstring inherited
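
The check added in this hunk, together with the `NotImplementedError` fallback agreed on in the review thread, can be tried outside the backend. The following is only a minimal standalone sketch: the module-level function name `font_supports_char` and the error message are assumptions made for illustration, not part of the PR.

```python
def font_supports_char(fonttype, char):
    """Standalone copy of the check from the diff (hypothetical name)."""
    if fonttype == 3:
        # Type 3 fonts: only single-byte character codes are reachable.
        return ord(char) <= 255
    if fonttype == 42:
        # Type 42 fonts: every character is treated as reachable.
        return True
    # Any other font type has to make an explicit decision here.
    raise NotImplementedError(f"unsupported PDF font type: {fonttype!r}")


if __name__ == "__main__":
    for fonttype in (3, 42):
        print(fonttype, [font_supports_char(fonttype, c) for c in "A\u20ac"])
```

Raising for unknown font types means a future font type cannot silently inherit either behaviour, which is the concern raised in the review.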

@@ -2270,26 +2284,27 @@ def draw_text(self, gc, x, y, s, prop, angle, ismath=False, mtext=None):
             }
             self.file._annotations[-1][1].append(link_annotation)

-        # If fonttype != 3 emit the whole string at once without manual
-        # kerning.
-        if fonttype != 3:
+        # If fonttype is neither 3 nor 42, emit the whole string at once
+        # without manual kerning.
+        if fonttype not in [3, 42]:
             self.file.output(Op.begin_text,
                              self.file.fontName(prop), fontsize, Op.selectfont)
             self._setup_textpos(x, y, angle)
             self.file.output(self.encode_string(s, fonttype),
                              Op.show, Op.end_text)

-        # There is no way to access multibyte characters of Type 3 fonts, as
-        # they cannot have a CIDMap. Therefore, in this case we break the
-        # string into chunks, where each chunk contains either a string of
-        # consecutive 1-byte characters or a single multibyte character.
-        # A sequence of 1-byte characters is broken into multiple chunks to
-        # adjust the kerning between adjacent chunks. Each chunk is emitted
-        # with a separate command: 1-byte characters use the regular text show
-        # command (TJ) with appropriate kerning between chunks, whereas
-        # multibyte characters use the XObject command (Do). (If using Type
-        # 42 fonts, all of this complication is avoided, but of course,
-        # subsetting those fonts is complex/hard to implement.)
+        # A sequence of characters is broken into multiple chunks. The chunking
+        # serves two purposes:
+        #   - For Type 3 fonts, there is no way to access multibyte characters,
+        #     as they cannot have a CIDMap. Therefore, in this case we break
+        #     the string into chunks, where each chunk contains either a string
+        #     of consecutive 1-byte characters or a single multibyte character.
+        #   - A sequence of 1-byte characters is split into chunks to allow for
+        #     kerning adjustments between consecutive chunks.
+        #
+        # Each chunk is emitted with a separate command: 1-byte characters use
+        # the regular text show command (TJ) with appropriate kerning between
+        # chunks, whereas multibyte characters use the XObject command (Do).
         else:
             # List of (start_x, [prev_kern, char, char, ...]), w/o zero kerns.
             singlebyte_chunks = []
@@ -2298,7 +2313,7 @@ def draw_text(self, gc, x, y, s, prop, angle, ismath=False, mtext=None):
             prev_was_multibyte = True
             for item in _text_helpers.layout(
                     s, font, kern_mode=KERNING_UNFITTED):
-                if ord(item.char) <= 255:
+                if self._font_supports_char(fonttype, item.char):
                     if prev_was_multibyte:
                         singlebyte_chunks.append((item.x, []))
                     if item.prev_kern:
Binary file not shown.
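
The chunking described in the `draw_text` comment can be illustrated with a small self-contained sketch. Everything here is a simplified stand-in rather than the backend's code: `LayoutItem`, `supports_char` and `split_into_chunks` are hypothetical names, and the part of the loop that the diff does not show (appending the character and recording multibyte glyphs) is an assumption about how the chunks are completed.

```python
from collections import namedtuple

# Hypothetical stand-in for the items yielded by the layout helper.
LayoutItem = namedtuple("LayoutItem", "char x prev_kern")


def supports_char(fonttype, char):
    """Same decision as the check added in the diff."""
    if fonttype == 3:
        return ord(char) <= 255
    if fonttype == 42:
        return True
    raise NotImplementedError(fonttype)


def split_into_chunks(items, fonttype):
    """Split laid-out characters into kerned text chunks and lone glyphs."""
    # List of (start_x, [prev_kern, char, char, ...]), without zero kerns.
    singlebyte_chunks = []
    # List of (x, char) for characters the font cannot address directly.
    multibyte_glyphs = []
    prev_was_multibyte = True
    for item in items:
        if supports_char(fonttype, item.char):
            if prev_was_multibyte:
                singlebyte_chunks.append((item.x, []))
            if item.prev_kern:
                singlebyte_chunks[-1][1].append(item.prev_kern)
            # Assumed continuation: the diff is cut off before this point.
            singlebyte_chunks[-1][1].append(item.char)
            prev_was_multibyte = False
        else:
            multibyte_glyphs.append((item.x, item.char))
            prev_was_multibyte = True
    return singlebyte_chunks, multibyte_glyphs


if __name__ == "__main__":
    items = [
        LayoutItem("A", 0.0, 0.0),
        LayoutItem("V", 5.0, -1.5),
        LayoutItem("\u20ac", 11.0, 0.0),
        LayoutItem("B", 20.0, 0.0),
    ]
    print(split_into_chunks(items, 3))
    print(split_into_chunks(items, 42))
```

With font type 3 the euro sign interrupts the text chunk and is handled separately. With font type 42 the whole string stays in one chunk while still carrying the kern between "A" and "V", which is the behaviour this PR adds.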
7 changes: 7 additions & 0 deletions lib/matplotlib/tests/test_text.py
@@ -741,3 +741,10 @@ def test_parse_math():
     ax.text(0, 0, r"$ \wrong{math} $", parse_math=True)
     with pytest.raises(ValueError, match='Unknown symbol'):
         fig.canvas.draw()
+
+
+@image_comparison(['text_pdf_font42_kerning.pdf'], style='mpl20')
+def test_pdf_font42_kerning():
+    plt.rcParams['pdf.fonttype'] = 42
+    plt.figure()
+    plt.figtext(0.1, 0.5, "ATAVATAVATAVATAVATA", size=30)
Member

Is this the only kerning you were fixing? If not, maybe some of the other kerning pairs would be useful here rather than repeating the same one?

Member Author

You mean pairings of glyphs? I'd hope the font defines kerning correction for more pairs. I chose this string because there the effect is obvious even without a pixel-by-pixel comparison. For this specific issue, the test targets the presence of any kerning. However, I will modify the string in the next iteration.

Or, are you more worried about single-byte vs multi-byte vs beyond-BMP character pairings?

Member

Sure, if you are sure this tests for any kerning and we don't need to test other kerning pairs, that's fine. (I have no idea what the different pairings are technically ;-))

Member

The kerns come from _text_helpers.layout which basically just calls font.get_kerning(prev_glyph_idx, glyph_idx, kern_mode) repeatedly. I'm convinced by this test - if the AV spacing was bad previously and good after this change, it means that kerns are now getting inserted where they weren't previously. Multibyte characters could plausibly still be missing kerns, but if so, I think it's fine to call it a separate issue.
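
The idea in the last comment, that a layout helper asks for a kern once per adjacent glyph pair, can be sketched without matplotlib. This is a toy model only: the `layout` function below, its `advances` and `kern_pairs` dictionaries, and the numbers in the usage example are all hypothetical and stand in for the real font queries.

```python
from collections import namedtuple

# Hypothetical item type: character, pen position, kern applied before it.
LayoutItem = namedtuple("LayoutItem", "char x prev_kern")


def layout(text, advances, kern_pairs):
    """Yield one LayoutItem per character, applying pairwise kerning."""
    x = 0.0
    prev = None
    for char in text:
        # One kerning lookup per adjacent pair; missing pairs mean no kern.
        kern = kern_pairs.get((prev, char), 0.0) if prev is not None else 0.0
        x += kern
        yield LayoutItem(char, x, kern)
        x += advances[char]
        prev = char


if __name__ == "__main__":
    advances = {"A": 10.0, "V": 10.0}
    kern_pairs = {("A", "V"): -2.0, ("V", "A"): -2.0}
    for item in layout("AVA", advances, kern_pairs):
        print(item)
```

If the kerns were dropped, as happened for Type 42 output before this change, each glyph would simply sit at a multiple of its advance width. The test string in the PR makes that difference visible.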