From 8a01d906a4a0d12cc209982222564e395ec87ccb Mon Sep 17 00:00:00 2001
From: Aitik Gupta
Date: Mon, 19 Jul 2021 13:40:54 +0530
Subject: [PATCH 1/5] Add high-level documentation

---
 doc/users/explain/fonts.rst | 55 +++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/doc/users/explain/fonts.rst b/doc/users/explain/fonts.rst
index 8a69dca7feeb..53ae6b4f4bc2 100644
--- a/doc/users/explain/fonts.rst
+++ b/doc/users/explain/fonts.rst
@@ -130,3 +130,58 @@ This is especially helpful to generate *really lightweight* documents.::
    free versions of the proprietary fonts.

    This also violates the *what-you-see-is-what-you-get* feature of Matplotlib.
+
+Are we reinventing the wheel?
+-----------------------------
+Internally, a feasible response to the question of 'reinventing the
+wheel would be, well, Yes *and No*. The font-matching algorithm used
+by Matplotlib has been *inspired* by web browsers, more specifically,
+`CSS Specifications `_!
+
+Currently, the simplest way (and the only way) to tell Matplotlib what fonts
+you want it to use for your document is via the **font.family** rcParam,
+see :doc:`Customizing text properties `.
+
+This is similar to how one tells a browser to use multiple font families
+(specified in their order of preference) for their HTML webpages. By using
+**font-family** in their stylesheet, users can essentially trigger a very
+useful feature provided by browers, known as Font-Fallback. For example, the
+following snippet in an HTMl markup would:
+
+.. code-block:: html
+
+
+
+
+
+   some text
+
+
+
+For every character/glyph in *"some text"*, the browser will iterate through
+the whole list of font-families, and check whether that character/glyph is
+available in that font-family. As soon as a font is found which has the
+required glyph(s), the browser moves on to the next character.
+
+How does Matplotlib achieve this?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Well, Matplotlib doesn't achieve this, *yet*. It was initially only designed to
+use a **single font** throughout the document, i.e., no matter how many
+families you pass to **font.family** rcParam, Matplotlib would use the very
+first font it's able to find on your system, and try to render all your
+characters/glyphs from that *and only that* font.
+
+.. note::
+   This is, because the internal font matching was written/adapted
+   from a very old `CSS1 spec `_,
+   **written in 1998**!
+
+   However, allowing multiple fonts for a single document (also enabling
+   Font-Fallback) is one of the goals for 2021's Google Summer of Code project.
+
+   `Read more on Matplotblog `_!
+

From 6f0d7f5356b74d6f8154c8bdaa6934596673d533 Mon Sep 17 00:00:00 2001
From: Aitik Gupta
Date: Thu, 22 Jul 2021 22:12:22 +0530
Subject: [PATCH 2/5] Reword sentences to be more formal

---
 doc/users/explain/fonts.rst | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/doc/users/explain/fonts.rst b/doc/users/explain/fonts.rst
index 53ae6b4f4bc2..0a434ac6e23d 100644
--- a/doc/users/explain/fonts.rst
+++ b/doc/users/explain/fonts.rst
@@ -136,7 +136,7 @@ Are we reinventing the wheel?
 Internally, a feasible response to the question of 'reinventing the
 wheel would be, well, Yes *and No*. The font-matching algorithm used
 by Matplotlib has been *inspired* by web browsers, more specifically,
-`CSS Specifications `_!
+`CSS Specifications `_.

 Currently, the simplest way (and the only way) to tell Matplotlib what fonts
 you want it to use for your document is via the **font.family** rcParam,
@@ -146,7 +146,7 @@ This is similar to how one tells a browser to use multiple font families
 (specified in their order of preference) for their HTML webpages. By using
 **font-family** in their stylesheet, users can essentially trigger a very
 useful feature provided by browers, known as Font-Fallback. For example, the
-following snippet in an HTMl markup would:
+following snippet in an HTML markup would:

 .. code-block:: html

@@ -165,14 +165,15 @@ following snippet in an HTMl markup would:
 For every character/glyph in *"some text"*, the browser will iterate through
 the whole list of font-families, and check whether that character/glyph is
 available in that font-family. As soon as a font is found which has the
-required glyph(s), the browser moves on to the next character.
+required glyph(s), the browser uses that font to render that character, and
+subsequently moves on to the next character.

 How does Matplotlib achieve this?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Well, Matplotlib doesn't achieve this, *yet*. It was initially only designed to
-use a **single font** throughout the document, i.e., no matter how many
-families you pass to **font.family** rcParam, Matplotlib would use the very
-first font it's able to find on your system, and try to render all your
+Currently, Matplotlib can't render a multi-font document. It was initially
+only designed to use a **single font** throughout the document, i.e., no matter
+how many families you pass to **font.family** rcParam, Matplotlib would use the
+very first font it's able to find on your system, and try to render all your
 characters/glyphs from that *and only that* font.

 .. note::

From ec5d0b0d9f9c53569ba18f5bf1f8fd63cfe0ec97 Mon Sep 17 00:00:00 2001
From: Thomas A Caswell
Date: Sun, 14 Aug 2022 19:19:54 -0400
Subject: [PATCH 3/5] DOC: update and reword fonts.rst

Account for the fact that font fallback is now merged and re-organize slightly.
---
 doc/users/explain/fonts.rst  | 264 ++++++++++++++++++-----------------
 tutorials/text/text_props.py |   2 +
 2 files changed, 139 insertions(+), 127 deletions(-)

diff --git a/doc/users/explain/fonts.rst b/doc/users/explain/fonts.rst
index 0a434ac6e23d..7e57b520eaab 100644
--- a/doc/users/explain/fonts.rst
+++ b/doc/users/explain/fonts.rst
@@ -1,23 +1,27 @@
 .. redirect-from:: /users/fonts

-Fonts in Matplotlib text engine
-===============================
+Fonts in Matplotlib
+===================

 Matplotlib needs fonts to work with its text engine, some of which are shipped
-alongside the installation. However, users can configure the default fonts, or
-even provide their own custom fonts! For more details, see :doc:`Customizing
-text properties `.
+alongside the installation. The default font is `DejaVu Sans
+`_ which covers most European writing systems.
+However, users can configure the default fonts, and provide their own custom
+fonts. See :doc:`Customizing text properties ` for
+details and :ref:`font-nonlatin` in particular for glyphs not supported by
+DejaVu Sans.

-However, Matplotlib also provides an option to offload text rendering to a TeX
-engine (``usetex=True``),
-see :doc:`Text rendering with LaTeX `.
+Matplotlib also provides an option to offload text rendering to a TeX engine
+(``usetex=True``), see :doc:`Text rendering with LaTeX
+`.

-Font specifications
--------------------
-Fonts have a long and sometimes incompatible history in computing, leading to
-different platforms supporting different types of fonts. In practice, there are
-3 types of font specifications Matplotlib supports (in addition to 'core
-fonts', more about which is explained later in the guide):
+Fonts in PDF and postscript
+---------------------------
+
+Fonts have a long (and sometimes incompatible) history in computing, leading to
+different platforms supporting different types of fonts. In practice, there
+are 3 types of font specifications Matplotlib supports (in addition to 'core
+fonts' in pdf which is explained later in the guide):

 .. list-table:: Type of Fonts
    :header-rows: 1
@@ -37,20 +41,19 @@ fonts', more about which is explained later in the guide):
      - Hinting supported (virtual machine processes the "hints")
    * - Non-subsetted through Matplotlib
      - Subsetted via external module `ttconv `_
-     - Subsetted via external module `fonttools `_
+     - Subsetted via external module `fonttools `__

 NOTE: Adobe will disable support for authoring with Type 1 fonts in
 January 2023. `Read more here. `_

-Special mentions
-^^^^^^^^^^^^^^^^
+
 Other font specifications which Matplotlib supports:

 - Type 42 fonts (PS):

   - PostScript wrapper around TrueType fonts
   - 42 is the `Answer to Life, the Universe, and Everything! `_
-  - Matplotlib uses an external library called `fonttools `_
+  - Matplotlib uses an external library called `fonttools `__
     to subset these types of fonts

 - OpenType fonts:

@@ -60,50 +63,37 @@ Other font specifications which Matplotlib supports:
   - Generally contain a much larger character set!
   - Limited Support with Matplotlib

-Subsetting
-----------
-Matplotlib is able to generate documents in multiple different formats. Some of
-those formats (for example, PDF, PS/EPS, SVG) allow embedding font data in such
-a way that when these documents are visually scaled, the text does not appear
-pixelated.
-
-This can be achieved by embedding the *whole* font file within the
-output document. However, this can lead to very large documents, as some
-fonts (for instance, CJK - Chinese/Japanese/Korean fonts) can contain a large
-number of glyphs, and thus their embedded size can be quite huge.
-
-Font Subsetting can be used before generating documents, to embed only the
-*required* glyphs within the documents. Fonts can be considered as a collection
-of glyphs, so ultimately the goal is to find out *which* glyphs are required
-for a certain array of characters, and embed only those within the output.
-
-.. note::
-   The role of subsetter really shines when we encounter characters like **ä**
-   (composed by calling subprograms for **a** and **¨**); since the subsetter
-   has to find out *all* such subprograms being called by every glyph included
-   in the subset, this is a generally difficult problem!
-
-Luckily, Matplotlib uses a fork of an external dependency called
-`ttconv `_, which helps in embedding and
-subsetting font data. (however, recent versions have moved away from ttconv to
-pure Python for certain types: for more details visit
-`these `_, `links `_)
-
-| *Type 1 fonts are still non-subsetted* through Matplotlib. (though one will encounter these mostly via *usetex*/*dviread* in PDF backend)
-| **Type 3 and Type 42 fonts are subsetted**, with a fair amount of exceptions and bugs for the latter.
-
-What to use?
-------------
-Practically, most fonts that are readily available on most operating systems or
-are readily available on the internet to download include *TrueType fonts* and
-its "extensions" such as MacOS-resource fork fonts and the newer OpenType
-fonts.
+Font Subsetting
+~~~~~~~~~~~~~~~
+
+PDF and postscript support embedded fonts in the output files allowing the
+display program to correctly render the text, independent of what fonts are
+installed on the viewer's computer, without the need to pre-rasterize the text.
+This ensures that if the output is zoomed or resized the text does not become
+pixelated. However, embedding full fonts in the file can lead to large output
+files, particularly with fonts with many glyphs such as those that support CJK
+(Chinese/Japanese/Korean).
+
+The solution to this problem is to subset the fonts used in the document and
+only embed the glyphs actually used. This gets both vector text and small
+files sizes. Computing the subset of the font required and writing the new
+(reduced) font are both complex problem and thus Matplotlib relies on a
+vendored fork of `ttconv `_ and `fontTools
+`__.
+
+Currently Type 3, Type 42, and TrueType fonts are subseted. Type 1 fonts are not.
+
+
+Core Fonts
+~~~~~~~~~~

-PS and PDF backends provide support for yet another type of fonts, which remove
-the need of subsetting altogether! These are called **Core Fonts**, and
-Matplotlib calls them via the keyword **AFM**; all that is supplied from
-Matplotlib to such documents are font metrics (specified in AFM format), and it
-is the job of the viewer applications to supply the glyph definitions.
+In addition to the ability to embed fonts, as part of the `postscript
+`_ and `PDF
+specification
+`_
+there are 14 Core Font that compliant viewers must ensure are available. If
+you restrict your document to only these fonts you do not have to embed any
+font information in the document but still get vector text.

 This is especially helpful to generate *really lightweight* documents.::

@@ -119,70 +109,90 @@ This is especially helpful to generate *really lightweight* documents.::
     fig.savefig("AFM_PDF.pdf", format="pdf")
     fig.savefig("AFM_PS.ps", format="ps)

-.. note::
-   These core fonts are limited to PDF and PS backends only; they can not be
-   rendered in other backends.
-
-   Another downside to this is that while the font metrics are standardized,
-   different PDF viewer applications will have different fonts to render these
-   metrics. In other words, the **output might look different on different
-   viewers**, as well as (let's say) Windows and Linux, if Linux tools included
-   free versions of the proprietary fonts.
-
-   This also violates the *what-you-see-is-what-you-get* feature of Matplotlib.
-
-Are we reinventing the wheel?
------------------------------
-Internally, a feasible response to the question of 'reinventing the
-wheel would be, well, Yes *and No*. The font-matching algorithm used
-by Matplotlib has been *inspired* by web browsers, more specifically,
-`CSS Specifications `_.
-
-Currently, the simplest way (and the only way) to tell Matplotlib what fonts
-you want it to use for your document is via the **font.family** rcParam,
-see :doc:`Customizing text properties `.
-
-This is similar to how one tells a browser to use multiple font families
-(specified in their order of preference) for their HTML webpages. By using
-**font-family** in their stylesheet, users can essentially trigger a very
-useful feature provided by browers, known as Font-Fallback. For example, the
-following snippet in an HTML markup would:
-
-.. code-block:: html
-
-
-
-
-
-   some text
-
-
-
-For every character/glyph in *"some text"*, the browser will iterate through
-the whole list of font-families, and check whether that character/glyph is
-available in that font-family. As soon as a font is found which has the
-required glyph(s), the browser uses that font to render that character, and
-subsequently moves on to the next character.
-
-How does Matplotlib achieve this?
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Currently, Matplotlib can't render a multi-font document. It was initially
-only designed to use a **single font** throughout the document, i.e., no matter
-how many families you pass to **font.family** rcParam, Matplotlib would use the
-very first font it's able to find on your system, and try to render all your
-characters/glyphs from that *and only that* font.
-
-.. note::
-   This is, because the internal font matching was written/adapted
-   from a very old `CSS1 spec `_,
-   **written in 1998**!
-
-   However, allowing multiple fonts for a single document (also enabling
-   Font-Fallback) is one of the goals for 2021's Google Summer of Code project.
-
-   `Read more on Matplotblog `_!
-
+Fonts in SVG
+------------
+
+Text can output to SVG in two ways controlled by the :rc:`svg.fonttype`
+rcparam:
+
+- as a path (``'path'``) in the SVG
+- as string in the SVG with font styling on the element (``'none'``)
+
+
+When saving via ``'path'`` Matplotlib will compute the path of the glyphs used
+as vector paths and write those to the output. The advantage of this is that
+the SVG will look the same on all computers independent of what fonts are
+installed. However the text will not be editable after the fact.
+In contrast saving with ``'none'`` will result in smaller files and the
+text will appear directly in the markup. However, the appearance may vary
+based on the SVG viewer and what fonts are available.
+
+Fonts in Agg
+------------
+
+To output text to raster formats via Agg Matplotlib relies on `FreeType
+`_. Because the exactly rendering of the glyphs
+changes between FreeType versions we pin to a specific version for our image
+comparison tests.
+
+
+How Matplotlib selects fonts
+----------------------------
+
+Internally using a Font in Matplotlib is a three step process:
+
+1. a `.FontProperties` object is created (explicitly or implicitly)
+2. based on the `.FontProperties` object the methods on `.FontManager` are used
+   to select the closest the "best" font Matplotlib is aware of (except for
+   ``'none'`` mode of SVG).
+3. the Python proxy for the font object is used by the backend code to render
+   the text -- the exact details depend on the backend via `.font_manager.get_font`.
+
+The algorithm to select the "best" font is a modified version of the algorithm
+specified by the `CSS1 Specifications
+`_ which is used by web browsers.
+This algorithm takes into account the font family name (e.g. "Arial", "Noto
+Sans CJK", "Hack", ...), the size, style, and weight. In addition to family
+names that map directly to fonts there are five "generic font family names" (
+serif, monospace, fantasy, cursive, and sans-serif) that will internally be
+mapped to any one of a set of fonts.
+
+Currently the public API for doing step 2 is `.FontManager.findfont` (and that
+method on the global `.FontManager` instance is aliased at the module level as
+`.font_manager.findfont`) will only find a single font and return the absolute
+path to the font on the filesystem.
+
+Font Fallback
+-------------
+
+There is no font that covers the unicode space thus it is possible for the
+users to require a mix of glyphs that can not be satisfied from a single font.
+While it has been possible to use multiple fonts within a Figure, on distinct
+`.Text` instances, it was not previous possible to use multiple fonts in the
+same `.Text` instance (as a web browser does). As of Matplotlib 3.6 the Agg,
+SVG, PDF, and PS backends will "fallback" through multiple fonts in a single
+`.Text` instance:
+
+
+.. plot::
+   :include-source:
+   :caption: The string "There are 几个汉字 in between!" rendered with 2 fonts.
+
+   fig, ax = plt.subplots()
+   ax.text(
+       .5, .5, "There are 几个汉字 in between!",
+       family=['DejaVu Sans', 'WenQuanYi Zen Hei'],
+       ha='center'
+   )
+
+
+Internally this is implemented by setting The "font family" on
+`.FontProperties` objects to a list of font families. Using a (currently)
+private API extract a list of paths to all of the fonts found and then
+construct a single `.ft2font.FT2Font` object that is aware of all of the fonts.
+Each glyph of the string is rendered using the first font in the list that
+contains that glyph.
+
+A majority of this work was done by Aitik Gupta supported by Google Summer of
+Code 2021.
diff --git a/tutorials/text/text_props.py b/tutorials/text/text_props.py
index 7b0537a3ac26..a2acbd0bd27f 100644
--- a/tutorials/text/text_props.py
+++ b/tutorials/text/text_props.py
@@ -215,6 +215,8 @@
 #     matplotlib.rcParams['font.family'] = ['Family1', 'SerifFamily1', 'SerifFamily2', 'Family2']
 #
 #
+# .. _font-nonlatin:
+#
 # Text with non-latin glyphs
 # ==========================
 #

From 709500cadd7ab7d16180bcb0394661fca97679bd Mon Sep 17 00:00:00 2001
From: Thomas A Caswell
Date: Tue, 16 Aug 2022 21:48:47 -0400
Subject: [PATCH 4/5] DOC: edits from review

Co-authored-by: Elliott Sales de Andrade
---
 doc/users/explain/fonts.rst | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/doc/users/explain/fonts.rst b/doc/users/explain/fonts.rst
index 7e57b520eaab..65197dce81a9 100644
--- a/doc/users/explain/fonts.rst
+++ b/doc/users/explain/fonts.rst
@@ -15,7 +15,7 @@ Matplotlib also provides an option to offload text rendering to a TeX engine
 (``usetex=True``), see :doc:`Text rendering with LaTeX
 `.

-Fonts in PDF and postscript
+Fonts in PDF and PostScript
 ---------------------------

 Fonts have a long (and sometimes incompatible) history in computing, leading to
@@ -66,9 +66,9 @@ Other font specifications which Matplotlib supports:
 Font Subsetting
 ~~~~~~~~~~~~~~~

-PDF and postscript support embedded fonts in the output files allowing the
+The PDF and PostScript formats support embedding fonts in files allowing the
 display program to correctly render the text, independent of what fonts are
-installed on the viewer's computer, without the need to pre-rasterize the text.
+installed on the viewer's computer and without the need to pre-rasterize the text.
 This ensures that if the output is zoomed or resized the text does not become
 pixelated. However, embedding full fonts in the file can lead to large output
 files, particularly with fonts with many glyphs such as those that support CJK
@@ -87,7 +87,7 @@ Currently Type 3, Type 42, and TrueType fonts are subseted. Type 1 fonts are no
 Core Fonts
 ~~~~~~~~~~

-In addition to the ability to embed fonts, as part of the `postscript
+In addition to the ability to embed fonts, as part of the `PostScript
 `_ and `PDF
 specification
 `_
@@ -113,8 +113,7 @@ This is especially helpful to generate *really lightweight* documents.::
 Fonts in SVG
 ------------

-Text can output to SVG in two ways controlled by the :rc:`svg.fonttype`
-rcparam:
+Text can output to SVG in two ways controlled by :rc:`svg.fonttype`:

 - as a path (``'path'``) in the SVG
 - as string in the SVG with font styling on the element (``'none'``)
@@ -131,8 +130,8 @@ based on the SVG viewer and what fonts are available.
 Fonts in Agg
 ------------

-To output text to raster formats via Agg Matplotlib relies on `FreeType
-`_. Because the exactly rendering of the glyphs
+To output text to raster formats via Agg, Matplotlib relies on `FreeType
+`_. Because the exact rendering of the glyphs
 changes between FreeType versions we pin to a specific version for our image
 comparison tests.

@@ -144,7 +143,7 @@ Internally using a Font in Matplotlib is a three step process:

 1. a `.FontProperties` object is created (explicitly or implicitly)
 2. based on the `.FontProperties` object the methods on `.FontManager` are used
-   to select the closest the "best" font Matplotlib is aware of (except for
+   to select the closest "best" font Matplotlib is aware of (except for
    ``'none'`` mode of SVG).
 3. the Python proxy for the font object is used by the backend code to render
    the text -- the exact details depend on the backend via `.font_manager.get_font`.
@@ -160,13 +159,13 @@ mapped to any one of a set of fonts.

 Currently the public API for doing step 2 is `.FontManager.findfont` (and that
 method on the global `.FontManager` instance is aliased at the module level as
-`.font_manager.findfont`) will only find a single font and return the absolute
+`.font_manager.findfont`), which will only find a single font and return the absolute
 path to the font on the filesystem.

 Font Fallback
 -------------

-There is no font that covers the unicode space thus it is possible for the
+There is no font that covers the entire Unicode space thus it is possible for the
 users to require a mix of glyphs that can not be satisfied from a single font.
 While it has been possible to use multiple fonts within a Figure, on distinct
 `.Text` instances, it was not previous possible to use multiple fonts in the
@@ -188,9 +187,9 @@ SVG, PDF, and PS backends will "fallback" through multiple fonts in a single


 Internally this is implemented by setting The "font family" on
-`.FontProperties` objects to a list of font families. Using a (currently)
-private API extract a list of paths to all of the fonts found and then
-construct a single `.ft2font.FT2Font` object that is aware of all of the fonts.
+`.FontProperties` objects to a list of font families. A (currently)
+private API extracts a list of paths to all of the fonts found and then
+constructs a single `.ft2font.FT2Font` object that is aware of all of the fonts.
 Each glyph of the string is rendered using the first font in the list that
 contains that glyph.
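
Note on the fallback paragraph in the last hunk: the rule "each glyph of the string is rendered using the first font in the list that contains that glyph" can be sketched in plain Python. This is only a toy model under stated assumptions: `pick_fonts`, the family names, and the coverage sets are made up for illustration, and the patch describes the real lookup as going through a single `FT2Font` object rather than Python sets.

```python
# Toy model of per-glyph font fallback. The family names and coverage
# sets are hypothetical; this is not Matplotlib's implementation.
def pick_fonts(text, families, coverage):
    """Return (char, family) pairs; family is None if no font has the glyph."""
    result = []
    for ch in text:
        chosen = None
        # Walk the families in order of preference and stop at the first hit.
        for family in families:
            if ch in coverage.get(family, set()):
                chosen = family
                break
        result.append((ch, chosen))
    return result


coverage = {
    "LatinOnly": set("abcdefghijklmnopqrstuvwxyz "),
    "CJKOnly": set("几个汉字"),
}
pairs = pick_fonts("a 几", ["LatinOnly", "CJKOnly"], coverage)
```

The order of `families` is the only thing that decides which font wins when two fonts both cover a character, which mirrors the ordered `family=[...]` list in the plot example of patch 3.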

From 449662909af13b43413701cdc21e2b17862418d5 Mon Sep 17 00:00:00 2001
From: Thomas A Caswell
Date: Tue, 16 Aug 2022 21:50:11 -0400
Subject: [PATCH 5/5] DOC: flip order to be clear we do not vendor fonttools

---
 doc/users/explain/fonts.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/users/explain/fonts.rst b/doc/users/explain/fonts.rst
index 65197dce81a9..b6a7baec2e52 100644
--- a/doc/users/explain/fonts.rst
+++ b/doc/users/explain/fonts.rst
@@ -77,9 +77,9 @@ files, particularly with fonts with many glyphs such as those that support CJK
 The solution to this problem is to subset the fonts used in the document and
 only embed the glyphs actually used. This gets both vector text and small
 files sizes. Computing the subset of the font required and writing the new
-(reduced) font are both complex problem and thus Matplotlib relies on a
-vendored fork of `ttconv `_ and `fontTools
-`__.
+(reduced) font are both complex problem and thus Matplotlib relies on
+`fontTools `__ and a vendored fork
+of `ttconv `_.

 Currently Type 3, Type 42, and TrueType fonts are subseted. Type 1 fonts are not.
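
Note on the subsetting paragraph touched by this patch: the first half of subsetting, working out which glyphs a document actually uses, amounts to collecting the distinct characters that get drawn. A rough sketch, assuming a hypothetical helper `glyphs_to_embed`; it does not write a reduced font (the part the text delegates to fontTools and ttconv), and it ignores composite glyphs, which the note removed in patch 3 calls out as the hard case.

```python
# Hypothetical helper: collect the code points a set of strings needs.
# This only models "which glyphs are used"; it does not touch font files.
def glyphs_to_embed(strings):
    """Return the sorted, de-duplicated code points used across all strings."""
    used = set()
    for s in strings:
        used.update(ord(ch) for ch in s)
    return sorted(used)


needed = glyphs_to_embed(["abc", "cab", "几个"])
```

Even though the three strings contain eight characters in total, only five distinct code points need to be embedded, which is why subsetting pays off most for large CJK fonts.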