| 1 | +--- |
| 2 | +title: "GSoC'21: Quarter Progress" |
| 3 | +date: 2021-08-03T18:48:00+05:30 |
| 4 | +draft: false |
| 5 | +categories: ["News", "GSoC"] |
| 6 | +description: "Quarter Progress with Google Summer of Code 2021 project under NumFOCUS: Aitik Gupta" |
| 7 | +displayInList: true |
| 8 | +author: Aitik Gupta |
| 9 | + |
| 10 | +resources: |
| 11 | +- name: featuredImage |
| 12 | + src: "AitikGupta_GSoC.png" |
| 13 | + params: |
| 14 | + showOnTop: true |
| 15 | +--- |
| 16 | + |
| 17 | +**“<ins>Matplotlib, I want 多个汉字 in between my text.</ins>”** |
| 18 | + |
| 19 | +Let's say you asked Matplotlib to render a plot with some label containing 多个汉字 (multiple Chinese characters) in between your English text. |
| 20 | + |
| 21 | +Or conversely, let's say you use a Chinese font with Matplotlib, but you have English text in between (which is quite common). |
| 22 | + |
| 23 | +> Assumption: the Chinese font doesn't have those English glyphs, and vice versa |
| 24 | + |
| 25 | +With this short writeup, I'll talk about what a migration from a font-first to a text-first approach in Matplotlib looks like, which ideally solves the above problem. |
| 26 | +### Have the fonts? |
| 27 | +Logically, the very first step to solving this would be to ask whether you _have_ multiple fonts, right? |
| 28 | + |
| 29 | +Matplotlib doesn't ship [CJK](https://en.wikipedia.org/wiki/List_of_CJK_fonts) (Chinese, Japanese, Korean) fonts, which are the ones that would contain these Chinese glyphs. It does try to cover most ground with the [default font](https://matplotlib.org/stable/users/dflt_style_changes.html#normal-text) it ships with, however. |
| 30 | + |
| 31 | +So if you don't have a font to render your Chinese characters, go ahead and install one! Matplotlib will find your installed fonts (after rebuilding the cache, that is). |
| 32 | +### Parse the fonts |
| 33 | +This is where things get interesting, and what my [previous writeup](https://matplotlib.org/matplotblog/posts/gsoc_2021_prequarter/) was all about. |
| 34 | + |
| 35 | +> Parsing the whole family to get multiple fonts for given font properties |
| 36 | + |
| 37 | +## FT2Font Magic! |
| 38 | +To give you an idea about how things used to work for Matplotlib: |
| 39 | +1. A single font was chosen _at draw time_ |
| 40 | + (fixed: re [previous writeup](https://matplotlib.org/matplotblog/posts/gsoc_2021_prequarter/)) |
| 41 | +2. Every character displayed in your document was rendered by only that font |
| 42 | + (partially fixed: re <ins>_this writeup_</ins>) |
| 43 | + |
| 44 | +> FT2Font is a matplotlib-to-font module, which provides a high-level Python API to interact with a _single font's operations_ like read/draw/extract/etc. |
| 45 | + |
| 46 | +Being written in C++, the module needs wrappers around it to be converted into a [Python extension](https://docs.python.org/3/extending/extending.html) using Python's C-API. |
| 47 | + |
| 48 | +> It allows us to use C++ functions directly from Python! |
| 49 | + |
| 50 | +So wherever you saw a font being used within the library (by library I mean the readable Python codebase XD), you could have assumed that: |
| 51 | +``` |
| 52 | +FT2Font === SingleFont |
| 53 | +``` |
| 54 | + |
| 55 | +Things are a bit different now, however. |
| 56 | +## Designing a multi-font system |
| 57 | +FT2Font is basically itself a wrapper around a library called [FreeType](https://www.freetype.org/), which is a freely available software library to render fonts. |
| 58 | + |
| 59 | +<p align="center"> |
| 60 | + <figure> |
| 61 | + <img src="https://user-images.githubusercontent.com/43996118/128352387-76a3f52a-20fc-4853-b624-0c91844fc785.png" alt="FT2Font Naming" /> |
| 62 | + <figcaption style="text-align: center; font-style: italic;">How FT2Font was named</figcaption> |
| 63 | + </figure> |
| 64 | +</p> |
| 65 | + |
| 66 | +In my initial proposal, while looking at how FT2Font is structured, I figured: |
| 67 | +``` |
| 68 | +Oh, looks like all we need are Faces! |
| 69 | +``` |
| 70 | +> If you don't know what faces/glyphs/ligatures are, head over to [Text Rendering Hates You](https://gankra.github.io/blah/text-hates-you/). I can guarantee you'll enjoy some real life examples of why text rendering is hard. 🥲 |
| 71 | + |
| 72 | +Anyway, if you already know what Faces are, it might strike you: |
| 73 | + |
| 74 | +If we already have all the faces we need from multiple fonts (let's say we created a child of FT2Font, which only <ins>tracks the faces</ins> for its families), we should be able to render everything from that parent FT2Font, right? |
| 75 | + |
| 76 | +As I later figured out while chasing segfaults when implementing this design: |
| 77 | +``` |
| 78 | +Each FT2Font is linked to a single FT_Library object! |
| 79 | +``` |
| 80 | + |
| 81 | +If you try to load the face/glyph/character (basically anything) from a different FT2Font object, you'll run into serious segfaults (because one object linked to an `FT_Library` can't really access another object which has its own `FT_Library`). |
| 82 | +```cpp |
| 83 | +// face is linked to FT2Font; which is |
| 84 | +// linked to a single FT_Library object |
| 85 | +FT_Face face = this->get_face(); |
| 86 | +FT_Get_Glyph(face->glyph, &placeholder); // works like a charm |
| 87 | + |
| 88 | +// somehow get another FT2Font's face |
| 89 | +FT_Face family_face = this->get_family_member()->get_face(); |
| 90 | +FT_Get_Glyph(family_face->glyph, &placeholder); // segfaults! |
| 91 | +``` |
| 92 | + |
| 93 | +Realizing this took a good amount of time! After this I quickly came up with a recursive approach, wherein we: |
| 94 | +1. Create a list of FT2Font objects within Python, and pass it down to FT2Font |
| 95 | +2. FT2Font will hold pointers to its families via a \ |
| 96 | + `std::vector<FT2Font *> fallback_list` |
| 97 | +3. Find if the character we want is available in the current font |
| 98 | + 1. If the character is available, use that FT2Font to render that character |
| 99 | + 2. If the character isn't found, go to step 3 again, but now iterate through the `fallback_list` |
| 100 | +4. That's it! |
| 101 | + |
| 102 | +A quick overhaul of the above piece of code^ |
| 103 | +```cpp |
| 104 | +bool ft_get_glyph(FT_Glyph &placeholder) { |
| 105 | + // FT_Get_Glyph takes the face's glyph slot; a non-zero FT_Error means failure |
| 106 | + FT_Error error = FT_Get_Glyph(this->get_face()->glyph, &placeholder); |
| 107 | + return error == 0; |
| 108 | +} |
| 109 | + |
| 110 | +// within driver code |
| 111 | +for (size_t i = 0; i < fallback_list.size(); i++) { |
| 112 | + // iterate through all FT2Font objects |
| 113 | + bool was_found = fallback_list[i]->ft_get_glyph(placeholder); |
| 114 | + if (was_found) break; |
| 115 | +} |
| 116 | +``` |
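
The fallback steps listed above can also be tried out without FreeType at all. Here is a minimal, self-contained sketch, assuming a hypothetical `MockFont` type that only knows which code points it covers. `MockFont`, `has_glyph`, `find_font_for` and `fonts_for_text` are made-up names for illustration; they are not part of Matplotlib's FT2Font or of FreeType.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for FT2Font: it only records which code points it covers.
struct MockFont {
    std::string name;
    std::set<char32_t> charmap;                   // code points this font has glyphs for
    std::vector<const MockFont *> fallback_list;  // other fonts to try, in order

    bool has_glyph(char32_t codepoint) const {
        return charmap.count(codepoint) > 0;
    }

    // Return the first font (this one, then each fallback in order) that
    // covers the code point, or nullptr if none of them does.
    const MockFont *find_font_for(char32_t codepoint) const {
        if (has_glyph(codepoint)) return this;
        for (const MockFont *fallback : fallback_list) {
            assert(fallback != nullptr);  // fallback entries are expected to be valid
            if (fallback->has_glyph(codepoint)) return fallback;
        }
        return nullptr;
    }
};

// Text-first selection: choose a font per character instead of one per string.
std::vector<std::string> fonts_for_text(const MockFont &primary,
                                        const std::u32string &text) {
    std::vector<std::string> chosen;
    for (char32_t codepoint : text) {
        const MockFont *font = primary.find_font_for(codepoint);
        chosen.push_back(font ? font->name : std::string("<missing>"));
    }
    return chosen;
}
```

With a Latin-only primary font whose `fallback_list` holds a CJK-only font, each English character resolves to the primary font, each Chinese character to the fallback, and anything neither font covers is reported as `<missing>`. The real implementation has to do this through FreeType handles, which is where the `FT_Library` restriction described earlier comes in.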
| 117 | + |
| 118 | +With the idea surrounding this implementation, the [Agg backend](https://matplotlib.org/stable/api/backend_agg_api.html) is able to render a document (either through a GUI or as a PNG) with multiple fonts! |
| 119 | + |
| 120 | +<p align="center"> |
| 121 | + <figure> |
| 122 | + <img src="https://user-images.githubusercontent.com/43996118/128347495-1f4f858d-33d3-4119-8732-5b26c4e9ca2a.png" alt="ChineseInBetween" /> |
| 123 | + <figcaption style="text-align: center; font-style: italic;">PNG straight outta Matplotlib!</figcaption> |
| 124 | + </figure> |
| 125 | +</p> |
| 126 | + |
| 127 | +## Python C-API is hard, at first! |
| 128 | +I've spent days on the Python C-API's [argument doc](https://docs.python.org/3/c-api/arg.html), and it's hard to get what you need at first, ngl. |
| 129 | + |
| 130 | +But, with the help of some amazing people in the GSoC community ([@srijan-paul](https://srijan-paul.github.io/), [@atharvaraykar](https://atharvaraykar.me/)) and amazing mentors, blockers begone! |
| 131 | + |
| 132 | +## So are we done? |
| 133 | +Oh no. XD |
| 134 | + |
| 135 | +Things work just fine for the Agg backend, but generating a PDF/PS/SVG with multiple fonts is another story altogether! I think I'll save that for later. |
| 136 | + |
| 137 | +<p align="center"> |
| 138 | + <figure> |
| 139 | + <img src="https://user-images.githubusercontent.com/43996118/128350093-13695b91-5ad2-4f96-91f5-8373ee7a189e.gif" alt="ThankYouDwight" /> |
| 140 | + <figcaption style="text-align: center; font-style: italic;">If you've been following the progress so far, mayn you're awesome!</figcaption> |
| 141 | + </figure> |
| 142 | +</p> |
| 143 | + |
| 144 | +#### NOTE: This blog post is also available at my [personal website](https://aitikgupta.github.io/gsoc-quarter/). |