|
| 1 | +--- |
| 2 | +title: "GSoC'21: Pre-Quarter Progress" |
| 3 | +date: 2021-07-19T07:32:05+05:30 |
| 4 | +draft: false |
| 5 | +categories: ["News", "GSoC"] |
| 6 | +description: "Pre-Quarter Progress with Google Summer of Code 2021 project under NumFOCUS: Aitik Gupta" |
| 7 | +displayInList: true |
| 8 | +author: Aitik Gupta |
| 9 | + |
| 10 | +resources: |
| 11 | +- name: featuredImage |
| 12 | + src: "AitikGupta_GSoC.png" |
| 13 | + params: |
| 14 | + showOnTop: true |
| 15 | +--- |
| 16 | + |
| 17 | +**“<ins>Well? Did you get it working?!</ins>”** |
| 18 | + |
| 19 | +Before I answer that question: if you're missing the context, check out the last few lines of my previous blog post. I promise it won't take you more than 30 seconds to get the whole problem! |
| 20 | + |
| 21 | +With this short writeup, I intend to talk about _what_ we did and _why_ we did what we did. XD |
| 22 | + |
| 23 | +## Ostrich Algorithm |
| 24 | +Ring any bells? Remember OS (Operating Systems)? It's one of the core CS subjects which I bunked then and regret now. (╥﹏╥) |
| 25 | + |
| 26 | +The [Wikipedia page](https://en.wikipedia.org/wiki/Ostrich_algorithm) has a 2-liner explanation if you have no idea what an Ostrich Algorithm is.. but I know most of y'all won't bother clicking it XD, so here goes: |
| 27 | +> Ostrich algorithm is a strategy of ignoring potential problems by "sticking one's head in the sand and pretend there is no problem" |
| 28 | + |
| 29 | +An important thing to note: it is used when it is more **cost-effective** to _allow the problem to occur than to attempt its prevention_. |
| 30 | + |
| 31 | +As you might've guessed by now, we ultimately ended up with the *not-so-clean* API (more on this later). |
| 32 | + |
| 33 | +## What was the problem? |
| 34 | +The highest level overview of the problem was: |
| 35 | + |
| 36 | +``` |
| 37 | +❌ library1 -> buffer -> library2_with_buffer |
| 38 | +✅ library1 -> buffer -> tempfile -> library2_with_file |
| 39 | +``` |
| 40 | +The first approach produced corrupted outputs, whereas the second approach worked fine. A point to note here is that *Method 1* is better in terms of separating *reading* the file from *parsing* the data. |
| 41 | + |
| 42 | +1. `library1` is [fontTools](https://github.com/fonttools/fonttools), whereas `library2` is [ttconv](https://github.com/matplotlib/matplotlib/tree/master/extern/ttconv). |
| 43 | +2. `library2_with_buffer` is <ins>ttconv</ins>, but modified to accept a file buffer instead of a file path. |
| 44 | + |
| 45 | +You might be tempted to say: |
| 46 | +> "Well, `library2_with_buffer` must be wrongly modified, duh." |
| 47 | + |
| 48 | +Logically, yes. `ttconv` was designed to work with a file-path and not a file-object (buffer), and modifying a codebase **written in 1998** turned out to be a larger pain than we anticipated. |
| 49 | +#### It came to a point where one of my mentors decided to implement everything in Python! |
| 50 | +He even did it, but <ins>the effort</ins> needed to get it production-ready, or to fix the `ttconv` embedding, was ⋙ the effort of just getting on with the second method. That damn ostrich really helped us get out of that debugging hell. 🙃 |
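
To make the second method concrete, here's a minimal sketch using only the standard library. `parse_font_from_path` is a hypothetical stand-in for a path-only parser like `ttconv`, and the bytes are fake; none of this is the actual Matplotlib code.

```python
import io
import os
import tempfile


def parse_font_from_path(path):
    """Hypothetical stand-in for a parser that only accepts a file path."""
    with open(path, "rb") as fh:
        return fh.read()


def convert_via_tempfile(buffer):
    """Write an in-memory buffer to a temporary file, then parse it by path."""
    # delete=False lets the file be reopened by path on every platform;
    # we clean it up ourselves in the finally block.
    with tempfile.NamedTemporaryFile(suffix=".ttf", delete=False) as tmp:
        tmp.write(buffer.getvalue())
        tmp_path = tmp.name
    try:
        return parse_font_from_path(tmp_path)
    finally:
        os.unlink(tmp_path)


# Fake data standing in for the output of library1
font_bytes = io.BytesIO(b"\x00\x01\x00\x00 fake font data")
result = convert_via_tempfile(font_bytes)
```

The extra disk round-trip is the price of the ostrich: the path-based parser stays untouched.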
| 51 | +## Font Fallback - initial steps |
| 52 | +Finally, we're onto the second subgoal for the summer: [Font Fallback](https://www.w3schools.com/css/css_font_fallbacks.asp)! |
| 53 | + |
| 54 | +To give an idea about how things work right now: |
| 55 | +1. User asks Matplotlib to use certain font families, specified by: |
| 56 | +```python |
| 57 | +matplotlib.rcParams["font.family"] = ["list", "of", "font", "families"] |
| 58 | +``` |
| 59 | +2. This list is used to search for available fonts on a user's system. |
| 60 | +3. However, in current (and previous) versions of Matplotlib: |
| 61 | +> <ins>As soon as a font is found by iterating the font-family, **all text** is rendered by that _and only that_ font.</ins> |
| 62 | + |
| 63 | +You can immediately see the problem with this approach: using the same font for every character means any glyph which isn't present in that font will not be rendered, and you'll instead get a blank rectangle called "tofu" (read the first line [here](https://www.google.com/get/noto/)). |
| 64 | + |
| 65 | +And that is exactly the first milestone! That is, parsing the <ins>_entire list_</ins> of font families to get an intermediate representation of a multi-font interface. |
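
To illustrate the idea (and only the idea), here's a toy sketch of per-character fallback. The font names and the `FONT_COVERAGE` table are made up for this example; this is not how Matplotlib stores or looks up glyphs.

```python
# Hypothetical glyph coverage: which characters each made-up font can draw.
FONT_COVERAGE = {
    "LatinOnly": set("abcdefghijklmnopqrstuvwxyz "),
    "CJKOnly": set("日本語"),
}


def assign_fonts(text, families):
    """For each character, pick the first family in the list that covers it."""
    assignment = []
    for char in text:
        chosen = next(
            (name for name in families if char in FONT_COVERAGE.get(name, ())),
            None,  # no family has the glyph: this is where tofu would show up
        )
        assignment.append((char, chosen))
    return assignment


families = ["LatinOnly", "CJKOnly"]
result = assign_fonts("ab 日?", families)
```

With a single font, everything outside `LatinOnly` would be tofu; with the whole list in play, only `?` is left without a font.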
| 66 | +## Don't break, a lot at stake! |
| 67 | +Imagine if you had the superpower to change the Python standard library's internal functions, _without_ consulting anybody. Say you wanted to write a solution by hooking in and changing the implementation of `str("dumb")` so that it returns: |
| 68 | +```ipython |
| 69 | +>>> str("dumb") |
| 70 | +["d", "u", "m", "b"] |
| 71 | +``` |
| 72 | +Pretty "<ins>dumb</ins>", right? xD |
| 73 | + |
| 74 | +For your use case it might work fine, but it would also mean breaking the _entire_ Python userbase's workflow, not to mention the 1000000+ libraries that depend on the original functionality. |
| 75 | + |
| 76 | +On a similar note, Matplotlib has a public API known as `findfont(prop: str)`, which when given a string (or [FontProperties](https://matplotlib.org/stable/api/font_manager_api.html#matplotlib.font_manager.FontProperties)) finds you a font that best matches the given properties in your system. |
| 77 | + |
| 78 | +It is used <ins>throughout the library</ins>, as well as in multiple other places, including downstream libraries. Naive as I was, I changed this function's signature and submitted the [PR](https://github.com/matplotlib/matplotlib/pull/20496). 🥲 |
| 79 | + |
| 80 | +I had an insightful discussion about this with my mentors, and soon enough raised the [other PR](https://github.com/matplotlib/matplotlib/pull/20549), which didn't touch the `findfont` API at all. |
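
The general lesson, as a toy sketch: leave the existing public function alone and add a separate entry point next to it. Every name and path below (`find_font`, `find_fonts`, `INSTALLED`) is hypothetical and only illustrates the pattern; it is not the code from either PR.

```python
# Hypothetical installed fonts; not Matplotlib's real font database.
INSTALLED = {"serif": "/fonts/serif.ttf", "cjk": "/fonts/cjk.ttf"}
DEFAULT = "/fonts/default.ttf"


def find_font(family):
    """Existing public function: one family in, one path out. Left untouched."""
    return INSTALLED.get(family, DEFAULT)


def find_fonts(families):
    """New, separate entry point: every matching path, in list order."""
    return [INSTALLED[name] for name in families if name in INSTALLED]


result_old = find_font("serif")  # old callers keep getting a single path
result_new = find_fonts(["missing", "cjk", "serif"])
```

Anyone already calling `find_font` sees no change, while new code can opt in to the multi-font result.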
| 81 | + |
| 82 | +--- |
| 83 | + |
| 84 | +One last thing to note: Even if we do complete the first milestone, we wouldn't be done yet, since this is just parsing the entire list to get multiple fonts.. |
| 85 | + |
| 86 | +We still need to migrate the library's internal implementation from **font-first** to **text-first**! |
| 87 | + |
| 88 | + |
| 89 | +But that's for later, for now: |
| 90 | + |
| 91 | + |
| 92 | +#### NOTE: This blog post is also available at my [personal website](https://aitikgupta.github.io/gsoc-pre-quarter/). |