| 1 | +============================================= |
| 2 | + MEP30: Dummy types for common option selects |
| 3 | +============================================= |
| 4 | +.. contents:: |
| 5 | + :local: |
| 6 | + |
| 7 | + |
| 8 | +Status |
| 9 | +====== |
| 10 | + |
| 11 | +- **Discussion**: The MEP is being actively discussed on the mailing |
| 12 | + list and it is being improved by its author. The mailing list |
| 13 | + discussion of the MEP should include the MEP number (MEPxxx) in the |
| 14 | + subject line so they can be easily related to the MEP. |
| 15 | + |
| 16 | +Branches and Pull requests |
| 17 | +========================== |
| 18 | + |
| 19 | +Abstract |
| 20 | +======== |
| 21 | + |
| 22 | +There is currently a family of matplotlib concepts whose documentation is |
| 23 | +contained almost exclusively within the docstrings of functions which take |
| 24 | +arguments of that type (and sometimes in tutorials). Common examples are |
| 25 | +``linestyle``, ``joinstyle/capstyle``, and ``bounds/extents`` (for a full list, |
| 26 | +see below). I will call these not-quite-types "pseudotypes". |
| 27 | + |
| 28 | +As some of these pseudotypes became used by more and more functions, their |
| 29 | +documentation has become fractured across the various files that use them. For |
| 30 | +example, the ``linestyle`` parameter is accepted in many places, including |
| 31 | +the Line2D constructor, Axes methods, various `~.matplotlib.collections` |
| 32 | +classes, and of course in rc. While `~.matplotlib.lines.Line2D` fully documents |
| 33 | +it, only some Axes methods link to this documentation; the others merely hint at the |
| 34 | +available options. |
| 35 | + |
| 36 | +Input checking for these pseudotypes tends to be repeated across many files and |
| 37 | +is too easy to do incorrectly or inconsistently. For example, the ``joinstyle`` |
| 38 | +and ``capstyle`` parameters have validators in ``rcsetup.py``. However, while |
| 39 | +these are used in ``patches.py`` and ``collections.py``, they are not used in |
| 40 | +``markers.py``, and ``backend_bases.py`` calls ``cbook._check_in_list`` with its |
| 41 | +own list of possible valid joinstyles. |
| 42 | + |
| 43 | +In order to prevent further fragmentation of docs and validation, I propose that |
| 44 | +each such concept get a proper new-style class, where we can centralize its |
| 45 | +documentation. All functions that accept such an argument will then easily be |
| 46 | +able to link to it using the standard numpydoc syntax in their docstrings, and |
| 47 | +the description of these parameters can then point to the relevant |
| 48 | +tutorials rather than rehash already existing documentation ad hoc. |
| 49 | +Error checking would be centralized in that class instead of being scattered |
| 50 | +throughout several ``cbook._check_in_list`` calls that are liable to become stale. |
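
As a minimal sketch of what such a class could look like, the following uses a
standard-library ``Enum`` for ``joinstyle``. The class name ``JoinStyle``, the
``#:`` doc comments (a Sphinx autodoc convention for documenting attributes),
and the ``set_joinstyle`` helper are illustrative assumptions, not an existing
or final Matplotlib API.

```python
from enum import Enum


class JoinStyle(Enum):
    """How two connected line segments meet (hypothetical sketch)."""

    #: Extend both segments until their outer edges meet in a sharp corner.
    miter = "miter"
    #: Round off the corner with a circular arc.
    round = "round"
    #: Cut the corner off with a straight edge.
    bevel = "bevel"

    @classmethod
    def _missing_(cls, value):
        # Enum calls this hook when ``JoinStyle(value)`` matches no member,
        # so the list of valid values is produced in exactly one place.
        valid = ", ".join(repr(m.value) for m in cls)
        raise ValueError(
            f"{value!r} is not a valid joinstyle; supported values are {valid}"
        )


def set_joinstyle(js):
    """Stand-in for any function that accepts a joinstyle (assumed name)."""
    # Accepts either the familiar string or an existing member.
    return JoinStyle(js)
```

Every caller that routes its argument through ``JoinStyle(...)`` gets the same
error message and the same list of valid values, so there is no per-file list
to fall out of date.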
| 51 | + |
| 52 | +Some benefits of this approach include: |
| 53 | + |
| 54 | +1. Docs are less likely to become stale, due to centralization. |
| 55 | +2. Increased discoverability of advanced options. If the simple linestyle option |
| 56 | + ``'-'`` is documented alongside more complex on-off dash specifications, |
| 57 | + users are more likely to scroll down than they are to stumble across an |
| 58 | + unlinked-to tutorial that describes a feature they need. |
| 59 | +3. Canonicalization of many of matplotlib's "implicit standards" (like what is a |
| 60 | + "bounds" versus an "extents") that currently have to be learned by reading |
| 61 | + the code. |
| 62 | +4. The process would likely highlight issues with API consistency in a way that |
| 63 | + could be more easily tracked via Issues, helping with the process of |
| 64 | + improving our API (see below for discussion). |
| 65 | +5. Better compatibility with any future addition of type annotations to the library. |
| 66 | +6. Faster doc build times, due to significant decreases in the amount of |
| 67 | + text needing to be parsed. |
| 68 | + |
| 69 | + |
| 70 | +Detailed description |
| 71 | +==================== |
| 72 | + |
| 73 | +Historically, matplotlib's API has relied heavily on string-as-enum |
| 74 | +"pseudotypes". Besides mimicking MATLAB's API, these parameter-strings allow the |
| 75 | +user to pass semantically-rich values as arguments to matplotlib functions |
| 76 | +without having to explicitly import or verbosely prefix an actual enum value |
| 77 | +just to pass basic plot options (e.g. ``plt.plot(x, y, linestyle='solid')`` is |
| 78 | +easier to type and less redundant than ``plt.plot(x, y, |
| 79 | +linestyle=mpl.LineStyle.solid)``). |
| 80 | + |
| 81 | +Many of these string-as-enum pseudotypes have since evolved more sophisticated |
| 82 | +features. For example, a ``linestyle`` can now be either a string or an |
| 83 | +``(offset, on-off sequence)`` tuple, and a ``MarkerStyle`` can now be either a |
| 84 | +string or a path. While this is true of many pseudotypes, ``MarkerStyle`` is the |
| 85 | +only one (to my knowledge) that has the status of being a proper Python type. |
| 86 | + |
| 87 | +Because pseudotypes are not classes in their own right, Matplotlib has |
| 88 | +historically had to roll its own solutions for centralizing documentation and |
| 89 | +validation of these pseudotypes (e.g. the ``docstring.interpd.update`` docstring |
| 90 | +interpolation pattern and the ``cbook._check_in_list`` validator pattern, |
| 91 | +respectively) instead of using the standard toolchains. |
| 92 | + |
| 93 | +While these solutions have worked well for us, the lack of an explicit location |
| 94 | +to document each pseudotype means that the documentation is often difficult to |
| 95 | +find, large tables of allowed values are repeated throughout the documentation, |
| 96 | +and often an explicit statement of the *scope* of a pseudotype is completely |
| 97 | +missing from the docs. Take the ``plt.plot`` docs, for example. In the "Notes", |
| 98 | +a description of the MATLAB-like format-string styling method mentions |
| 99 | +``linestyle``, ``color``, and ``markers`` options. There are many more ways to |
| 100 | +pass these three values than are hinted at, but, for many users, this is their |
| 101 | +only source of understanding about what values are possible for those options |
| 102 | +until they stumble on one of the relevant tutorials. In the table of ``Line2D`` |
| 103 | +attributes, the ``linestyle`` entry does a good job of linking to |
| 104 | +``Line2D.set_linestyle`` where those options are described, but the ``color`` |
| 105 | +and ``markers`` entries do not. ``color`` simply links to ``Line2D.set_color``, |
| 106 | +which does nothing in the way of offering intuition on what kinds of inputs are |
| 107 | +allowed. |
| 108 | + |
| 109 | +.. It can be argued that ``plt.plot`` is a good candidate to be explicitly |
| 110 | + exempted from any documentation best practices we try to codify, and I've |
| 111 | + chosen it intentionally to elicit the strongest opinions from everyone. |
| 112 | + |
| 113 | +It could be argued that this is something that can be fixed by simply tidying up |
| 114 | +the individual docstrings that are causing problems, but the issue is |
| 115 | +unfortunately much more systemic than that. Without a centralized place to find |
| 116 | +the documentation, this will simply lead to us having more and more copies of |
| 117 | +increasingly verbose documentation repeated everywhere each of these pseudotypes |
| 118 | +is used. The alternative, of scattering the information throughout the |
| 119 | +documentation, will instead lead to the users having to slowly piece together |
| 120 | +their mental model of each pseudotype through wiki-diving style traversal |
| 121 | +throughout our documentation, or piecemeal from StackOverflow examples. |
| 122 | + |
| 123 | +Ideally, a mention of ``linestyle`` in the ``LineCollection`` docs should |
| 124 | +link to the same place as it does in the ``plt.plot`` docs. By |
| 125 | +organizing these ``linestyle``-specific docs in order from most-common to |
| 126 | +most-complex input types, we can maintain a "single-click-to-discover" property |
| 127 | +for our advanced plotting options, while also making sure that we don't hurt |
| 128 | +usability for users that simply want to know the simplest way to accomplish a |
| 129 | +common task. |
| 130 | + |
| 131 | +Practically speaking, the actual information that we want to have in the |
| 132 | +``LineCollection`` docs is just: |
| 133 | + |
| 134 | +1. A link to complete docs for allowable inputs (like those found in |
| 135 | + ``Line2D.set_linestyle``). |
| 136 | +2. A plain-language description of what the parameter is meant to accomplish. To |
| 137 | + matplotlib power users, this is evident from the parameter's name, but for |
| 138 | + new users this need not be the case. (e.g. ``linestyle: a description of |
| 139 | + whether the stroke used to draw each line in the collection is dashed, dotted |
| 140 | + or solid``). |
| 141 | +3. A link to any tutorials that visually depict the possible options (currently |
| 142 | + found only after already clicking through to the ``Line2D.set_linestyle`` |
| 143 | + docs). |
| 144 | + |
| 145 | +In order to make this information available for all pseudotypes, and to keep |
| 146 | +improving the consistency and readability of the docs, we propose |
| 147 | +the following best practices for handling pseudotypes: |
| 148 | + |
| 149 | +0. Pseudotype documentation should be centralized at a dedicated class |
| 150 | + definition. |
| 151 | +1. Functions that accept pseudotype values should link to the appropriate |
| 152 | + pseudotype class docs. |
| 153 | +2. Validation should always happen, but only at the point of usage (i.e. |
| 154 | + immediately before any operation that could raise or produce an error if the |
| 155 | + value is invalid). |
| 156 | +3. If a pseudotype is a "string-as-enum", each possible value should have a |
| 157 | + Sphinx-parseable documentation string. |
| 158 | +4. If applicable, individual classmethods should be written to construct a |
| 159 | + pseudotype from each of various input possibilities, one per possible input |
| 160 | + type. Obviously, ``__init__`` should delegate to these when possible. |
| 161 | + |
| 162 | +In particular, notice that (1) would replace large copies of |
| 163 | +tables of possible linestyles, markerstyles, etc., with links to the complete |
| 164 | +documentation for each. Without all the visual noise from these tables of valid |
| 165 | +options, the relevant functions would be free to visibly link to tutorials where |
| 166 | +these options are visually demonstrated. |
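
Practice (4) above can be sketched as follows: one classmethod per input kind,
with ``__init__`` only deciding which one to call. The class name
``LineStyle``, the method names ``from_name`` and ``from_dashes``, and the dash
lengths in the table are all assumptions made for illustration; they are not
Matplotlib's actual API or default values.

```python
class LineStyle:
    """Hypothetical ``linestyle`` class with one constructor per input kind."""

    # Named styles as (offset, on-off sequence); ``None`` means a solid line.
    # The dash lengths are illustrative, not Matplotlib's actual defaults.
    _NAMED = {
        "solid": (0.0, None),
        "dashed": (0.0, (4.0, 2.0)),
        "dotted": (0.0, (1.0, 2.0)),
    }

    def __init__(self, value):
        # Deduce the input kind, then delegate to the explicit constructor.
        if isinstance(value, LineStyle):
            other = value
        elif isinstance(value, str):
            other = LineStyle.from_name(value)
        else:
            try:
                offset, dashes = value
            except (TypeError, ValueError):
                raise ValueError(f"{value!r} is not a valid linestyle") from None
            other = LineStyle.from_dashes(offset, dashes)
        self.offset, self.dashes = other.offset, other.dashes

    @classmethod
    def from_name(cls, name):
        """Build from a style name such as ``'dashed'``."""
        if name not in cls._NAMED:
            raise ValueError(f"{name!r} is not a valid linestyle name")
        return cls._build(*cls._NAMED[name])

    @classmethod
    def from_dashes(cls, offset, dashes):
        """Build from an offset and an even-length on-off sequence."""
        dashes = tuple(float(d) for d in dashes)
        if not dashes or len(dashes) % 2 or any(d < 0 for d in dashes):
            raise ValueError(f"{dashes!r} is not a valid on-off sequence")
        return cls._build(float(offset), dashes)

    @classmethod
    def _build(cls, offset, dashes):
        # Bypass __init__ so the classmethods do not recurse through it.
        self = cls.__new__(cls)
        self.offset, self.dashes = offset, dashes
        return self
```

Callers that already know their input kind can use the explicit classmethod,
while ``LineStyle('dashed')`` and ``LineStyle((0, (3, 1)))`` continue to work
for code that passes through whatever the user supplied.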
| 167 | + |
| 173 | +Implementation |
| 174 | +============== |
| 175 | + |
| 176 | +This proposal would add one class per pseudotype. For types with complex |
| 177 | +construction requirements, we would produce and use classmethods for explicit |
| 178 | +construction from a known type, but ``__init__`` would continue to hold the |
| 179 | +logic required to deduce how to construct the type from the type of the input. |
| 180 | + |
| 181 | +All functions that accept this pseudotype as a parameter would have their |
| 182 | +docstrings changed to simply use the numpydoc "input type" syntax to link to |
| 183 | +this new class. All functions which *use* this pseudotype (i.e. would raise on |
| 184 | +an invalid input) would construct an explicit object instance using the general |
| 185 | +``__init__``, allowing the new class to handle validation. |
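
The two kinds of function described above might look like the following
sketch. ``Stroke`` is a toy class invented for this example, and the
``CapStyle`` enum is a stand-in for the proposed pseudotype class; neither is
presented as Matplotlib's real implementation. The setter only documents and
stores the value, while ``draw``, which would fail on an invalid value,
constructs the instance and so triggers validation.

```python
from enum import Enum


class CapStyle(Enum):
    """Stand-in for the proposed ``capstyle`` class (hypothetical)."""

    butt = "butt"
    round = "round"
    projecting = "projecting"


class Stroke:
    """Toy artist-like object, not a Matplotlib class."""

    def __init__(self, capstyle="butt"):
        self._capstyle = capstyle

    def set_capstyle(self, cs):
        """
        Set how the ends of the stroke are drawn.

        Parameters
        ----------
        cs : CapStyle or str
            See `CapStyle` for the allowed values.
        """
        # Accepts the pseudotype but does not use it, so no validation here.
        self._capstyle = cs

    def draw(self):
        # Point of use: constructing the instance validates the stored value.
        cs = CapStyle(self._capstyle)
        return f"stroke with {cs.value} caps"
```

One consequence of validating only at the point of use is that an invalid value
is stored silently and the error surfaces later, at draw time; whether that
trade-off is acceptable is part of what this MEP puts up for discussion.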
| 186 | + |
| 187 | +The pseudotypes for which I propose new-style classes are: |
| 188 | + |
| 189 | +1. ``linestyle`` |
| 190 | +2. ``capstyle`` |
| 191 | +3. ``joinstyle`` |
| 192 | +4. ``bounds`` |
| 193 | +5. ``extents`` |
| 195 | + |
| 196 | +Backward compatibility |
| 197 | +====================== |
| 198 | + |
| 199 | +This proposal does not break backward compatibility, since the class's |
| 200 | +constructor will explicitly be designed to take the same values as were |
| 201 | +previously allowed. |
| 202 | + |
| 203 | +Alternatives |
| 204 | +============ |
| 205 | + |
| 206 | +Instead of making new classes, we can comb through each of the pseudotypes |
| 207 | +listed above and choose a single place for the validation to go, documenting |
| 208 | +this for discoverability (for example, the only realistic way to discover that |
| 209 | +``validate_joinstyle`` exists currently is to ``grep`` for ``joinstyle`` and |
| 210 | +find it serendipitously). To fix documentation redundancy, we could use Sphinx's |
| 211 | +powerful linking capability to make sure that each pseudotype is only documented |
| 212 | +once (by the class that "owns"/validates it), with all other documentation |
| 213 | +linking to that location. This pattern would probably require documentation in |
| 214 | +the developer docs. |