Simplify definition of mathtext symbols & correctly end tokens in mathtext parsing #22950
Considering the doc-build failure: maybe one should also add some test in the main test suite for accents of the types |
Ah, good catch, fixed and added test.
I would prefer if we re-gen the test images here. I think the |
Actually this reveals another bug: I deleted the ddots (etc.) test because they are now recognized as relation operators and extra spaces got added around them, but such spaces should actually not be there because the test string is Fixing this bug (which probably involves reusing something like the "Binary operators at start of string should not be spaced" part of the code in |
I went for the easier path of just adding |
Use a single regex that handles both single_symbol (a single character) and symbol_name (`\knowntexsymbolname`), and also slightly simplify the "end-of-symbol-name" regex. This parsing element comes up extremely often, and removing one layer of indirection shaves ~3-4% off drawing all the current mathtext tests, i.e.
```
MPLBACKEND=agg python -c 'import time; from pylab import *; from matplotlib.tests.test_mathtext import math_tests; fig = figure(figsize=(3, 10)); fig.text(0, 0, "\n".join(filter(None, math_tests)), size=6); start = time.perf_counter(); [fig.canvas.draw() for _ in range(10)]; print((time.perf_counter() - start) / 10)'
```
@tacaswell can you re-review to be sure your concerns are met?
@anntzer it looks like you have still removed a bunch of baseline images. Are we sure those are still tested?
Yes I am sure, this is only removing test 77, which checks that "accentprefixed" commands are correctly interpreted (e.g. \doteq is not interpreted as \dot eq), but this is essentially also covered by the |
Is it worth an API change note on the spacing?
Changelog entry added, also added dotminus to the spaced operators as it was clearly missing before.
This avoids parsing `\sinx` as `\sin x` (it now raises an error instead), and removes the need for `accentprefixed` (because `\doteq` is treated as a single token now, instead of `\dot{eq}`). This also means that `\doteq` (and friends) are now correctly treated as relations (per `_relation_symbols`, thus changing the spacing around them); hence the change in baseline images. Adjust test strings accordingly to undo the spacing, to avoid regen'ing baselines. Also shaves ~2% off drawing all the current mathtext tests, i.e.
```
MPLBACKEND=agg python -c 'import time; from pylab import *; from matplotlib.tests.test_mathtext import math_tests; fig = figure(figsize=(3, 10)); fig.text(0, 0, "\n".join(filter(None, math_tests)), size=6); start = time.perf_counter(); [fig.canvas.draw() for _ in range(10)]; print((time.perf_counter() - start) / 10)'
```
(including adjustment for the removed test case), probably because accentprefixed was previously extremely commonly checked, being at the top of the placeable list; however, performance wasn't really the main goal here.
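For readers who find the benchmark one-liner dense, its measurement pattern (start a `time.perf_counter()` clock, repeat the work a fixed number of times, divide by the repeat count) can be written as a small helper. This is only a sketch: `mean_seconds` is a hypothetical name that does not exist in Matplotlib, and the workload is left as an arbitrary callable.

```python
import time


def mean_seconds(work, repeats=10):
    """Return the mean wall-clock time, in seconds, of calling work() repeats times.

    Hypothetical helper for illustration; it mirrors the structure of the
    one-liner (perf_counter before, perf_counter after, divide by the count).
    """
    start = time.perf_counter()
    for _ in range(repeats):
        work()
    return (time.perf_counter() - start) / repeats
```

With a figure set up as in the one-liner, the measured quantity would correspond to `mean_seconds(fig.canvas.draw)`.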
PR Summary
First commit: Simplify definition of mathtext symbols.
Use a single regex that handles both single_symbol (a single character) and symbol_name (`\knowntexsymbolname`), and also slightly simplify the "end-of-symbol-name" regex. This parsing element comes up extremely often, and removing one layer of indirection shaves ~3-4% off drawing all the current mathtext tests.
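The idea of one pattern covering both kinds of symbol can be sketched with the standard `re` module. This is an illustration under simplifying assumptions, not the regex used in Matplotlib: `KNOWN_NAMES`, `SYMBOL`, and `match_symbol` are hypothetical names, and the character class for single characters is deliberately crude.

```python
import re

# Hypothetical stand-in for the table of known TeX symbol names.
KNOWN_NAMES = {"alpha", "beta", "doteq", "sin"}

# One pattern with two alternatives, tried in order:
#   1. a backslash followed by a run of letters (a candidate symbol name);
#   2. any single character that is neither a backslash nor whitespace
#      (a simplifying assumption for this sketch).
SYMBOL = re.compile(r"\\(?P<name>[A-Za-z]+)|(?P<char>[^\\\s])")


def match_symbol(text, pos=0):
    """Return (kind, value, end) for the symbol starting at pos, or None."""
    m = SYMBOL.match(text, pos)
    if m is None:
        return None
    name = m.group("name")
    if name is not None:
        # Reject names that are not in the known-symbol table.
        if name not in KNOWN_NAMES:
            return None
        return ("name", name, m.end())
    return ("char", m.group("char"), m.end())
```

A single compiled pattern means the caller makes one match attempt per position instead of trying two separate parser elements in turn, which is the kind of indirection the commit message describes removing.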
Second commit: Correctly end tokens in mathtext parsing.
This avoids parsing `\sinx` as `\sin x` (it now raises an error instead), and removes the need for `accentprefixed` (because `\doteq` is treated as a single token now, instead of `\dot{eq}`). This also means that `\doteq` (and friends) are now correctly treated as relations (per `_relation_symbols`, thus changing the spacing around them); hence the change in baseline images. Only keep the `x \doteq y` baseline (and adjust the test string to undo the spacing), to avoid regen'ing baselines.
Also shaves ~2% off drawing all the current mathtext tests (including adjustment for the two removed test cases), probably because accentprefixed was previously extremely commonly checked, being at the top of the placeable list; however, performance wasn't really the main goal here.
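The token-ending behaviour described above can be illustrated with a negative lookahead in plain `re`. This is a minimal sketch under assumptions, not the parser's actual grammar: `KNOWN_NAMES`, `LOOSE`, and `STRICT` are hypothetical names, and the symbol table is reduced to three entries.

```python
import re

# Hypothetical, tiny symbol table; note that "dot" is a prefix of "doteq".
KNOWN_NAMES = {"dot", "doteq", "sin"}

# Longest names first, so the loose pattern prefers "\doteq" over "\dot".
names = "|".join(sorted(map(re.escape, KNOWN_NAMES), key=len, reverse=True))

# LOOSE accepts a known name even when more letters follow it.
LOOSE = re.compile(rf"\\(?:{names})")

# STRICT adds a negative lookahead: the token may only end where the run
# of letters ends, so a known name cannot be split off a longer word.
STRICT = re.compile(rf"\\(?:{names})(?![A-Za-z])")

# Without the lookahead, "\sinx" is silently read as "\sin" plus "x".
loose_sinx = LOOSE.match(r"\sinx").group()

# With the lookahead there is no match at all; a parser would report an error.
strict_sinx = STRICT.match(r"\sinx")

# "\doteq" is consumed as one token rather than "\dot" followed by "eq".
strict_doteq = STRICT.match(r"\doteq y").group()
```

Because the lookahead consumes no input, the character after the token (a space, a brace, a digit) is still available to whatever parses the next element.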
PR Checklist
Tests and Styling
- `pytest` passes
- `flake8-docstrings` (run `flake8 --docstring-convention=all`)
Documentation
- `doc/users/next_whats_new/` (follow instructions in README.rst there)
- `doc/api/next_api_changes/` (follow instructions in README.rst there)