[Bug]: Pyparsing 3.1 breaks tests #26152
Comments
Possibly related to pyparsing/pyparsing#474 Have not proven it, but that has to do with handling escape sequences and the errors appear to all be related to I'll also note that pyparsing handling of |
Okay, I have bisected/tested that this change is what is causing our test failures. I will note that it is only the error messages that change (and so anything that does not raise will still work). Essentially, we were relying on the "pass through" of error messages to provide actually useful error messages (and testing that we got them). @ptmcg, perhaps you can advise us on how to respond? It is unclear to me (at least quickly) if there is a good way to get the errors from further down when we want them. |
I am not able to readily set up my own development environment for running these tests. Can someone please post (or link to a pastebin) a full test output, including expected and observed values?
I think I was striving for improving the error messages as part of this effort. |
These are the failing tests: matplotlib/lib/matplotlib/tests/test_mathtext.py Lines 279 to 346 in 060992a
Here is the full traceback of one of them:
All the others look very similar. |
That particular exception we raise here: matplotlib/lib/matplotlib/_mathtext.py Line 2294 in 060992a
|
Essentially we had been getting informative error messages and now just get "Expected token (at char 0)" for any failure to parse, which is not informative. Reverting the try/except linked above fixes our tests |
I'm still trying to set up a development environment of my own following the matplotlib dev guide, but now getting TypeErrors with FT2Font something something. What happens if you stop calling
    def set_names_and_parse_actions():
        for key, val in vars(p).items():
            if not key.startswith('_'):
                # Set names on everything -- very useful for debugging
                if key != "token":  # <-- don't obscure token contents with unhelpful "token" name
                    val.setName(key)
                # Set actions
                if hasattr(self, key):
                    val.setParseAction(getattr(self, key))
Or possibly change the if statement to any attribute whose value is a pyparsing Literal or Keyword, which would probably give clearer-looking exceptions than using your internal variable name:
    if not isinstance(val.expr, (Literal, Keyword)):
|
tl;dr - This change was in staging and pre-releases for over a year before being released, and I announced (on Twitter and comp.lang.python.announce) the coming 3.1.0 release of pyparsing in alpha and beta releases in March, April, and May of this year.
In general, I feel that this particular change improves logging for expressions that contain other expressions when the container expression has been assigned a custom name, so I do not plan on reverting this change.
It saddens me that there is this post-release surprise issue with a pyparsing user as significant as matplotlib. As a workaround, you can see what the effects are by not calling I also have tried to spin up my own development environment of matplotlib, so that I could run your tests against my in-development pyparsing, but with no success (both Windows and Ubuntu). I welcome any assistance in doing this, so that I can run matplotlib regression tests for myself before releasing new versions of pyparsing.
Some possible changes I could make in a future pyparsing release:
I have pressing deadlines at work for the next month, so this future release will not be for several weeks yet. Also, I would like to give 3.1.0 some more time in the field, to get any other fallout from other users. @ksunden - I tested this version of pyparsing with Python 3.12, so I believe I've addressed any issues with |
The guess at I will focus discussion in the interim on one particular path, as I suspect that once we find a path for one, the rest will fall into line with relatively trivial modifications. So let's focus on Unknown symbol is Its parsing function always raises:
    def unknown_symbol(self, s, loc, toks):
        raise ParseFatalException(s, loc, f"Unknown symbol: {toks['name']}")
As such, our expected behavior is that when such a symbol is parsed, we should get an error message that points directly to the unknown symbol and says A simple test case, extracted from the pytest markup:
    import matplotlib.mathtext
    parser = matplotlib.mathtext.MathTextParser("agg")
    parser.parse(r"$\sinx$")
This should raise a ValueError:
    \sinx
    ^
    ParseFatalException: Unknown symbol: \sinx, found '\' (at char 0), (line:1, col:1)
With pyparsing 3.1.0 and no modifications to matplotlib this gives:
    ValueError:
    \sinx
    ^
    ParseFatalException: Expected token, found '\' (at char 0), (line:1, col:1)
Which does not include the key phrase we are looking for in our tests, and is generally not a helpful error message in our usage. Excluding
    ValueError:
    \sinx
    ^
    ParseFatalException: Forward: {{simple | auto_delim} | unknown_symbol}, found '\' (at char 0), (line:1, col:1)
That is the internal definition of If I modify pyparsing a little more minimally such that instead of:
    try:
        return self.expr._parse(instring, loc, doActions, callPreParse=False)
    except ParseBaseException as pbe:
        pbe.msg = self.errmsg
        raise
I do (i.e. do not reset the error message for Forward instances):
    try:
        return self.expr._parse(instring, loc, doActions, callPreParse=False)
    except ParseBaseException as pbe:
        if not isinstance(self, Forward):
            pbe.msg = self.errmsg
        raise
Then the matplotlib tests pass (even without the modification so that The "depth" argument for "explain" does not seem to affect the output of these messages (well, other than adding a trailing line of
RE your dev setup for matplotlib, happy to help if I can, but not sure what could be going wrong.
RE preventing surprises post-release in the future, we do have a test setup which pulls nightly builds of numpy/pandas (from here). We could probably modify that to get either at least |
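The masking effect discussed in the comment above can be reproduced without pyparsing. The sketch below is a toy model only, not pyparsing's implementation: `ToyParseError`, `Leaf`, `Wrapper`, and the `passthrough` flag are hypothetical names invented for illustration. It shows how a container that unconditionally overwrites the message of an exception passing through it hides the inner message, and how exempting the container lets the inner message survive.

```python
class ToyParseError(Exception):
    """Hypothetical stand-in for a parser exception with a mutable message."""

    def __init__(self, msg):
        super().__init__(msg)
        self.msg = msg


class Leaf:
    """Innermost element: always fails with a specific, useful message."""

    def parse(self, text):
        raise ToyParseError(f"Unknown symbol: {text}")


class Wrapper:
    """Container that relabels errors raised by the element it wraps."""

    def __init__(self, expr, name, passthrough=False):
        self.expr = expr
        self.errmsg = f"Expected {name}"
        # Assumption for illustration: a per-wrapper opt-out flag.
        self.passthrough = passthrough

    def parse(self, text):
        try:
            return self.expr.parse(text)
        except ToyParseError as err:
            if not self.passthrough:
                err.msg = self.errmsg  # the inner message is lost here
            raise


def error_message(element, text):
    """Return the message of the error raised while parsing, or None."""
    try:
        element.parse(text)
    except ToyParseError as err:
        return err.msg
    return None


overwriting = error_message(Wrapper(Leaf(), "token"), r"\sinx")
passing = error_message(Wrapper(Leaf(), "token", passthrough=True), r"\sinx")
print(overwriting)
print(passing)
```

In this model the first call reports only the wrapper's generic message, while the second keeps the leaf's "Unknown symbol" text, which mirrors the difference the matplotlib tests are sensitive to.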
Also, we newly have github codespaces set up to get a dev setup with VSCode through the browser, while we are still working on some of the docs for using it, the config files are there, you may be able to get something good enough to run some quick tests on through that. |
Perhaps the other heuristic which may be more in line with what you were thinking would happen by not calling
    try:
        return self.expr._parse(instring, loc, doActions, callPreParse=False)
    except ParseBaseException as pbe:
        if self.customName is not None:
            pbe.msg = self.errmsg
        raise
Which would prevent autogen names from showing (unless they are the leaf). |
Your code fragment also leads me to another thought, since the issue seems not to be so much about set_name (as I originally thought), but about the custom message in your raised exception. What if you raised your ParseFatalExceptions as a subclass of ParseFatalException, and I only do the substitution if the exception is of a type known to pyparsing (i.e., ParseBaseException, ParseException, ParseFatalException, or ParseSyntaxException). I've seen this exception style (don't raise library exceptions, raise your own subclass of them) recommended in other sources. |
Something like this:
    try:
        return self.expr._parse(instring, loc, doActions, callPreParse=False)
    except ParseBaseException as pbe:
        # only update the exception message if it is one of the built-in types (not any user-defined subclass)
        if type(pbe) in (ParseBaseException, ParseException, ParseFatalException, ParseSyntaxException):
            pbe.msg = self.errmsg
        raise
|
This would then be a documented mechanism for client apps to emit their own special error messages, by subclassing from one of the pyparsing exception types. (Sorry, that took me a few tries to get right. Explicit testing for types is just not natural for me!) |
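The exact-type test proposed above behaves differently from the more common `isinstance` check, which is the whole point of the idea. The following is a minimal sketch with stand-in classes: `LibraryError`, `LibraryFatalError`, `ClientFatalError`, and `relabel` are hypothetical names, not pyparsing or matplotlib identifiers. It shows that a client-defined subclass keeps its own message while remaining catchable as the library type.

```python
class LibraryError(Exception):
    """Stand-in for a library's base parse exception."""

    def __init__(self, msg):
        super().__init__(msg)
        self.msg = msg


class LibraryFatalError(LibraryError):
    """Stand-in for a built-in fatal exception type."""


class ClientFatalError(LibraryFatalError):
    """Defined by the client application, so the library leaves its message alone."""


# Only these exact types are relabeled; subclasses defined elsewhere are not.
BUILT_IN_TYPES = (LibraryError, LibraryFatalError)


def relabel(exc, errmsg):
    """Overwrite the message only when exc is exactly a built-in type."""
    if type(exc) in BUILT_IN_TYPES:
        exc.msg = errmsg
    return exc


built_in = relabel(LibraryFatalError(r"Unknown symbol: \sinx"), "Expected token")
custom = relabel(ClientFatalError(r"Unknown symbol: \sinx"), "Expected token")

print(built_in.msg)
print(custom.msg)
# The subclass is still caught by handlers written for the library type.
print(isinstance(custom, LibraryFatalError))
```

An `isinstance(exc, BUILT_IN_TYPES)` check would relabel both exceptions, so the exact `type(...) in` comparison is what makes subclassing work as an opt-out.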
Unfortunately that only solves 7/26 of our test cases. (The particular example of Unknown Symbol is one of the ones fixed.) All of the ones that fail in parsing out parameters to LaTeX functions (e.g. What about setting Then it becomes on our side:
    diff --git a/lib/matplotlib/_mathtext.py b/lib/matplotlib/_mathtext.py
    index 811702f1cd..730a82e45d 100644
    --- a/lib/matplotlib/_mathtext.py
    +++ b/lib/matplotlib/_mathtext.py
    @@ -1819,6 +1819,8 @@ class Parser:
                     if not key.startswith('_'):
                         # Set names on everything -- very useful for debugging
                         val.setName(key)
    +                    if isinstance(val, Forward):
    +                        val.errmsg = None
                         # Set actions
                         if hasattr(self, key):
                             val.setParseAction(getattr(self, key))
And on pyparsing's side:
    diff --git a/pyparsing/core.py b/pyparsing/core.py
    index 8233f72..e83dcda 100644
    --- a/pyparsing/core.py
    +++ b/pyparsing/core.py
    @@ -4547,7 +4547,8 @@ class ParseElementEnhance(ParserElement):
                 try:
                     return self.expr._parse(instring, loc, doActions, callPreParse=False)
                 except ParseBaseException as pbe:
    -                pbe.msg = self.errmsg
    +                if self.errmsg is not None:
    +                    pbe.msg = self.errmsg
                     raise
             else:
                 raise ParseException(instring, loc, "No expression defined", self)
(plus docs/type hints/tests/possibly other handling around This gives us: a) opt in to "pass through" Also open to other sentinels (including a new variable, perhaps?), that was just one that occurred to me. |
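The sentinel idea in the two diffs above can be sketched in isolation. This is a toy model under stated assumptions, not the actual pyparsing patch: `ToyParseError`, `failing_leaf`, `wrap`, and `message_from` are hypothetical helpers. A wrapper whose `errmsg` is `None` leaves the passing exception untouched, and any other value overwrites the message as before.

```python
class ToyParseError(Exception):
    """Hypothetical stand-in for a parser exception with a mutable message."""

    def __init__(self, msg):
        super().__init__(msg)
        self.msg = msg


def failing_leaf(text):
    """Innermost parse step: always fails with a specific message."""
    raise ToyParseError(f"Unknown symbol: {text}")


def wrap(parse_fn, errmsg):
    """Return a parse function that relabels errors unless errmsg is None."""

    def parse(text):
        try:
            return parse_fn(text)
        except ToyParseError as err:
            if errmsg is not None:  # None is the opt-in "pass through" sentinel
                err.msg = errmsg
            raise

    return parse


def message_from(parse_fn, text):
    """Return the message of the error raised by parse_fn, or None."""
    try:
        parse_fn(text)
    except ToyParseError as err:
        return err.msg
    return None


default_msg = message_from(wrap(failing_leaf, "Expected token"), r"\sinx")
opted_out = message_from(wrap(wrap(failing_leaf, None), None), r"\sinx")
mixed = message_from(wrap(wrap(failing_leaf, None), "Expected main"), r"\sinx")

print(default_msg)
print(opted_out)
print(mixed)
```

The `mixed` case shows a limitation of the model: the leaf's message survives only if every wrapper on the path opts out, since a single outer wrapper with a message still overwrites it.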
I guess that diff begs the question of "why set names if you just set errmsg to be empty?" And I think I'm personally coming back to arguing that perhaps
I think we would then just need to find the subset of our |
Thanks for the extended thought process on this. When I get a chance I'll revisit that dev guide to see which steps I missed. |
Matplotlib currently forbids our version of pyparsing (3.1.0). The issue only affects error messages. This update hacks out the pyparsing < 3.1.0 requirement so that matplotlib dependents can at least function, though some error messages won't pass through properly. matplotlib/matplotlib#26152
This is marked 3.7.2 RC. Is there a path forward, and if so, from whose side does this need to be fixed? |
I have pyparsing/pyparsing#493 open. Once that is merged I'll adjust the pinning to either |
I was working with 3.1.0 today on a new parser for work, and the exception messages are really pretty bad! I'll definitely get on this over the weekend. |
Merged your PR (and a few others, and addressing some other reported issues). Thanks for your work on this. |
matplotlib requires < 3.1 matplotlib/matplotlib#26152
PR merged and released in pyparsing 3.1.1 - let me know if there are further issues |
It appears that this is still broken, unfortunately :( I ran tests with pyparsing 3.1.1 on Gentoo through Portage, and got the following failure log: |
FYI - I've updated pyparsing's test suite to include tests from matplotlib's test suite. Just trying to stay ahead of incidents like this. |
Bug summary
Pyparsing 3.1.0 was released today and something has changed which causes quite a few test failures.
List of changes https://github.com/pyparsing/pyparsing/releases/tag/3.1.0 (not obvious to me what caused it).
Code for reproduction
Actual outcome
Expected outcome
No failures
Additional information
I guess a two-stage fix:
<3.1
Operating system
No response
Matplotlib Version
main-branch
Matplotlib Backend
No response
Python version
No response
Jupyter version
No response
Installation
None