match_previous_expr does not handle nested expressions #560
Comments
I've tried some simpler expression grammars and even this doesn't work for me:
import pyparsing as pp
LBRACE, RBRACE = map(pp.Suppress, "()")
NUMBER = pp.Word(pp.nums, pp.nums + ".")
MATH_TEXT = pp.Word("+-*/ ")
expression = pp.Forward()
GROUP = LBRACE + expression + RBRACE
expression = pp.OneOrMore(GROUP | NUMBER | MATH_TEXT)
print(expression.parse_string("5*(2+3)", parse_all=True).dump(" "))
This fails with an error:
@ptmcg Right, thanks for pointing that out! Basic nesting does work all right then.
Pyparsing supports some warnings that help catch mistakes like that. You have to enable them with the -W switch or in code; see the notes here: https://github.com/pyparsing/pyparsing/wiki/Parser-Debugging-and-Diagnostics#diagnostic-switches
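For readers following along, here is a minimal sketch of the corrected arithmetic grammar from the earlier comment. It assumes the problem was the plain `=` that rebinds `expression` instead of filling in the `Forward`, and that `warn_on_assignment_to_Forward` is the diagnostic switch relevant to that mistake; the `result` variable is only for illustration.

```python
import pyparsing as pp

# Enable the diagnostic before building the grammar (assumption: this is
# the switch that flags a Forward overwritten with "=").
pp.enable_diag(pp.Diagnostics.warn_on_assignment_to_Forward)

LBRACE, RBRACE = map(pp.Suppress, "()")
NUMBER = pp.Word(pp.nums, pp.nums + ".")
MATH_TEXT = pp.Word("+-*/ ")

expression = pp.Forward()
GROUP = LBRACE + expression + RBRACE
# "<<=" fills in the existing Forward. A plain "=" would rebind the name
# and leave GROUP pointing at a Forward that never gets a definition.
expression <<= pp.OneOrMore(GROUP | NUMBER | MATH_TEXT)

result = expression.parse_string("5*(2+3)", parse_all=True)
print(result.as_list())
```

With the `<<=` in place, the parenthesized group is parsed recursively and the suppressed parentheses drop out of the results.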
Pyparsing has a helper method called
Here is your parser with a nested-capable match_previous_expr() method. I also converted your test code to use
import pyparsing as pp
def match_previous_expr_nested(expr: pp.ParserElement) -> pp.ParserElement:
    rep = pp.Forward()
    e2 = expr.copy()
    rep <<= e2
    rep.match_stack = []

    def copy_token_to_repeater(s, l, t):
        rep.match_stack.append(pp.helpers._flatten(t.as_list()))

    def must_match_these_tokens(s, l, t):
        these_tokens = pp.helpers._flatten(t.as_list())
        match_tokens = rep.match_stack[-1]
        if these_tokens != match_tokens:
            if these_tokens in rep.match_stack:
                rep.match_stack.pop()
            if len(match_tokens) == 1:
                error_msg = f"Expected {str(match_tokens[0])!r}"
            else:
                error_msg = f"Expected {match_tokens}"
            raise pp.ParseException(s, l, error_msg)
        rep.match_stack.pop()

    rep.set_parse_action(must_match_these_tokens, callDuringTry=True)
    expr.add_parse_action(copy_token_to_repeater, callDuringTry=True)
    rep.set_name("(prev) " + str(expr))
    return rep

pp.ParserElement.set_default_whitespace_chars("")

LBRACE, RBRACE, SLASH = map(pp.Suppress, "{}/")
IDENTIFIER_CHARS = pp.alphanums + "_"
TAG_NAME = pp.Word(IDENTIFIER_CHARS)

# Tags
OPEN_TAG = LBRACE + (OPEN_TAG_NAME := TAG_NAME("open_tag")) + RBRACE
# CLOSE_TAG = LBRACE + SLASH + TAG_NAME("close_tag") + RBRACE
CLOSE_TAG = LBRACE + SLASH + match_previous_expr_nested(OPEN_TAG_NAME) + RBRACE

# Forward declaring content due to its recursivity
content = pp.Forward()

# Main elements
TAGGED_CONTENT = pp.Group(OPEN_TAG + pp.Group(content) + CLOSE_TAG)("tagged_content*")
PLAIN_TEXT = pp.Group(pp.CharsNotIn("{}"))("plain_text*")
STANDALONE_TAG = pp.Group(LBRACE + TAG_NAME + RBRACE)("standalone_tag*")

# Recursive definition of content
content <<= pp.ZeroOrMore(TAGGED_CONTENT | STANDALONE_TAG | PLAIN_TEXT)

# pp.autoname_elements()
# OPEN_TAG.set_debug()
# CLOSE_TAG.set_debug()
# TAGGED_CONTENT.set_debug()
# PLAIN_TEXT.set_debug()
# STANDALONE_TAG.set_debug()
# content.create_diagram(f"{__file__.removesuffix('.py')}.html")

content.run_tests(
    """\
    {tag}Tagged content{/tag} plain text {standalone}{tag}Tagged again!{/tag}
    {tag}Tagged content{/tag} plain text {standalone}{tag}Tagged again!{/tag}{/standalone}
    {t1}aaa{t2}bbb{/t2}{/t1}
    {t1}aaa{t2}bbb{/t1}
    {t1}aaa{t2}{/t2}bbb{/t1}
    {t1}{t2}{/t1}
    # error case - mismatched closing tag
    {t1}{t2}{/not_t1}
    """,
    full_dump=False,
)
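The core idea in match_previous_expr_nested above is a stack: each opening match is pushed, and each closing match is compared against the top entry and then popped. The sketch below shows only that discipline in plain Python, without pyparsing. The `first_mismatch` helper and its input format (a list of tag names, with a leading "/" marking a closing tag) are hypothetical, and it deliberately ignores standalone tags and backtracking, which the parser above has to cope with.

```python
from typing import List, Optional


def first_mismatch(tags: List[str]) -> Optional[str]:
    """Return a message for the first mismatched closing tag, or None."""
    stack: List[str] = []
    for tag in tags:
        if not tag.startswith("/"):
            # Opening tag: remember it so a later closing tag can be checked.
            stack.append(tag)
            continue
        name = tag[1:]
        if not stack or stack[-1] != name:
            expected = stack[-1] if stack else None
            return f"Expected {expected!r}, found {name!r}"
        # Closing tag matches the innermost open tag: discard that entry.
        stack.pop()
    return None


nested_ok = first_mismatch(["t1", "t2", "/t2", "/t1"])
nested_bad = first_mismatch(["t1", "t2", "/t1"])
print(nested_ok)
print(nested_bad)
```

A single remembered value, rather than a stack, would lose track of the outer tag as soon as an inner tag opens, which is why nested input needs the stack.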
@ptmcg Hello! Sorry for the long delay, I was working on other tasks. By the way, thanks for your work and attention to the project; pyparsing rocks, and the prototyping speed is very nice indeed :) Now to the matter: yes, this looks nice and seems to work as intended. I even managed to make a shorthand tag notation work (e.g.
LBRACE, RBRACE, SLASH = map(pp.Suppress, "{}/")
IDENTIFIER_CHARS = pp.alphanums + "_"
TAG_NAME_START = pp.Char(IDENTIFIER_CHARS)
TAG_NAME = pp.Word(IDENTIFIER_CHARS)
# Tags
OPEN_TAG = (
    LBRACE
    + pp.Group(pp.original_text_for((OPEN_TAG_NAME_START := TAG_NAME_START) + pp.Opt(TAG_NAME)))(
        "open_tag"
    )
    + RBRACE
)
CLOSE_TAG = (
    LBRACE
    + SLASH
    + pp.Group(
        pp.original_text_for(match_previous_expr_nested(OPEN_TAG_NAME_START) + pp.Opt(TAG_NAME))
    )("close_tag")
    + RBRACE
)
# Forward declaring content due to its recursivity
content = pp.Forward()
See this SO post, should be able to use match_previous_expr to enforce matching of opening and closing tags. Does not work for nested tags.
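For context, a minimal sketch of the flat (non-nested) case, where the stock match_previous_expr helper is enough. The grammar names (`open_name`, `element`, and so on) and the sample inputs are illustrative assumptions, not taken from the SO post.

```python
import pyparsing as pp

LBRACE, RBRACE, SLASH = map(pp.Suppress, "{}/")
tag_name = pp.Word(pp.alphanums + "_")

# Keep a handle on the exact expression used in the opening tag, so that
# match_previous_expr can attach its bookkeeping parse action to it.
open_name = tag_name("open_tag")
open_tag = LBRACE + open_name + RBRACE
close_tag = LBRACE + SLASH + pp.match_previous_expr(open_name) + RBRACE

body = pp.CharsNotIn("{}")
element = open_tag + body + close_tag

ok = element.parse_string("{b}bold{/b}", parse_all=True)
print(ok.as_list())

try:
    element.parse_string("{b}bold{/i}", parse_all=True)
except pp.ParseException as err:
    mismatch = str(err)
else:
    mismatch = None
print(mismatch)
```

This covers a single open/close pair only; once tags nest, the closing expression needs to remember more than the most recent opening tag.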