Adding a setResultsName() breaks grammar? #95

Closed
mcondarelli opened this issue May 24, 2019 · 5 comments

@mcondarelli

I'm still experimenting with grammars and I'm quite baffled.

The following example is lifted from examples/fourFn.py almost verbatim.
I deleted the "executing" parseActions and (hopefully just cosmetically) changed the grammar to better understand some constructs.

Next steps should be:

  • tag specific nodes with names
  • slightly restructure grammar to "unflatten" repeated operators

Even the very first change (adding a single ResultsName) breaks the grammar and I'm unable to understand what I did so wrong :(

import math

from pyparsing import *


# ORIGINAL GRAMMAR
e     = CaselessKeyword( "E" )('constant')
pi    = CaselessKeyword( "PI" )('constant')
fnumber = Regex(r"[+-]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")('fliteral')
ident = Word(alphas, alphanums+"_$")('ident')

plus = Literal("+")
minus = Literal("-")
mult = Literal("*")
div = Literal("/")
lpar = Suppress("(")
rpar = Suppress(")")
addop = plus | minus
multop = mult | div
expop = Literal("^")

expr = Forward()('operation')
atom = (ZeroOrMore(minus('uminus')) + ( pi | e | fnumber | ident + lpar + expr + rpar | ident | lpar + expr + rpar ))
factor = Forward()
factor << atom + ZeroOrMore(expop + factor)
term = factor + ZeroOrMore(multop + factor)
expr << term + ZeroOrMore(addop + term)

bnf0 = expr

# MODIFIED GRAMMAR
e     = CaselessKeyword( "E" )('constant')
pi    = CaselessKeyword( "PI" )('constant')
fnumber = Regex(r"[+-]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")('fliteral')
ident = Word(alphas, alphanums+"_$")('ident')

plus = Literal("+")
minus = Literal("-")
mult = Literal("*")
div = Literal("/")
lpar = Suppress("(")
rpar = Suppress(")")
addop = plus | minus
multop = mult | div
expop = Literal("^")

expr = Forward()('operation')              # This is the only difference -----vvvvvvv... for now!
atom = (ZeroOrMore(minus('uminus')) + ( pi | e | fnumber | ident + lpar + expr('arg') + rpar | ident | lpar + expr + rpar ))
factor = Forward()
factor << atom + ZeroOrMore(expop + factor)
term = factor + ZeroOrMore(multop + factor)
expr << term + ZeroOrMore(addop + term)

bnf1 = expr


def test( s, expVal ):
    print('==================')
    print(s)
    try:
        results = bnf0.parseString( s, parseAll=True )
    except ParseException as pe:
        print(s, "failed parse:", str(pe))
    except Exception as e:
        print(s, "failed eval:", str(e))
    else:
        print('------------------')
        print(results.dump())
    try:
        results = bnf1.parseString(s, parseAll=True)
    except ParseException as pe:
        print(s, "failed parse:", str(pe))
    except Exception as e:
        print(s, "failed eval:", str(e))
    else:
        print('------------------')
        print(results.dump())
    print('==================')


test( "round(PI^2)", round(math.pi**2) )
@ptmcg
Member

ptmcg commented May 25, 2019

There are actually 2 changes here: on the prior line you also added the 'operation' results name. But this change is not enough to break the grammar. What does break things is the line you indicated. This is because setResultsName is not just a simple mutator. It actually returns a copy of the given expression. This is so that a single basic expression can be used multiple times in a grammar with different attached results names. But if you make a change to the original expression after the copy was made, then the copy won't get that change.

In your case, the original was a Forward. When you created a copy by setting the results name to 'operation', that was not a breaking change for the grammar, because it was the copy that you assigned to expr. But then later when you used expr('arg'), this made another copy. Finally, when you assigned the contents to expr using expr << term + ZeroOrMore(addop + term), this updated the original expr, but not the copy with the results name of 'arg'.
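
The copy-then-assign effect can be reproduced without pyparsing. The sketch below is a toy model only, not pyparsing's implementation: `Placeholder`, `named`, `assign`, and `matches` are hypothetical names standing in for Forward, setResultsName, `<<`, and parsing, and "matching" is reduced to a string comparison.

```python
import copy


class Placeholder:
    """Toy stand-in for a forward-declared expression (NOT pyparsing's Forward)."""

    def __init__(self):
        self.expr = None   # filled in later by assign()
        self.name = None   # results name, if any

    def named(self, name):
        # Like setResultsName: return a shallow *copy* that carries the name.
        dup = copy.copy(self)
        dup.name = name
        return dup

    def assign(self, expr):
        # Like `<<`: only this instance is updated, never its earlier copies.
        self.expr = expr
        return self

    def matches(self, text):
        # Simplified "parse": succeed only if an expression was assigned.
        return self.expr is not None and text == self.expr


expr = Placeholder()
arg = expr.named("arg")   # copy taken while expr is still empty
expr.assign("42")         # the copy never sees this assignment

print(expr.matches("42"))  # True
print(arg.matches("42"))   # False
```

In this model the order of operations is the whole story: calling `named` after `assign` would give the copy the assigned expression as well.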

When you want to tag specific nodes and operators with names, it sounds like you want to build an AST. If so, I strongly encourage you to switch gears and look at using infixNotation instead. You can attach your AST node classes to operands using addParseAction, and attach AST classes to operations by referencing them as the fourth argument to each infixNotation operator precedence tuple.

It may be worth adding a "diagnostic" mode to pyparsing that will emit warnings during grammar construction, such as calling setResultsName on a Forward that has not yet had an expression assigned to it - not guaranteed to cause a problem, but a definite warning flag.

@mcondarelli
Author

Thanks for your answer.

Can you please explain (or point me to an explanation of) why all this copying is necessary and the rationale behind it?
I have been bitten twice already by this and it's definitely something I'm not grokking.
The only thing I seem to understand is that you need different ParserElement instances if and when you need to "specialize" a defined one when matched in a specific place.

In that case I think it would be useful to keep track of copies somehow and, especially in the case of Forward, propagate changes, notably those made by the << operator. The latter could be done with the following trivial patch, which "cures" the error in my grammar:

diff --git a/pyparsing.py b/pyparsing.py
index 0fc3917..585a378 100644
--- a/pyparsing.py
+++ b/pyparsing.py
@@ -4745,6 +4745,7 @@ class Forward(ParseElementEnhance):
     """
     def __init__( self, other=None ):
         super(Forward,self).__init__( other, savelist=False )
+        self.copies = []
 
     def __lshift__( self, other ):
         if isinstance( other, basestring ):
@@ -4757,6 +4758,8 @@ class Forward(ParseElementEnhance):
         self.skipWhitespace = self.expr.skipWhitespace
         self.saveAsList = self.expr.saveAsList
         self.ignoreExprs.extend(self.expr.ignoreExprs)
+        for c in self.copies:
+            c << other
         return self
 
     def __ilshift__(self, other):
@@ -4799,11 +4802,13 @@ class Forward(ParseElementEnhance):
 
     def copy(self):
         if self.expr is not None:
-            return super(Forward,self).copy()
+            ret = super(Forward,self).copy()    # todo: check if this is really necessary
+            ret.copies = []                     # todo: double assignments could be flagged as errors
         else:
             ret = Forward()
             ret <<= self
-            return ret
+        self.copies.append(ret)
+        return ret
 
 class TokenConverter(ParseElementEnhance):
     """

About the "diagnostic mode": is it something that already exists? I just found some reference in pyparsing.ParseException.explain(), but it's unclear if this is what you are referring to.
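
The copy-tracking idea in the patch above can be sketched in isolation. This is a toy model under stated assumptions, not the patched pyparsing code: `TrackedPlaceholder`, `named`, and `assign` are hypothetical names, and the "expression" is just a string.

```python
import copy


class TrackedPlaceholder:
    """Toy model of a placeholder that remembers its copies (hypothetical class)."""

    def __init__(self):
        self.expr = None
        self.name = None
        self.copies = []   # copies that must receive later assignments

    def named(self, name):
        dup = copy.copy(self)
        dup.name = name
        dup.copies = []            # shallow copy shares the list, so give the copy its own
        self.copies.append(dup)    # remember the copy for later propagation
        return dup

    def assign(self, expr):
        self.expr = expr
        for dup in self.copies:    # push the assignment down to every tracked copy
            dup.assign(expr)
        return self


expr = TrackedPlaceholder()
arg = expr.named("arg")
nested = arg.named("nested")   # a copy of a copy is tracked by its direct parent
expr.assign("42")

print(arg.expr, nested.expr)   # 42 42
```

One design consequence of this sketch is that propagation only flows downward: assigning to `arg` would update `nested` but leave `expr` untouched.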

@ptmcg
Member

ptmcg commented May 27, 2019

No, the diagnostic mode does not yet exist. If you would open an issue for it, I'll use that to start attaching notes on what would make for good warnings. (Note: I am thinking this is specifically for diagnostics while creating the parser, not while running it. That might change if we come up with some good parse-time warnings, but for parse time we already have ParseExceptions.)

The copying is done so that an expression can be used multiple times in the same parser with different results names. Each instance with a different name is implemented with a copy of the given expression. I've answered this question before in more detail; I'll try to track down that discussion thread.

@ptmcg
Member

ptmcg commented Jul 9, 2019

The next release will include a __diag__ namespace for enabling various diagnostic and debugging warnings at parser definition time. Setting __diag__.warn_name_set_on_empty_Forward will generate a user warning when a results name is set on a Forward that has no expression defined for it yet.
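
As a rough illustration of what an opt-in, definition-time warning of this kind looks like, here is a minimal sketch using only the standard `warnings` module. It does not use or reproduce pyparsing's `__diag__` code: the `Diagnostics` class, its `warn_name_set_on_empty` switch, and `Placeholder` are hypothetical names invented for the example.

```python
import warnings


class Diagnostics:
    # Hypothetical switch, off by default so existing code stays silent.
    warn_name_set_on_empty = False


class Placeholder:
    """Toy forward-declared expression (NOT pyparsing's Forward)."""

    def __init__(self):
        self.expr = None
        self.name = None

    def named(self, name):
        # Definition-time check: naming an empty placeholder is suspicious,
        # though not guaranteed to be a bug, so warn instead of raising.
        if Diagnostics.warn_name_set_on_empty and self.expr is None:
            warnings.warn(
                f"results name {name!r} set on a placeholder with no expression",
                UserWarning,
                stacklevel=2,
            )
        dup = Placeholder()
        dup.expr = self.expr
        dup.name = name
        return dup


Diagnostics.warn_name_set_on_empty = True
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # make sure the warning is not filtered out
    Placeholder().named("arg")

print(len(caught))  # 1
```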

@ptmcg
Member

ptmcg commented Jul 21, 2019

The new __diag__ switches shipped in 2.4.1, released today.
