I'm unable to translate this EBNF expression into Pyparsing. Any ideas?
token:: [A-Z]
P:: !|token;P|(P^P)|(P*P)
The problem is that when I use recursion, the interpreter fails. Expressions like these should be valid:
(ASD;!^FFF;!)
A;B;C;!
(((A;!^B;!)^C;D;!)*E;!)
To build a recursive grammar with Pyparsing, you have to think a little inside-out, using pyparsing's Forward class. With Forward, you define an empty placeholder for an expression to be defined later. Here is a start at pyparsing for this BNF:
EXCLAM,SEMI,HAT,STAR = map(Literal,"!;^*")
LPAR,RPAR = map(Suppress,"()")
token = oneOf(list(alphas.upper()))
I'm using Literal to define your operators, but Suppress for the grouping ()'s; we'll use pyparsing's Group to physically group the results into sublists.
Now we define the placeholder expression with Forward:
expr = Forward()
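As a quick aside before filling in the real grammar, here is a minimal sketch of the placeholder idea on an unrelated toy grammar: a word wrapped in any number of parentheses. The name `nested` is made up for this illustration. The placeholder is created empty and then filled in with `<<=`, so the definition can refer to itself:

```python
from pyparsing import Forward, Group, Suppress, Word, alphas

LPAR, RPAR = map(Suppress, "()")

# 'nested' is a hypothetical name used only for this illustration.
nested = Forward()  # empty placeholder, defined on the next line
nested <<= Word(alphas) | Group(LPAR + nested + RPAR)  # refers to itself

result = nested.parseString("((abc))").asList()
print(result)
```

Each pair of parentheses adds one level of nesting to the result, because each one passes through Group once.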
And now we can build the expression using this placeholder (we have to use '<<=' as the assignment operator so that expr is maintained as a Forward, and not rebound to the expression itself). Here is my first pass, using your BNF as-is:
expr <<= (EXCLAM |
          token + SEMI + expr |
          Group(LPAR + expr + HAT + expr + RPAR) |
          Group(LPAR + expr + STAR + expr + RPAR))
This gives these results:
(ASD;!^FFF;!)
  ^
Expected ";" (at char 2), (line:1, col:3)
A;B;C;!
['A', ';', 'B', ';', 'C', ';', '!']
(((A;!^B;!)^C;D;!)*E;!)
[[[['A', ';', '!', '^', 'B', ';', '!'], '^', 'C', ';', 'D', ';', '!'], '*', 'E', ';', '!']]
It seems there is an unwritten rule in your BNF: several tokens can appear together in a row. That is easily fixed:
expr <<= (EXCLAM |
          OneOrMore(token) + SEMI + expr |
          Group(LPAR + expr + HAT + expr + RPAR) |
          Group(LPAR + expr + STAR + expr + RPAR))
Now giving:
(ASD;!^FFF;!)
[['A', 'S', 'D', ';', '!', '^', 'F', 'F', 'F', ';', '!']]
A;B;C;!
['A', ';', 'B', ';', 'C', ';', '!']
(((A;!^B;!)^C;D;!)*E;!)
[[[['A', ';', '!', '^', 'B', ';', '!'], '^', 'C', ';', 'D', ';', '!'], '*', 'E', ';', '!']]
But it looks like we could benefit from additional grouping, so that the operands for the binary '^' and '*' operators are more clearly grouped. So I settled on:
expr <<= (EXCLAM |
          Group(OneOrMore(token) + SEMI + ungroup(expr)) |
          Group(LPAR + expr + HAT + expr + RPAR) |
          Group(LPAR + expr + STAR + expr + RPAR) )
And I think this version of the output will be more easily processed now:
(ASD;!^FFF;!)
[[['A', 'S', 'D', ';', '!'], '^', ['F', 'F', 'F', ';', '!']]]
A;B;C;!
[['A', ';', 'B', ';', 'C', ';', '!']]
(((A;!^B;!)^C;D;!)*E;!)
[[[[['A', ';', '!'], '^', ['B', ';', '!']], '^', ['C', ';', 'D', ';', '!']], '*', ['E', ';', '!']]]
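To illustrate why the extra grouping helps, here is a minimal sketch of a recursive walk over the grouped output. It assumes the result has been converted to plain lists (for example with `asList()`) and that you pass in its first element; `unparse` and `OPERATORS` are names made up for this example. A binary node is recognizable as a 3-element list with '^' or '*' in the middle, and anything else is a flat token sequence:

```python
OPERATORS = ("^", "*")

def unparse(node):
    """Rebuild the source text from one grouped parse result."""
    if isinstance(node, str):
        return node  # a bare '!'
    if len(node) == 3 and node[1] in OPERATORS:
        left, op, right = node  # binary node: [left, op, right]
        return "(" + unparse(left) + op + unparse(right) + ")"
    return "".join(node)  # flat sequence such as ['C', ';', 'D', ';', '!']

# The third test result from above, minus its outermost list.
tree = [[[['A', ';', '!'], '^', ['B', ';', '!']], '^',
         ['C', ';', 'D', ';', '!']], '*', ['E', ';', '!']]
print(unparse(tree))
```

The same three-way test (bare string, binary node, flat sequence) can drive an evaluator instead of a printer.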
Here is the complete script:
from pyparsing import *

EXCLAM,SEMI,HAT,STAR = map(Literal,"!;^*")
LPAR,RPAR = map(Suppress,"()")
token = oneOf(list(alphas.upper()))

expr = Forward()
expr <<= (EXCLAM |
          Group(OneOrMore(token) + SEMI + ungroup(expr)) |
          Group(LPAR + expr + HAT + expr + RPAR) |
          Group(LPAR + expr + STAR + expr + RPAR) )

tests = """\
(ASD;!^FFF;!)
A;B;C;!
(((A;!^B;!)^C;D;!)*E;!)""".splitlines()

for t in tests:
    print(t)
    try:
        print(expr.parseString(t).dump())
    except ParseException as pe:
        # mark the failure position with a caret under the input
        print(' '*pe.loc + '^')
        print(pe)
    print()
Last note: I assumed that "AAA" was 3 successive 'A' tokens. If you meant for tokens to be word groupings of 1 or more alphas, then change 'OneOrMore(token)' in the expression to 'Word(alphas.upper())' - then you'll get this result for your first test case:
[[['ASD', ';', '!'], '^', ['FFF', ';', '!']]]
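For reference, here is a sketch of the grammar with that one change applied, run on the first test case only. The variable name `word` is made up for this example; everything else follows the script above:

```python
from pyparsing import Forward, Group, Literal, Suppress, Word, alphas, ungroup

EXCLAM, SEMI, HAT, STAR = map(Literal, "!;^*")
LPAR, RPAR = map(Suppress, "()")
word = Word(alphas.upper())  # one token = a run of uppercase letters

expr = Forward()
expr <<= (EXCLAM |
          Group(word + SEMI + ungroup(expr)) |
          Group(LPAR + expr + HAT + expr + RPAR) |
          Group(LPAR + expr + STAR + expr + RPAR))

result = expr.parseString("(ASD;!^FFF;!)").asList()
print(result)
```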
This makes the Lisp notation work xD !!
from pyparsing import *

def pushFirst( strg, loc, toks ):
    # move the operator to the front: [left, op, right] -> [op, left, right]
    toks[0][0], toks[0][1] = toks[0][1], toks[0][0]

def parseTerm(term):
    """
    EBNF syntax elements
    EXCLAM = !
    HAT = ^
    STAR = *
    SEMI = ;
    LPAR = (
    RPAR = )
    """
    EXCLAM,HAT,STAR = map(Literal,"!^*")
    LPAR,RPAR = map(Suppress,"()")
    SEMI = Suppress(";")
    expr = Forward()
    expr <<= (
        EXCLAM |
        Group(Word(alphas.upper()) + SEMI + ungroup(expr)) |
        Group(LPAR + expr + HAT + expr + RPAR).setParseAction( pushFirst ) |
        Group(LPAR + expr + STAR + expr + RPAR).setParseAction( pushFirst )
    )
    try:
        result = expr.parseString(term)
    except ParseException as pe:
        print(' '*pe.loc + '^')
        print(pe)
        return None  # nothing was parsed, so there is no result to return
    return result[0]

def computeTerm(term):
    print(term)

term = parseTerm("(((AXX;!^B;!)^C;D;!)*E;!)")
computeTerm(term)
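Once the operator sits first in each group, printing the result as a Lisp-style s-expression is a short recursive walk. This is a minimal sketch, assuming the nested list has that operator-first shape and has been converted to plain lists; `to_sexpr` is a made-up helper name, and the sample list is written by hand rather than copied from a run:

```python
def to_sexpr(node):
    """Render a nested list as a Lisp-style s-expression string."""
    if isinstance(node, str):
        return node
    return "(" + " ".join(to_sexpr(child) for child in node) + ")"

# Hand-written operator-first tree for (((AXX;!^B;!)^C;D;!)*E;!),
# with the ';' separators suppressed as in parseTerm above.
tree = ['*',
        ['^', ['^', ['AXX', '!'], ['B', '!']], ['C', 'D', '!']],
        ['E', '!']]
print(to_sexpr(tree))
```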