
Parsing arithmetic expressions with function calls

I am working with pyparsing and found it to be excellent for developing a simple DSL that allows me to extract data fields out of MongoDB and do simple arithmetic operations on them. I am now trying to extend my tools such that I can apply functions of the form Rank[Person:Height] to the fields and potentially include simple expressions as arguments to the function calls. I am struggling hard with getting the parsing syntax to work. Here is what I have so far:

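# Define parser
# (EvalConstant, EvalDBref, EvalFunction, EvalSignOp, EvalMultOp and
# EvalAddOp are the parse actions used below; they are not shown here)
from pyparsing import (Forward, Word, Combine, Optional, Keyword, OneOrMore,
                       oneOf, operatorPrecedence, opAssoc,
                       nums, alphas, alphanums, printables)
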
expr = Forward()
integer = Word(nums).setParseAction(EvalConstant)
real = Combine(Word(nums) + "." + Word(nums)).setParseAction(EvalConstant)

# Handle database field references that are coming out of Mongo, 
# accounting for the fact that some fields contain whitespace
dbRef = Combine(Word(alphas) + ":" + Word(printables) + \
    Optional(" " + Word(alphas) + " " + Word(alphas)))
dbRef.setParseAction(EvalDBref)

# Handle function calls
functionCall = (Keyword("Rank") | Keyword("ZS") | Keyword("Ntile")) + "[" + expr + "]"
functionCall.setParseAction(EvalFunction)
operand =  functionCall | dbRef | (real | integer) 

signop = oneOf('+ -')
multop = oneOf('* /')
plusop = oneOf('+ -')

# Use parse actions to attach Eval constructors to sub-expressions
expr << operatorPrecedence(operand,
    [
     (signop, 1, opAssoc.RIGHT, EvalSignOp),
     (multop, 2, opAssoc.LEFT, EvalMultOp),
     (plusop, 2, opAssoc.LEFT, EvalAddOp),
    ])

My issue is that when I test a simple expression like Rank[Person:Height] I am getting a parse exception:

ParseException: Expected "]" (at char 19), (line:1, col:20)

If I use a float or an arithmetic expression as the argument, like Rank[3 + 1.1], the parsing works OK, and if I simplify the dbRef grammar so it's just Word(alphas) it also works. I cannot for the life of me figure out what's wrong with my full grammar. I have tried rearranging the order of the operands as well as simplifying the functionCall grammar, to no avail. Can anyone see what I am doing wrong?

Once I get this working, I want to take a last step and introduce support for variable assignment in expressions.

EDIT: Upon further testing, if I remove the printables from dbRef grammar, things work ok:

 dbRef = Combine(Word(alphas) + OneOrMore(":") + Word(alphanums) + \
      Optional("_" + Word(alphas)))

HOWEVER, if I add the character "-" to dbRef (which I need for DB fields like "Class:S-N"), the parser fails again. I think the "-" is being consumed by the signop in my operatorPrecedence?

Roger Sanchez asked Dec 11 '25 01:12



1 Answer

What appears to happen is that the ] character at the end of your test string (Rank[Person:Height]) gets consumed as part of the dbRef token, because the portion of this token after the initial : is declared as Word(printables), and that character set unfortunately includes the square bracket characters.

The parser then tries to produce a functionCall but cannot find the closing ], hence the error message.
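A minimal sketch of that behaviour, assuming a stripped-down dbRef (no optional whitespace part, no parse actions) and a bare Literal("Rank") in place of the full functionCall. The names db_ref and function_call are only for illustration:

```python
from pyparsing import Combine, Literal, ParseException, Word, alphas, printables

# Simplified stand-in for the question's dbRef: everything after ":" is
# Word(printables), and printables includes "[" and "]".
db_ref = Combine(Word(alphas) + ":" + Word(printables))

# The closing bracket is swallowed into the field name.
matched = db_ref.parseString("Person:Height]")[0]
print(repr(matched))

# With the bracket already consumed, the literal "]" can no longer match.
function_call = Literal("Rank") + "[" + db_ref + "]"
try:
    function_call.parseString("Rank[Person:Height]")
    failed = False
except ParseException as err:
    failed = True
    print(err)
```

The exact wording of the exception depends on the pyparsing version, but it should point at the end of the input, where the "]" was expected.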

A tentative fix is to use a character set that doesn't include the square brackets, maybe something more explicit like:

dbRef = Combine(Word(alphas) + ":" + Word(alphas, alphas+"-_./") + \
    Optional(" " + Word(alphas) + " " + Word(alphas)))
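As a quick check of this dbRef in isolation (a sketch only: it leaves out the parse actions and the operatorPrecedence layer, and uses a single Keyword("Rank") as a stand-in for functionCall):

```python
from pyparsing import Combine, Keyword, Optional, Word, alphas

# The dbRef suggested above, with brackets excluded from the field name.
db_ref = Combine(Word(alphas) + ":" + Word(alphas, alphas + "-_./") +
                 Optional(" " + Word(alphas) + " " + Word(alphas)))

# Simplified function call: the argument is a bare dbRef, not a full expr.
function_call = Keyword("Rank") + "[" + db_ref + "]"

tokens = function_call.parseString("Rank[Person:Height]").asList()
print(tokens)

# A hyphenated field name is kept as one token.
hyphenated = db_ref.parseString("Class:S-N")[0]
print(hyphenated)
```

Here the closing bracket is left for functionCall to match, so the parse goes through.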

Edit:
On closer inspection, the above is loosely correct, but the token hierarchy is wrong (e.g. the parser attempts to produce a functionCall as one operand of an expr, etc.).
Also, my suggested fix will not work because of the ambiguity of the - sign, which should be read as a plain character inside a dbRef and as a plusop inside an expr. This type of issue is common with parsers and there are ways to deal with it, though I'm not sure exactly how with pyparsing.
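One possible way to resolve the ambiguity, offered as a sketch rather than a tested answer to the original problem: define dbRef with a Regex that accepts a hyphen only when a letter follows it immediately. This assumes field names look like Class:S-N and that a subtraction is never written as a hyphen directly followed by a letter. The pattern and the names number, db_ref and function_call are hypothetical, and the parse actions are left out:

```python
from pyparsing import (Forward, Group, Keyword, Regex, Suppress,
                       infixNotation, oneOf, opAssoc)

expr = Forward()

number = Regex(r"\d+(\.\d+)?")

# Assumed field syntax: "Name:part", where a "-" belongs to the field only
# if a letter comes right after it ("Class:S-N"). Brackets never match.
db_ref = Regex(r"[A-Za-z]+:[A-Za-z]\w*(?:-[A-Za-z]\w*)*")

function_call = Group(
    (Keyword("Rank") | Keyword("ZS") | Keyword("Ntile"))
    + Suppress("[") + expr + Suppress("]")
)

operand = function_call | db_ref | number

expr <<= infixNotation(operand, [
    (oneOf("+ -"), 1, opAssoc.RIGHT),   # unary sign
    (oneOf("* /"), 2, opAssoc.LEFT),
    (oneOf("+ -"), 2, opAssoc.LEFT),
])

hyphen_field = expr.parseString("Rank[Class:S-N]", parseAll=True).asList()
subtraction = expr.parseString("Rank[Person:Height - 3]", parseAll=True).asList()
print(hyphen_field)
print(subtraction)
```

The limitation is that an input such as Person:Height-Person:Weight, with no spaces around the minus, would still be misread, so this only helps if operators are separated from field names by whitespace.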

mjv answered Dec 12 '25 15:12



