Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What should the correct grammar be for correct precedence evaluation of +,-,/,*, etc

Tags:

antlr4

My grammar has these rules

 expression
: expression EQ conditionalOrExpression                     #eqExpr
 | expression NEQ conditionalOrExpression                   #neqExpr
 | expression LT conditionalOrExpression                    #ltExpr
 | expression GT conditionalOrExpression                    #gtExpr
 | expression LTEQ conditionalOrExpression                  #lteqExpr
 | expression GTEQ conditionalOrExpression                  #gteqExpr
 | conditionalOrExpression                                  #next       
;

conditionalOrExpression
 : conditionalOrExpression OR conditionalAndExpression      #orExpr
 | conditionalAndExpression                                 #and
 ;

conditionalAndExpression
 : conditionalAndExpression AND additiveExpression          #andExpr
 | additiveExpression                                       #add
 ;

additiveExpression
 : additiveExpression PLUS multiplicativeExpression         #plusExpr
 | additiveExpression MINUS multiplicativeExpression        #minusExpr
 | multiplicativeExpression                                 #multiplicative
 ;

multiplicativeExpression
 : multiplicativeExpression MULT unaryExpression            #multExpr
 | multiplicativeExpression DIV unaryExpression             #divExpr
 | unaryExpression                                          #unary
 ;

unaryExpression
 : MINUS unaryExpression                                    #unaryMinusExpr
 | NOT unaryExpression                                      #notExpr
 | atom                                                     #atomExpr
 ;

function
: ID OPAR (parameter (',' parameter)*)? CPAR
;

parameter
: STRING                                                    #stringParameter
| expression                                                #exprParameter
;

atom
 : OPAR expression CPAR                                     #parExpr
 | (INT | FLOAT)                                            #numberAtom
 | (TRUE | FALSE)                                           #booleanAtom
 | ID                                                       #idAtom
 | function                                                 #functionAtom
 ;

I have implemented the appropriate Visitor.

If I evaluate "40 + 10 - (2*40) + (100/40) + 0.2" the result is -32.7. This is because the expression gets evaluated as

(40+10) - (((2*40) + (100/40)) + 0.2) 

which makes sense according to the rules (PLUS before MINUS).

However, if I evaluate the same expression in Excel or e.g. assign it to a double in C#, in both cases the result is -27.3. This is because they evaluate the rule as

(((40+10)-(2*40)) + (100/40)) + 0.2

So which is "correct"? -32.7 is technically correct since that's what the rules dictate. But how should the grammar change to match the results in Excel/C#?

like image 626
XBond Avatar asked Aug 20 '14 20:08

XBond


2 Answers

If you group the + and - in a single alternative, it should work. I changed the grammar in my demo evaluator on GitHub as follows:

expr
 : expr POW<assoc=right> expr           #powExpr
 | MINUS expr                           #unaryMinusExpr
 | NOT expr                             #notExpr
 | expr op=(MULT | DIV | MOD) expr      #multiplicationExpr
 | expr op=(PLUS | MINUS) expr          #additiveExpr
 | expr op=(LTEQ | GTEQ | LT | GT) expr #relationalExpr
 | expr op=(EQ | NEQ) expr              #equalityExpr
 | expr AND expr                        #andExpr
 | expr OR expr                         #orExpr
 | atom                                 #atomExpr
 ;

and then tested with the following code:

MuLexer lexer = new MuLexer(new ANTLRInputStream("log(40 + 10 - (2*40) + (100/40) + 0.2);"));
MuParser parser = new MuParser(new CommonTokenStream(lexer));
new EvalVisitor().visit(parser.parse());

the value -27.3 gets printed on my console.

like image 183
Bart Kiers Avatar answered Oct 22 '22 00:10

Bart Kiers


To get the order of operations correct on expressions like this you'll want to set up nested rules. Parse the higher precedence operators deeper in, closer to the primary expressions. For an example, I'd point you to a syntax I wrote for Jison, a similar grammar tool. The code can be seen here.

You'll break the one expr expression into several smaller expressions for each "level" of operator precedence, such as:

expr
  : comp_expr
  | expr0
  ;

expr0
  : or_expr
  | expr1
  ;

expr1
  : and_expr
  | expr2
  ;

expr2
  : add_expr
  | expr3
  ;

expr3
  : mult_expr
  | expr4
  ;

expr4
  : exp_expr
  | primary
  ;
like image 35
couchand Avatar answered Oct 21 '22 22:10

couchand