Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

boolean and arithmetic expression grammar in ANTLR

I'm trying to write a grammar for arithmetic and boolean expressions. I don't understand what I'm doing wrong. For my grammar, ANTLR says:

[fatal] rule logic_atom has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2. Resolve by left-factoring or using syntactic predicates or using backtrack=true option.

But I can't do a left-factoring. And I don't want to touch arith_expr, because for this I have a code.

Error in logic_atom : LBR logic_expr RBR | cmp_expr ;

My code:

grammar ArithmeticInterpreter;

options { 
    output = AST;
    language = C;
}
//options{greedy=true;}:

axiom : lines EOF! ;
lines : line (SEP! line)* ;
line  : (def_var | print_expr | scan_expr)? ;

def_var    : VARIABLE ASSIGMENT^ logic_expr ;
print_expr : PRINT_KEYW^ arith_expr ;
scan_expr  : SCAN_KEYW^ VARIABLE ;

arith_expr : ((PLS | MNS)^)? term ((PLS | MNS)^ term)*;
term       : power ((MLP | DIV)^ power )*;
power      : atom  (options{greedy=true;}: PWR^ power )*;
atom       : INT | FLOAT | VARIABLE | LBR arith_expr RBR -> ^(arith_expr);

logic_expr    : logic_atom ((OR | AND)^ logic_atom)*;
logic_atom :   LBR logic_expr  RBR |  cmp_expr  ;
cmp_expr: arith_expr (LSS | LSQ | GRT | GRQ | EQL | NEQ) arith_expr;

WS  : ( ' '| '\t'| '\r') {$channel=HIDDEN;};

LBR :  '(' ;
RBR :  ')' ;
PLS :  '+' ;
MNS :  '-' ;
MLP :  '*' ;
DIV :  '/' ;
PWR :  '^' ;

LSS :  '<'  ;
LSQ :  '<=' ;
GRT :  '>'  ;
GRQ :  '>=' ;
EQL :  '==' ;
NEQ :  '!=' ;
AND :  '&&' ;
OR  :  '||' ;
NOT :  '!'  ;

ASSIGMENT : '=' ;
PRINT_KEYW : 'print' ;
SCAN_KEYW  : 'scan' ;

SEP : '\n' | ';' ;

INT :  ('0'..'9')+;

FLOAT : INT '.' INT* EXP? | '.' INT EXP? | INT EXP;
fragment EXP : ('e'|'E') (PLS | MNS)? INT;

VARIABLE : SS (SS | '0'..'9')* ;
fragment SS : 'a'..'z' | 'A'..'Z' | '_' ;

// (LBR arith_expr)=> isn't work.

like image 486
Alexander Lavrukov Avatar asked Dec 13 '12 20:12

Alexander Lavrukov


2 Answers

Consider changing your logic_expr and cmp_expr to this:

logic_expr : cmp_expr ((OR | AND)^ cmp_expr)*;
cmp_expr   : (arith_expr (LSS | LSQ | GRT | GRQ | EQL | NEQ))=> arith_expr (LSS | LSQ | GRT | GRQ | EQL | NEQ)^ arith_expr
           | LBR logic_expr RBR -> logic_expr
           ;

I removed the rule logic_atom because it obscures the error that you're getting and doesn't add value.

By using the syntactic predicate in cmp_expr, you're able to tell ANTLR that any arith_expr followed by a logic sign will only be followed by an arith_expr, which means that any parentheses that ANTLR encounters must belong to an arithmetic expression and not a logical one.

This ensures that logic_expr only deals with boolean values and arith_expr only deals with numeric values.


I tested various scenarios with the modified grammar and I'm not getting errors in ANTLRWorks or in my custom test code. Could you post more information about what you're seeing?

Here is the full grammar I'm using. Note that I removed the language so that I could test this in Java. This should be fine since there are no actions/semantic predicates. I also made a few small changes, but I don't expect them to be serious fixes. They're denoted with comments.

grammar ArithmeticInterpreter;

options { 
    output = AST;
}
//options{greedy=true;}:

axiom : lines EOF! ;
lines : line (SEP! line)* ;
line  : (def_var | print_expr | scan_expr)? ;

def_var    : VARIABLE ASSIGMENT^ logic_expr ;
print_expr : PRINT_KEYW^ arith_expr ;
scan_expr  : SCAN_KEYW^ VARIABLE ;

arith_expr : ((PLS | MNS)^)? term ((PLS | MNS)^ term)*;
term       : power ((MLP | DIV)^ power )*;
power      : atom  (PWR^ atom)*;  //<-- changed
atom       : INT | FLOAT | VARIABLE 
           | LBR arith_expr RBR -> arith_expr //<-- changed
           ;

logic_expr : cmp_expr ((OR | AND)^ cmp_expr)*;
cmp_expr   : (arith_expr (LSS | LSQ | GRT | GRQ | EQL | NEQ))=> arith_expr (LSS | LSQ | GRT | GRQ | EQL | NEQ)^ arith_expr
           | LBR logic_expr RBR -> logic_expr
           ;

WS  : ( ' '| '\t'| '\r') {$channel=HIDDEN;};

LBR :  '(' ;
RBR :  ')' ;
PLS :  '+' ;
MNS :  '-' ;
MLP :  '*' ;
DIV :  '/' ;
PWR :  '^' ;

LSS :  '<'  ;
LSQ :  '<=' ;
GRT :  '>'  ;
GRQ :  '>=' ;
EQL :  '==' ;
NEQ :  '!=' ;
AND :  '&&' ;
OR  :  '||' ;
NOT :  '!'  ;

ASSIGMENT : '=' ;
PRINT_KEYW : 'print' ;
SCAN_KEYW  : 'scan' ;

SEP : '\n' | ';' ;

INT :  ('0'..'9')+;

FLOAT : INT '.' INT* EXP? | '.' INT EXP? | INT EXP;
fragment EXP : ('e'|'E') (PLS | MNS)? INT;

VARIABLE : SS (SS | '0'..'9')* ;
fragment SS : 'a'..'z' | 'A'..'Z' | '_' ;

Given input x=(2<3), the following AST tree is produced:

(= x (< 2 3))

Which renders like so:

(= x (< 2 3))

The modified grammar can also handle more complex cases now, like x = 2 + 3 < 4 || (5 ^ 5 > 30 && 3 == 10 + 2):

(= x (|| (< (+ 2 3) 4) (&& (> (^ 5 5) 30) (== 3 (+ 10 2)))))

a more complex graph

So try copying the grammar above and seeing if that fixes the error you get. If not, let me know more about the error you're seeing.

like image 100
user1201210 Avatar answered Sep 24 '22 13:09

user1201210


My quick suggestion is to combine arith and logical expressions. See any sample grammar such as Java.g or whatever. ANTLR v4 would handle this no problem, btw.

like image 22
Terence Parr Avatar answered Sep 22 '22 13:09

Terence Parr