I am a complete ANTLR4 newbie, so please forgive my ignorance. I ran into this presentation where a very simple arithmetic expression grammar is defined. It looks like:
grammar Expressions; start : expr ; expr : left=expr op=('*'|'/') right=expr #opExpr | left=expr op=('+'|'-') right=expr #opExpr | atom=INT #atomExpr ; INT : ('0'..'9')+ ; WS : [ \t\r\n]+ -> skip ;
Which is great because it will generate a very simple binary tree that can be traversed using the visitor pattern as explained in the slides, e.g., here's the function that visits the expr
:
public Integer visitOpExpr(OpExprContext ctx) { int left = visit(ctx.left); int right = visit(ctx.right); String op = ctx.op.getText(); switch (op.charAt(0)) { case '*': return left * right; case '/': return left / right; case '+': return left + right; case '-': return left - right; default: throw new IllegalArgumentException("Unkown opeator " + op); } }
The next thing I would like to add is support for parentheses. So I modified the expr
as follows:
expr : '(' expr ')' #opExpr | left=expr op=('*'|'/') right=expr #opExpr | left=expr op=('+'|'-') right=expr #opExpr | atom=INT #atomExpr ;
Unfortunately, the code above fails because when encountering parentheses the three attributes op
,left
and right
are null (fails with NPE).
I think I could work around that by defining a new attribute, e.g., parenthesized='(' expr ')'
, and then deal with that in the visitor code. However, it seems overkill to me to have a whole extra node type to represent an expression in parentheses. A simpler but uglier solution is to add the following line of code at the beginning of the visitOpExpr
method:
if (ctx.op == null) return visit(ctx.getChild(1)); // 0 and 2 are the parentheses!
I don't like the above at all because it's very fragile and highly dependent on the grammar structure.
I am wondering if there is a way to tell ANTLR to just "eat" the parentheses and treat the expression like a child. Is there? Is there a better way to do this?
Note: My end goal is to extend the example to include boolean expressions that can themselves contain arithmetic expressions, e.g., (2+4*3)/10 >= 11
, that is, a relation (<,>,==,~=,etc.) between arithmetic expressions can define an atomic boolean expression. This is straight forward and I already have the grammar sketched out but I have the same problem with parenthesis, i.e., I need to be able to write stuff like (I will also add support for variables):
((2+4*x)/10 >= 11) | ( x>1 & x<3 )
EDIT: Fixed the precedence of the parenthesized expression, parenthesis always have higher precedence.
The Antlr4 application generates a lexer from a set of lexer rules, as we'll see in our expression grammar momentarily. We specify rules; Antlr4 generates a lexer from them. Rules resemble assignment statements where names begin with uppercase alpha.
ANTLR Visitors The difference between listener and visitor mechanisms is listener methods are called by the ANTLR-provided walker object, whereas visitor methods must walk their children with explicit visit calls. Forgetting to invoke visit() on a node's children means those subtrees don't get visited.
ANTLR helps you build intermediate form trees, or abstract syntax trees (ASTs), by providing grammar annotations that indicate what tokens are to be treated as subtree roots, which are to be leaves, and which are to be ignored with respect to tree construction.
What you need to do to get a parse tree: define a lexer and parser grammar. invoke ANTLR: it will generate a lexer and a parser in your target language (e.g., Java, Python, C#, JavaScript) use the generated lexer and parser: you invoke them passing the code to recognize and they return to you a parse tree.
Sure, just label it differently. After all, the alternative '(' expr ')'
isn't a #opExpr
:
expr : left=expr op=('*'|'/') right=expr #opExpr | left=expr op=('+'|'-') right=expr #opExpr | '(' expr ')' #parenExpr | atom=INT #atomExpr ;
And in your visitor, you'd do something like this:
public class EvalVisitor extends ExpressionsBaseVisitor<Integer> { @Override public Integer visitOpExpr(@NotNull ExpressionsParser.OpExprContext ctx) { int left = visit(ctx.left); int right = visit(ctx.right); String op = ctx.op.getText(); switch (op.charAt(0)) { case '*': return left * right; case '/': return left / right; case '+': return left + right; case '-': return left - right; default: throw new IllegalArgumentException("Unknown operator " + op); } } @Override public Integer visitStart(@NotNull ExpressionsParser.StartContext ctx) { return this.visit(ctx.expr()); } @Override public Integer visitAtomExpr(@NotNull ExpressionsParser.AtomExprContext ctx) { return Integer.valueOf(ctx.getText()); } @Override public Integer visitParenExpr(@NotNull ExpressionsParser.ParenExprContext ctx) { return this.visit(ctx.expr()); } public static void main(String[] args) { String expression = "2 * (3 + 4)"; ExpressionsLexer lexer = new ExpressionsLexer(CharStreams.fromString(expression)); ExpressionsParser parser = new ExpressionsParser(new CommonTokenStream(lexer)); ParseTree tree = parser.start(); Integer answer = new EvalVisitor().visit(tree); System.out.printf("%s = %s\n", expression, answer); } }
If you run the class above, you'd see the following output:
2 * (3 + 4) = 14
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With