I'm creating a simple programming language for a school project. I'm using ANTLR 4 to generate a lexer and a parser from my grammar. Until now, I have been using ANTLRs listener pattern to apply the actual functionality of the programming language.
Now I would like to implement if/else statements but I'm not sure that these can actually be implemented when using the listener pattern as ANTLR decides in which order to traverse the parse tree when using listeners and I imagine that the implementation of if/else statements will require jumping around the parse tree depending on which condition in the statement is satisfied.
Can anyone tell me if it will be possible to implement if/else statements using ANTLR or if I will have to implement the visitor pattern myself? Also, can anyone give an extremely simple example of the implementation the statements?
Antlr - Parse Tree Listener The parse tree listener (or listener) is a class that implements callback methods that are called by the parser when it creates the parse tree. You can overwrite this class to get information when the parser enter or exit a rule (ie as found a pattern)
I have created a small Java project that allows you to test your ANTLR grammar instantly by compiling the lexer and parser generated by ANTLR in-memory. You can just parse a string by passing it to the parser, and it will automatically generate an AST from it which can then be used in your application.
In computer-based language recognition, ANTLR (pronounced antler), or ANother Tool for Language Recognition, is a parser generator that uses LL(*) for parsing. ANTLR is the successor to the Purdue Compiler Construction Tool Set (PCCTS), first developed in 1989, and is under active development.
ANTLR Visitors The difference between listener and visitor mechanisms is listener methods are called by the ANTLR-provided walker object, whereas visitor methods must walk their children with explicit visit calls. Forgetting to invoke visit() on a node's children means those subtrees don't get visited.
By default, ANTLR 4 generates listeners. But if you give org.antlr.v4.Tool
the command line parameter -visitor
, ANTLR generates visitor classes for you. These work much like listeners, but give you more control over which (sub) trees are walked/visited. This is particularly useful if you want to exclude certain (sub) trees (like else/if blocks, as in your case). While this can be done using listeners, it's much cleaner to do this with a visitor. Using listeners, you'll need to introduce global variables that keep track if a (sub) tree needs to be evaluated, and which do not.
As it happens to be, I'm working on a small ANTLR 4 tutorial. It's not done yet, but I'll post a small working example that demonstrates the use of these visitor classes and an if
statement construct.
Here's a simple grammar supporting basic expressions, if
-, while
- and log
-statements:
grammar Mu; parse : block EOF ; block : stat* ; stat : assignment | if_stat | while_stat | log | OTHER {System.err.println("unknown char: " + $OTHER.text);} ; assignment : ID ASSIGN expr SCOL ; if_stat : IF condition_block (ELSE IF condition_block)* (ELSE stat_block)? ; condition_block : expr stat_block ; stat_block : OBRACE block CBRACE | stat ; while_stat : WHILE expr stat_block ; log : LOG expr SCOL ; expr : expr POW<assoc=right> expr #powExpr | MINUS expr #unaryMinusExpr | NOT expr #notExpr | expr op=(MULT | DIV | MOD) expr #multiplicationExpr | expr op=(PLUS | MINUS) expr #additiveExpr | expr op=(LTEQ | GTEQ | LT | GT) expr #relationalExpr | expr op=(EQ | NEQ) expr #equalityExpr | expr AND expr #andExpr | expr OR expr #orExpr | atom #atomExpr ; atom : OPAR expr CPAR #parExpr | (INT | FLOAT) #numberAtom | (TRUE | FALSE) #booleanAtom | ID #idAtom | STRING #stringAtom | NIL #nilAtom ; OR : '||'; AND : '&&'; EQ : '=='; NEQ : '!='; GT : '>'; LT : '<'; GTEQ : '>='; LTEQ : '<='; PLUS : '+'; MINUS : '-'; MULT : '*'; DIV : '/'; MOD : '%'; POW : '^'; NOT : '!'; SCOL : ';'; ASSIGN : '='; OPAR : '('; CPAR : ')'; OBRACE : '{'; CBRACE : '}'; TRUE : 'true'; FALSE : 'false'; NIL : 'nil'; IF : 'if'; ELSE : 'else'; WHILE : 'while'; LOG : 'log'; ID : [a-zA-Z_] [a-zA-Z_0-9]* ; INT : [0-9]+ ; FLOAT : [0-9]+ '.' [0-9]* | '.' [0-9]+ ; STRING : '"' (~["\r\n] | '""')* '"' ; COMMENT : '#' ~[\r\n]* -> skip ; SPACE : [ \t\r\n] -> skip ; OTHER : . ;
Now let's say you would like to parse, and evaluate, input like this:
a = true; b = false; if a && b { log "1 :: a=" + a +", b=" + b; } else if a || b { log "2 :: a=" + a +", b=" + b; } else { log "3 :: a=" + a +", b=" + b; } log "Done!";
Start by generating the parser and visitor classes:
java -cp antlr-4.0-complete.jar org.antlr.v4.Tool Mu.g4 -visitor
The command above would have generated, among others the file MuBaseVisitor<T>
. This is the class we're going to extend with out own logic:
public class EvalVisitor extends MuBaseVisitor<Value> { // ... }
where Value
is just a wrapper for any of our language's types (String
, Boolean
, Double
):
public class Value { public static Value VOID = new Value(new Object()); final Object value; public Value(Object value) { this.value = value; } public Boolean asBoolean() { return (Boolean)value; } public Double asDouble() { return (Double)value; } public String asString() { return String.valueOf(value); } public boolean isDouble() { return value instanceof Double; } @Override public int hashCode() { if(value == null) { return 0; } return this.value.hashCode(); } @Override public boolean equals(Object o) { if(value == o) { return true; } if(value == null || o == null || o.getClass() != this.getClass()) { return false; } Value that = (Value)o; return this.value.equals(that.value); } @Override public String toString() { return String.valueOf(value); } }
To test the classes, use the following Main
class:
import org.antlr.v4.runtime.ANTLRFileStream; import org.antlr.v4.runtime.CommonTokenStream; import org.antlr.v4.runtime.tree.ParseTree; public class Main { public static void main(String[] args) throws Exception { MuLexer lexer = new MuLexer(new ANTLRFileStream("test.mu")); MuParser parser = new MuParser(new CommonTokenStream(lexer)); ParseTree tree = parser.parse(); EvalVisitor visitor = new EvalVisitor(); visitor.visit(tree); } }
and compile and run the source files:
javac -cp antlr-4.0-complete.jar *.java java -cp .:antlr-4.0-complete.jar Main
(on Windows, the last command would be: java -cp .;antlr-4.0-complete.jar Main
)
After running Main
, nothing happens (of course?). This is because we didn't implement any of the rules in our EvalVisitor
class. To be able to evaluate the file test.mu
properly, we need to provide a proper implementation for the following rules:
if_stat
andExpr
orExpr
plusExpr
assignment
idAtom
booleanAtom
stringAtom
log
#4. Visitor II & Test II
Here's a implementation of these rules:
import org.antlr.v4.runtime.misc.NotNull; import java.util.HashMap; import java.util.List; import java.util.Map; public class EvalVisitor extends MuBaseVisitor<Value> { // used to compare floating point numbers public static final double SMALL_VALUE = 0.00000000001; // store variables (there's only one global scope!) private Map<String, Value> memory = new HashMap<String, Value>(); // assignment/id overrides @Override public Value visitAssignment(MuParser.AssignmentContext ctx) { String id = ctx.ID().getText(); Value value = this.visit(ctx.expr()); return memory.put(id, value); } @Override public Value visitIdAtom(MuParser.IdAtomContext ctx) { String id = ctx.getText(); Value value = memory.get(id); if(value == null) { throw new RuntimeException("no such variable: " + id); } return value; } // atom overrides @Override public Value visitStringAtom(MuParser.StringAtomContext ctx) { String str = ctx.getText(); // strip quotes str = str.substring(1, str.length() - 1).replace("\"\"", "\""); return new Value(str); } @Override public Value visitNumberAtom(MuParser.NumberAtomContext ctx) { return new Value(Double.valueOf(ctx.getText())); } @Override public Value visitBooleanAtom(MuParser.BooleanAtomContext ctx) { return new Value(Boolean.valueOf(ctx.getText())); } @Override public Value visitNilAtom(MuParser.NilAtomContext ctx) { return new Value(null); } // expr overrides @Override public Value visitParExpr(MuParser.ParExprContext ctx) { return this.visit(ctx.expr()); } @Override public Value visitPowExpr(MuParser.PowExprContext ctx) { Value left = this.visit(ctx.expr(0)); Value right = this.visit(ctx.expr(1)); return new Value(Math.pow(left.asDouble(), right.asDouble())); } @Override public Value visitUnaryMinusExpr(MuParser.UnaryMinusExprContext ctx) { Value value = this.visit(ctx.expr()); return new Value(-value.asDouble()); } @Override public Value visitNotExpr(MuParser.NotExprContext ctx) { Value value = this.visit(ctx.expr()); return new Value(!value.asBoolean()); } @Override public Value visitMultiplicationExpr(@NotNull MuParser.MultiplicationExprContext ctx) { Value left = this.visit(ctx.expr(0)); Value right = this.visit(ctx.expr(1)); switch (ctx.op.getType()) { case MuParser.MULT: return new Value(left.asDouble() * right.asDouble()); case MuParser.DIV: return new Value(left.asDouble() / right.asDouble()); case MuParser.MOD: return new Value(left.asDouble() % right.asDouble()); default: throw new RuntimeException("unknown operator: " + MuParser.tokenNames[ctx.op.getType()]); } } @Override public Value visitAdditiveExpr(@NotNull MuParser.AdditiveExprContext ctx) { Value left = this.visit(ctx.expr(0)); Value right = this.visit(ctx.expr(1)); switch (ctx.op.getType()) { case MuParser.PLUS: return left.isDouble() && right.isDouble() ? new Value(left.asDouble() + right.asDouble()) : new Value(left.asString() + right.asString()); case MuParser.MINUS: return new Value(left.asDouble() - right.asDouble()); default: throw new RuntimeException("unknown operator: " + MuParser.tokenNames[ctx.op.getType()]); } } @Override public Value visitRelationalExpr(@NotNull MuParser.RelationalExprContext ctx) { Value left = this.visit(ctx.expr(0)); Value right = this.visit(ctx.expr(1)); switch (ctx.op.getType()) { case MuParser.LT: return new Value(left.asDouble() < right.asDouble()); case MuParser.LTEQ: return new Value(left.asDouble() <= right.asDouble()); case MuParser.GT: return new Value(left.asDouble() > right.asDouble()); case MuParser.GTEQ: return new Value(left.asDouble() >= right.asDouble()); default: throw new RuntimeException("unknown operator: " + MuParser.tokenNames[ctx.op.getType()]); } } @Override public Value visitEqualityExpr(@NotNull MuParser.EqualityExprContext ctx) { Value left = this.visit(ctx.expr(0)); Value right = this.visit(ctx.expr(1)); switch (ctx.op.getType()) { case MuParser.EQ: return left.isDouble() && right.isDouble() ? new Value(Math.abs(left.asDouble() - right.asDouble()) < SMALL_VALUE) : new Value(left.equals(right)); case MuParser.NEQ: return left.isDouble() && right.isDouble() ? new Value(Math.abs(left.asDouble() - right.asDouble()) >= SMALL_VALUE) : new Value(!left.equals(right)); default: throw new RuntimeException("unknown operator: " + MuParser.tokenNames[ctx.op.getType()]); } } @Override public Value visitAndExpr(MuParser.AndExprContext ctx) { Value left = this.visit(ctx.expr(0)); Value right = this.visit(ctx.expr(1)); return new Value(left.asBoolean() && right.asBoolean()); } @Override public Value visitOrExpr(MuParser.OrExprContext ctx) { Value left = this.visit(ctx.expr(0)); Value right = this.visit(ctx.expr(1)); return new Value(left.asBoolean() || right.asBoolean()); } // log override @Override public Value visitLog(MuParser.LogContext ctx) { Value value = this.visit(ctx.expr()); System.out.println(value); return value; } // if override @Override public Value visitIf_stat(MuParser.If_statContext ctx) { List<MuParser.Condition_blockContext> conditions = ctx.condition_block(); boolean evaluatedBlock = false; for(MuParser.Condition_blockContext condition : conditions) { Value evaluated = this.visit(condition.expr()); if(evaluated.asBoolean()) { evaluatedBlock = true; // evaluate this block whose expr==true this.visit(condition.stat_block()); break; } } if(!evaluatedBlock && ctx.stat_block() != null) { // evaluate the else-stat_block (if present == not null) this.visit(ctx.stat_block()); } return Value.VOID; } // while override @Override public Value visitWhile_stat(MuParser.While_statContext ctx) { Value value = this.visit(ctx.expr()); while(value.asBoolean()) { // evaluate the code block this.visit(ctx.stat_block()); // evaluate the expression value = this.visit(ctx.expr()); } return Value.VOID; } }
When you re-compile and run Main
, the following would be printed to your console:
2 :: a=true, b=false Done!
For an implementation of all other rules, see: https://github.com/bkiers/Mu
From @pwwpche, in the comments:
for those using jdk1.8 and encounter
IndexOutOfBoundsException
, antlr 4.0 is somehow not compatible with jdk1.8. Download antlr-4.6-complete.jar, and replaceexpr POW<assoc=right> expr
with<assoc=right>expr POW expr
will eliminate the error and warnings.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With