Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ANTLR Parse tree modification

I'm using ANTLR4 to create a parse tree for my grammar, what I want to do is modify certain nodes in the tree. This will include removing certain nodes and inserting new ones. The purpose behind this is optimization for the language I am writing. I have yet to find a solution to this problem. What would be the best way to go about this?

like image 931
Marc Deleuran Avatar asked May 06 '14 06:05

Marc Deleuran


2 Answers

Another approach would be to write a ParseTreeVisitor that converts the tree back to a string. (This can be trivial in some cases, because you are only calling TerminalNode.getText() and concatenate in aggregateResult(..).)

You then add the modifications to this visitor so that the resulting string representation contains the modifications you try to achieve.

Then parse the string and you get a parse tree with the desired modifications.

This is certainly hackish in some ways, since you parse the string twice. On the other hand the solution does not rely on antlr implementation details.

like image 155
Joe23 Avatar answered Sep 28 '22 03:09

Joe23


While there is currently no real support or tools for tree rewriting, it is very possible to do. It's not even that painful.

The ParseTreeListener or your MyBaseListener can be used with a ParseTreeWalker to walk your parse tree.

From here, you can remove nodes with ParserRuleContext.removeLastChild(), however when doing this, you have to watch out for ParseTreeWalker.walk:

public void walk(ParseTreeListener listener, ParseTree t) {
    if ( t instanceof ErrorNode) {
        listener.visitErrorNode((ErrorNode)t);
        return;
    }
    else if ( t instanceof TerminalNode) {
        listener.visitTerminal((TerminalNode)t);
        return;
    }
    RuleNode r = (RuleNode)t;
    enterRule(listener, r);
    int n = r.getChildCount();
    for (int i = 0; i<n; i++) {
        walk(listener, r.getChild(i));
    }
    exitRule(listener, r);
}

You must replace removed nodes with something if the walker has visited parents of those nodes, I usually pick empty ParseRuleContext objects (this is because of the cached value of n in the method above). This prevents the ParseTreeWalker from throwing a NPE.

When adding nodes, make sure to set the mutable parent on the ParseRuleContext to the new parent. Also, because of the cached n in the method above, a good strategy is to detect where the changes need to be before you hit where you want your changes to go in the walk, so the ParseTreeWalker will walk over them in the same pass (other wise you might need multiple passes...)

Your pseudo code should look like this:

public void enterRewriteTarget(@NotNull MyParser.RewriteTargetContext ctx){
    if(shouldRewrite(ctx)){
        ArrayList<ParseTree> nodesReplaced = replaceNodes(ctx);
        addChildTo(ctx, createNewParentFor(nodesReplaced));
    }
}

I've used this method to write a transpiler that compiled a synchronous internal language into asynchronous javascript. It was pretty painful.

like image 37
Nthalk Avatar answered Sep 28 '22 03:09

Nthalk