
How do I pretty-print productions and line numbers, using ANTLR4?

I'm trying to write a piece of code that will take an ANTLR4 parser and use it to generate ASTs for inputs, printed in a form similar to the output of the -tree option on grun (misc.TestRig). However, I'd also like the output to include all the line number/offset information.

For example, instead of printing

(add (int 5) '+' (int 6))

I'd like to get

(add (int 5 [line 3, offset 6:7]) '+' (int 6 [line 3, offset 8:9]) [line 3, offset 5:10])

Or something similar.

There aren't a tremendous number of visitor examples for ANTLR4 yet, but I am pretty sure I can do most of this by copying the default implementation for toStringTree (used by grun). However, I do not see any information about the line numbers or offsets.

I expected to be able to write super simple code like this:

String visit(ParseTree t) {
    return "(" + t.productionName + t.visitChildren() + t.lineNumber + ")";
}

but it doesn't seem to be this simple. I'm guessing I should be able to get line number information from the parser, but I haven't figured out how to do so. How can I grab this line number/offset information in my traversal?


To fill in the few blanks in the solution below, I used:

List<String> ruleNames = Arrays.asList(parser.getRuleNames());
parser.setBuildParseTree(true);
ParserRuleContext prc = parser.program();
ParseTree tree = prc;

to get the tree and the ruleNames. program is the name for the top production in my grammar.

Chucky Ellison asked Oct 13 '13 21:10



1 Answer

The Trees.toStringTree method can be implemented using a ParseTreeListener. The following listener produces exactly the same output as Trees.toStringTree.

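import java.util.Arrays;
import java.util.List;

import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.misc.Utils;
import org.antlr.v4.runtime.tree.ErrorNode;
import org.antlr.v4.runtime.tree.ParseTreeListener;
import org.antlr.v4.runtime.tree.TerminalNode;
import org.antlr.v4.runtime.tree.Trees;

public class TreePrinterListener implements ParseTreeListener {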
    private final List<String> ruleNames;
    private final StringBuilder builder = new StringBuilder();

    public TreePrinterListener(Parser parser) {
        this.ruleNames = Arrays.asList(parser.getRuleNames());
    }

    public TreePrinterListener(List<String> ruleNames) {
        this.ruleNames = ruleNames;
    }

    @Override
    public void visitTerminal(TerminalNode node) {
        if (builder.length() > 0) {
            builder.append(' ');
        }

        builder.append(Utils.escapeWhitespace(Trees.getNodeText(node, ruleNames), false));
    }

    @Override
    public void visitErrorNode(ErrorNode node) {
        if (builder.length() > 0) {
            builder.append(' ');
        }

        builder.append(Utils.escapeWhitespace(Trees.getNodeText(node, ruleNames), false));
    }

    @Override
    public void enterEveryRule(ParserRuleContext ctx) {
        if (builder.length() > 0) {
            builder.append(' ');
        }

        if (ctx.getChildCount() > 0) {
            builder.append('(');
        }

        int ruleIndex = ctx.getRuleIndex();
        String ruleName;
        if (ruleIndex >= 0 && ruleIndex < ruleNames.size()) {
            ruleName = ruleNames.get(ruleIndex);
        }
        else {
            ruleName = Integer.toString(ruleIndex);
        }

        builder.append(ruleName);
    }

    @Override
    public void exitEveryRule(ParserRuleContext ctx) {
        if (ctx.getChildCount() > 0) {
            builder.append(')');
        }
    }

    @Override
    public String toString() {
        return builder.toString();
    }
}

The class can be used as follows:

List<String> ruleNames = ...;
ParseTree tree = ...;

TreePrinterListener listener = new TreePrinterListener(ruleNames);
ParseTreeWalker.DEFAULT.walk(listener, tree);
String formatted = listener.toString();

The class can be modified to produce the information in your output by updating the exitEveryRule method:

// Also requires: import org.antlr.v4.runtime.Token;
@Override
public void exitEveryRule(ParserRuleContext ctx) {
    if (ctx.getChildCount() > 0) {
        // The rule spans from the start of its first token to the
        // (inclusive) stop index of its last token.
        Token startToken = ctx.getStart();
        Token stopToken = ctx.getStop();
        if (startToken != null && stopToken != null) {
            builder.append(" [line ");
            builder.append(startToken.getLine());
            builder.append(", offset ");
            builder.append(startToken.getStartIndex());
            builder.append(':');
            builder.append(stopToken.getStopIndex());
            builder.append("])");
        }
        else {
            builder.append(')');
        }
    }
}
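
To see the shape of the resulting output without generating a parser, here is a minimal sketch of the same bracket-and-position formatting on a stand-in tree. Node and PositionPrinter are hypothetical names invented for this illustration and are not ANTLR classes; the single recursive walk plays the role of enterEveryRule, visitTerminal and exitEveryRule, and the positions are typed in by hand instead of coming from tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse tree node (NOT an ANTLR class).
// A node with children acts as a rule; a node without children acts as a token.
final class Node {
    final String label;  // rule name, or token text for a leaf
    final int line;      // line of the first token (only used for rule nodes)
    final int start;     // start offset of the first token
    final int stop;      // stop offset of the last token
    final List<Node> children = new ArrayList<>();

    Node(String label, int line, int start, int stop) {
        this.label = label;
        this.line = line;
        this.start = start;
        this.stop = stop;
    }

    // Leaves carry no position in this sketch, mirroring the listener,
    // which only annotates rule contexts.
    static Node leaf(String text) {
        return new Node(text, 0, 0, 0);
    }

    Node add(Node child) {
        children.add(child);
        return this;
    }
}

final class PositionPrinter {
    private final StringBuilder builder = new StringBuilder();

    void walk(Node node) {
        // Same separator rule as the listener: a space before everything
        // except the very first piece of output.
        if (builder.length() > 0) {
            builder.append(' ');
        }
        if (node.children.isEmpty()) {
            builder.append(node.label);            // "visitTerminal"
            return;
        }
        builder.append('(').append(node.label);    // "enterEveryRule"
        for (Node child : node.children) {
            walk(child);
        }
        builder.append(" [line ").append(node.line) // "exitEveryRule"
               .append(", offset ").append(node.start)
               .append(':').append(node.stop)
               .append("])");
    }

    @Override
    public String toString() {
        return builder.toString();
    }
}

final class PositionPrinterDemo {
    public static void main(String[] args) {
        // Hand-built tree for the question's example, using its positions.
        Node tree = new Node("add", 3, 5, 10)
            .add(new Node("int", 3, 6, 7).add(Node.leaf("5")))
            .add(Node.leaf("'+'"))
            .add(new Node("int", 3, 8, 9).add(Node.leaf("6")));
        PositionPrinter printer = new PositionPrinter();
        printer.walk(tree);
        System.out.println(printer);
    }
}
```

With the hand-entered positions from the question, this prints the string the question asks for: (add (int 5 [line 3, offset 6:7]) '+' (int 6 [line 3, offset 8:9]) [line 3, offset 5:10]). With a real parser the numbers come from the tokens instead, and ANTLR's stop index is inclusive, so the exact values may differ from the ones in the question.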
Sam Harwell answered Oct 13 '22 02:10