
How can I access alternate labels in ANTLR4 while generically traversing a parse tree?

How can I access alternate labels in ANTLR4 while generically traversing a parse tree? Or alternatively, is there any way of replicating the functionality of the ^ operator of ANTLR3, as that would do the trick.

I'm trying to write an AST pretty printer for any ANTLR4 grammar adhering to a simple methodology (like naming productions with alternate labels). I'd like to be able to pretty print a term like 3 + 5 as (int_expression (plus (int_literal 3) (int_literal 5))), or something similar, given a grammar like the following:

int_expression 
    : int_expression '+' int_expression # plus
    | int_expression '-' int_expression # minus
    | raw_int                           # int_literal
    ;
raw_int
    : Int
    ;
Int : [0-9]+ ;

I am unable to effectively give names to the plus and minus productions, because pulling them out into their own production causes the tool to complain that the rules are mutually left-recursive. If I can't pull them out, how can I give these productions names?

Note 1: I was able to get rid of the + token methodologically by putting "good" terminals (e.g., the Int above) in special productions (productions starting with a special prefix, like raw_). Then I could print only those terminals whose parent productions are named "raw_..." and elide all others. This worked great for getting rid of +, while keeping 3 and 5 in the output. In ANTLR3 this could be done with the ! operator.

Note 2: I understand that I could write a specialized pretty printer or use actions for each production of a given language, but I'd like to use ANTLR4 to parse and generate ASTs for a variety of languages, and it seems like I should be able to write such a simple pretty printer generically. Said another way, I only care about getting ASTs, and I'd rather not have to encumber each grammar with a tailored pretty printer just to get an AST. Perhaps I should just go back to ANTLR3?

Chucky Ellison asked Jan 20 '26 05:01



1 Answer

The API doesn't contain a method to access the alternate labels.

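However, there is a workaround. ANTLR4 generates one Java context class per alternate label, nested inside the generated parser class and named after the label (for example, the label plus becomes PlusContext). Those classes can be inspected at run time.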

For example: to access alternate labels in ANTLR4 while generically traversing a parse tree (with a listener) you can use the following function:

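// Return the alternate label embedded between "$" and "Context"
// in the class name, lowercased. For a rule without labels this
// yields the rule name instead (e.g. Raw_intContext -> raw_int).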
String getCtxName(ParserRuleContext ctx) {
    String str = ctx.getClass().getName();
    str = str.substring(str.indexOf("$")+1,str.lastIndexOf("Context"));
    str = str.toLowerCase();
    return str;
}
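
To see the naming trick in isolation, here is a minimal sketch that runs without the ANTLR runtime. The DemoParser class and its nested *Context classes are hypothetical stand-ins for what ANTLR4 would generate from the grammar in the question, and labelOf is an illustrative helper, not part of any ANTLR API. It uses getSimpleName() rather than searching for "$", and only strips the suffix when it is actually present:

```java
import java.util.Locale;

public class ContextLabels {
    // Hypothetical stand-ins for the nested context classes that ANTLR4
    // would generate for the grammar in the question (assumption: the
    // real classes extend ParserRuleContext and live in the parser class).
    static class DemoParser {
        static class Int_expressionContext {}
        static class PlusContext extends Int_expressionContext {}
        static class Int_literalContext extends Int_expressionContext {}
        static class Raw_intContext {}
    }

    // Derive a lowercase label from a context class: take the simple
    // class name and drop a trailing "Context" if there is one.
    static String labelOf(Class<?> ctxClass) {
        String name = ctxClass.getSimpleName();
        String suffix = "Context";
        if (name.endsWith(suffix) && name.length() > suffix.length()) {
            name = name.substring(0, name.length() - suffix.length());
        }
        return name.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(labelOf(DemoParser.PlusContext.class));        // plus
        System.out.println(labelOf(DemoParser.Int_literalContext.class)); // int_literal
        System.out.println(labelOf(DemoParser.Raw_intContext.class));     // raw_int
    }
}
```

Note that lowercasing the whole name also flattens camel-case labels (a label written as intLiteral would come back as intliteral), so snake_case labels as in the question survive best.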

Example use:

@Override
public void exitEveryRule(ParserRuleContext ctx) {
    System.out.println(getCtxName(ctx));
}
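
Combining labels like the ones getCtxName returns with the raw_ convention from the question, a generic S-expression printer might look roughly like the sketch below. Everything here is an assumption for illustration: Node is a hypothetical stand-in for an ANTLR4 parse tree node (a real version would walk ParseTree children instead), and the tree is built by hand to mirror a parse of 3 + 5. Terminals are printed only when their parent rule name starts with raw_; all other terminals are dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SExprSketch {
    // Hypothetical minimal tree node. A rule node has a label and
    // children; a terminal has text and a null label.
    static final class Node {
        final String label;
        final String text;
        final List<Node> children;

        private Node(String label, String text, List<Node> children) {
            this.label = label;
            this.text = text;
            this.children = children;
        }
        static Node rule(String label, Node... kids) {
            return new Node(label, null, Arrays.asList(kids));
        }
        static Node token(String text) {
            return new Node(null, text, new ArrayList<>());
        }
    }

    // Print a subtree. keepTokens is true only directly under a raw_ rule.
    static String print(Node n, boolean keepTokens) {
        if (n.label == null) {
            return keepTokens ? n.text : "";   // elide ordinary terminals
        }
        boolean raw = n.label.startsWith("raw_");
        List<String> parts = new ArrayList<>();
        for (Node child : n.children) {
            String s = print(child, raw);
            if (!s.isEmpty()) {
                parts.add(s);
            }
        }
        String body = String.join(" ", parts);
        if (raw) {
            return body;                       // raw_ rule: token text only
        }
        return "(" + n.label + (body.isEmpty() ? "" : " " + body) + ")";
    }

    public static void main(String[] args) {
        Node tree = Node.rule("plus",
            Node.rule("int_literal", Node.rule("raw_int", Node.token("3"))),
            Node.token("+"),
            Node.rule("int_literal", Node.rule("raw_int", Node.token("5"))));
        System.out.println(print(tree, false));
        // (plus (int_literal 3) (int_literal 5))
    }
}
```

The enclosing rule name (int_expression in the question's desired output) is not part of the class-name trick; with the real runtime it can be looked up from the context's rule index via the parser's rule names array.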
chris answered Jan 22 '26 04:01
