
How to iterate over a production in ANTLR

Let's suppose the following two scenarios, each with its own ANTLR grammar:

1)

expr     : antExpr+;
antExpr  : '{' T '}' ;
T        : 'foo';

2)

expr     : antExpr; 
antExpr  : '{' T* '}' ;
T        : 'bar';

In both cases I need to know how to iterate over antExpr+ and T*, because I need to build an ArrayList containing each element they match. My real grammar is more complex, but I think this example explains what I need. Thank you!

davidbuzatto asked Apr 30 '12 17:04



1 Answer

Production rules in ANTLR can return one or more values, which you can reference inside a loop (a (...)* or (...)+). So, let's say you want to print the text of each T that the antExpr rule matches. This could be done like this:

expr
 : (antExpr {System.out.println($antExpr.str);} )+
 ;

antExpr returns [String str]
 : '{' T '}' {$str = $T.text;}
 ;

T : 'foo';

The same principle holds for example grammar #2:

expr     : antExpr; 
antExpr  : '{' (T {System.out.println($T.text);} )* '}' ;
T        : 'bar';

EDIT

Note that you're not restricted to returning a single value. Running the parser generated from:

grammar T;  

parse
 : ids {System.out.println($ids.firstId + "\n" + $ids.allIds);}
 ;

ids returns [String firstId, List<String> allIds]
@init{$allIds = new ArrayList<String>();}
@after{$firstId = $allIds.get(0);}
 : (ID {$allIds.add($ID.text);})+
 ;

ID    : ('a'..'z' | 'A'..'Z')+;
SPACE : ' ' {skip();};

on the input "aaa bbb ccc" would print the following:

aaa
[aaa, bbb, ccc]
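
The same accumulate-in-a-return-value pattern could be applied to grammar #1 from the question to build the ArrayList directly, instead of printing each match. The following is only a sketch, assuming ANTLR 3 syntax with the Java target (as in the grammars above); the grammar name Collect, the return value items, and the SPACE rule are illustrative names, not part of the original question:

```antlr
grammar Collect;

// Collects the text of every antExpr matched by the (...)+ loop.
expr returns [List<String> items]
@init{$items = new ArrayList<String>();}   // runs once, before the loop starts
 : (antExpr {$items.add($antExpr.str);})+  // runs once per matched antExpr
 ;

// Hands the text of the enclosed T token back to the caller.
antExpr returns [String str]
 : '{' T '}' {$str = $T.text;}
 ;

T     : 'foo';
SPACE : ' ' {skip();};                     // assumption: spaces are ignored
```

For grammar #2 the loop sits inside antExpr itself, so the list would be filled there with (T {$items.add($T.text);})*, just as the ids rule above does with ID.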
Bart Kiers answered Oct 17 '22 16:10