
In ANTLR, how do you specify a specific number of repetitions?

Tags:

antlr

I'm using ANTLR to specify a file format whose lines cannot exceed 254 characters (excluding line endings). How do I encode this in the grammar, short of doing:

line : CHAR? CHAR? CHAR? CHAR? ... (254 times)
jjkparker asked Mar 09 '10 14:03


1 Answer

This can be handled by using a semantic predicate.

First write your grammar in such a way that it does not matter how long your lines are. An example would look like this:

grammar Test;

parse
  :  line* EOF
  ;

line
  :  Char+ (LineBreak | EOF)
  |  LineBreak // empty line!
  ;

LineBreak : '\r'? '\n' | '\r' ;
Char      : ~('\r' | '\n') ;

and then add the "predicate" to the line rule:

grammar Test;

@parser::members {
    public static void main(String[] args) throws Exception {
        String source = "abcde\nfghij\nklm\nnopqrst";
        ANTLRStringStream in = new ANTLRStringStream(source);
        TestLexer lexer = new TestLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        TestParser parser = new TestParser(tokens);
        parser.parse();
    }  
}

parse
  :  line* EOF
  ;

line
  :  (c+=Char)+ {$c.size()<=5}? (LineBreak | EOF)
  |  LineBreak // empty line!
  ;

LineBreak : '\r'? '\n' | '\r' ;
Char      : ~('\r' | '\n') ;

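The c+=Char label collects every Char token matched on the line into a list. The {$c.size()<=5}? predicate is checked after the characters have been matched: if the list holds more than 5 elements, the predicate fails and the parser reports an error. The limit of 5 is only there to keep the test input short; for the 254-character limit in the question, use {$c.size()<=254}? instead.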

I also added a main method in the parser so you can test it yourself:

// *nix/MacOSX
java -cp antlr-3.2.jar org.antlr.Tool Test.g
javac -cp antlr-3.2.jar *.java
java -cp .:antlr-3.2.jar TestParser

// Windows
java -cp antlr-3.2.jar org.antlr.Tool Test.g
javac -cp antlr-3.2.jar *.java
java -cp .;antlr-3.2.jar TestParser

which will output:

line 0:-1 rule line failed predicate: {$c.size()<=5}?
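
The failure presumably comes from the last line of the test input, nopqrst, which has 7 characters while the other lines have 5 or fewer. To make that concrete without ANTLR, here is a minimal plain-Java sketch of the same length check. The class LineLengthCheck and the method tooLongLines are hypothetical names made up for this illustration; they are not part of ANTLR or of the generated parser.

```java
import java.util.ArrayList;
import java.util.List;

public class LineLengthCheck {

    /**
     * Returns the 1-based numbers of all lines in source that are longer
     * than maxLength characters (line endings are not counted).
     * Hypothetical helper, for illustration only.
     */
    public static List<Integer> tooLongLines(String source, int maxLength) {
        List<Integer> offenders = new ArrayList<>();
        // Split on \r\n, \n or \r, mirroring the LineBreak rule of the grammar.
        // The -1 limit keeps trailing empty lines instead of dropping them.
        String[] lines = source.split("\r\n|\n|\r", -1);
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].length() > maxLength) {
                offenders.add(i + 1);
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        // Same test input and limit as in the grammar above.
        String source = "abcde\nfghij\nklm\nnopqrst";
        System.out.println(tooLongLines(source, 5)); // prints [4]
    }
}
```

Unlike the predicate, which stops at the first violation, this sketch lists every offending line, so it can serve as a quick sanity check on an input file before running the parser.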

HTH

Bart Kiers answered May 20 '23 08:05