
Adding comments to the ANTLR Java 8 grammar

I want comments to be recorded in the AST when using the Java8 grammar for ANTLR (https://github.com/antlr/grammars-v4/blob/master/java8/Java8.g4). Nothing needs to be done with them; they only need to be stored for later reproduction.

I.e., I want to read a Java source file into an AST and eventually output it again, comments included.

Is there a simple adjustment to the grammar that would allow this, or is my naive idea of having to integrate comments into each expression the sad truth of the matter? If there is a simple adjustment, what is it? The grammar currently discards comments:

COMMENT
    :   '/*' .*? '*/' -> skip
    ;

LINE_COMMENT
    :   '//' ~[\r\n]* -> skip
    ;

mawalker asked Oct 18 '22 08:10


1 Answer

From what I can see, you can keep the comments in their own channel by adding this to the grammar:

@lexer::members {
    public static final int WHITESPACE = 1;
    public static final int COMMENTS = 2;
}

and changing the comment rules to this:

COMMENT
    : '/*' .*? '*/' -> channel(COMMENTS)
    ;

LINE_COMMENT
    : '//' ~[\r\n]* -> channel(COMMENTS)
    ;

Source: https://stackoverflow.com/a/17960734/2801237

The official documentation mentions this briefly (although the author's book appears to be the real documentation):

https://github.com/antlr/antlr4/blob/master/doc/grammars.md

and (one version of) the book says:

you can send different tokens to the parser on different channels. For example, you might want whitespace and regular comments on one channel and Javadoc comments on another when parsing Java
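
To make the channel idea concrete, here is a minimal, self-contained Java sketch. It does not use the ANTLR runtime: `ChannelSketch`, `Tok`, `lex`, `onChannel` and `reproduce` are hypothetical names for a toy lexer, and only the channel numbers mirror the constants added to the grammar above. It is meant to show why `channel(...)` helps where `skip` does not: tokens tagged with a non-default channel stay in the token list, so the source can be written back out, while a parser that reads only channel 0 never sees them.

```java
import java.util.ArrayList;
import java.util.List;

public class ChannelSketch {
    static final int DEFAULT = 0;     // the only channel a parser would consume
    static final int WHITESPACE = 1;  // same numbers as the grammar members above
    static final int COMMENTS = 2;

    // A token is just its text plus the channel it was emitted on.
    static final class Tok {
        final String text;
        final int channel;
        Tok(String text, int channel) { this.text = text; this.channel = channel; }
    }

    // Toy lexer (an assumption, far simpler than Java8.g4): every chunk of
    // input is kept and tagged with a channel; nothing is dropped.
    static List<Tok> lex(String src) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            int start = i;
            int channel;
            if (src.startsWith("/*", i)) {                 // block comment
                int end = src.indexOf("*/", i + 2);
                i = (end < 0) ? src.length() : end + 2;
                channel = COMMENTS;
            } else if (src.startsWith("//", i)) {          // line comment, up to the newline
                while (i < src.length() && src.charAt(i) != '\n' && src.charAt(i) != '\r') i++;
                channel = COMMENTS;
            } else if (Character.isWhitespace(src.charAt(i))) {
                while (i < src.length() && Character.isWhitespace(src.charAt(i))) i++;
                channel = WHITESPACE;
            } else {                                       // anything else is "real" input
                while (i < src.length() && !Character.isWhitespace(src.charAt(i))
                        && !src.startsWith("/*", i) && !src.startsWith("//", i)) i++;
                channel = DEFAULT;
            }
            out.add(new Tok(src.substring(start, i), channel));
        }
        return out;
    }

    // What a consumer that listens to a single channel would see.
    static List<String> onChannel(List<Tok> toks, int channel) {
        List<String> out = new ArrayList<>();
        for (Tok t : toks) if (t.channel == channel) out.add(t.text);
        return out;
    }

    // Reproduction: concatenate tokens from all channels in their original order.
    static String reproduce(List<Tok> toks) {
        StringBuilder sb = new StringBuilder();
        for (Tok t : toks) sb.append(t.text);
        return sb.toString();
    }

    public static void main(String[] args) {
        String src = "int x = 1; // counter\n/* block */ int y;";
        List<Tok> toks = lex(src);
        System.out.println("parser sees: " + onChannel(toks, DEFAULT));
        System.out.println("comments:    " + onChannel(toks, COMMENTS));
        System.out.println("round trip:  " + reproduce(toks).equals(src));
    }
}
```

With the real generated lexer the comments are not part of the parse tree itself; they stay in the token stream. As far as I know, the runtime's `BufferedTokenStream.getHiddenTokensToLeft(tokenIndex, channel)` and `getHiddenTokensToRight(...)` are the usual way to fetch them next to a given token when writing the source back out.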


These are the warnings I get when generating the parser with ANTLR. (I read that you can ignore them, but there might be a better way to do this.)

warning(155): java8comments.g4:1725:35: rule WS contains a lexer command with an unrecognized constant value; lexer interpreters may produce incorrect output

warning(155): java8comments.g4:1729:33: rule DOC_COMMENT contains a lexer command with an unrecognized constant value; lexer interpreters may produce incorrect output

warning(155): java8comments.g4:1733:31: rule COMMENT contains a lexer command with an unrecognized constant value; lexer interpreters may produce incorrect output

warning(155): java8comments.g4:1737:31: rule LINE_COMMENT contains a lexer command with an unrecognized constant value; lexer interpreters may produce incorrect output

mawalker answered Oct 21 '22 04:10