ANTLR4: ignore white spaces in the input but not those in string literals

Question

I have a simple grammar as follows:

grammar SampleConfig;

line: ID (WS)* '=' (WS)* string;

ID: [a-zA-Z]+;
string: '"' (ESC|.)*? '"' ;
ESC : '\\"' | '\\\\' ; // 2-char sequences \" and \\
WS: [ \t]+ -> skip;

The spaces in the input are completely ignored, including those in the string literal.

final String input = "key = \"value with spaces in between\"";
final SampleConfigLexer l = new SampleConfigLexer(new ANTLRInputStream(input));
final SampleConfigParser p = new SampleConfigParser(new CommonTokenStream(l));
final LineContext context = p.line();
System.out.println(context.getChildCount() + ": " + context.getText());

This prints the following output:

3: key="valuewithspacesinbetween"

But I expected the whitespace inside the string literal to be retained, i.e.

3: key="value with spaces in between"

Is it possible to correct the grammar to achieve this behavior or should I just override CommonTokenStream to ignore whitespace during the parsing process?

Bart Kiers · Accepted Answer

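You shouldn't expect any spaces in parser rules, since you're skipping them in your lexer. Because string is a parser rule, it is matched against the token stream, and the WS tokens have already been discarded by the time the parser sees it.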

Either remove the skip command or make string a lexer rule:

STRING : '"' ( '\\' [\\"] | ~[\\"\r\n] )* '"';
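
For reference, here is a sketch of how the whole grammar might look with the second option applied. It assumes the rest of the original grammar stays as posted; dropping the (WS)* references from the line rule is an extra change made here, since skipped tokens never reach the parser and those references can never match anything.

```antlr
grammar SampleConfig;

// WS is skipped by the lexer, so the parser rule does not mention it.
line   : ID '=' STRING ;

ID     : [a-zA-Z]+ ;

// Lexer rule: the whole literal, spaces included, is a single token.
// Accepts the escapes \" and \\ and rejects raw line breaks inside the literal.
STRING : '"' ( '\\' [\\"] | ~[\\"\r\n] )* '"' ;

// Whitespace between tokens is still discarded.
WS     : [ \t]+ -> skip ;
```

With this version the lexer consumes everything between the quotes before the WS rule gets a chance to match, so the test program from the question should see three children (ID, '=', STRING) with the spaces inside the literal preserved.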
