I'm trying to implement an expression/formula language in ANTLR4 and I'm having a problem with whitespace handling. In most cases I don't care about whitespace, so I have the "standard" lexer rule that sends it to the HIDDEN channel, i.e.:
// Whitespace
WS
: ( ' ' | '\t' |'\r' | '\n' ) -> channel(HIDDEN)
;
However, I have one operator which doesn't allow whitespace either before or after it, and I can't see how to handle that without changing the WS lexer rule to leave the whitespace in the default channel and adding explicit WS? terms to all of my other parser rules (there are quite a lot of them).
As a simplified example, I created the following grammar for an imaginary predicate language:
grammar Logik;
/*
* Parser Rules
*/
ruleExpression
: orExpression
;
orExpression
: andExpression ( 'OR' andExpression)*
;
andExpression
: primaryExpression ( 'AND' primaryExpression)*
;
primaryExpression
: variableExpression
| '(' ruleExpression ')'
;
variableExpression
: IDENTIFIER ( '.' IDENTIFIER )*
;
/*
* Lexer Rules
*/
IDENTIFIER
: LETTER LETTERORDIGIT*
;
fragment LETTER : [a-zA-Z_];
fragment LETTERORDIGIT : [a-zA-Z0-9_];
// Whitespace
WS
: ( ' ' | '\t' |'\r' | '\n' ) -> channel(HIDDEN)
;
As it stands, this parses A OR B AND C.D and A OR B AND C. D successfully. What I need is for the '.' operator to not allow whitespace, so that the second expression isn't valid.
You can inspect tokens on other channels from a semantic predicate, like this:
variableExpression
: IDENTIFIER ( '.' {_input.get(_input.index() -1).getType() != WS}? IDENTIFIER )*
;
With this predicate, A OR B AND C.D is OK and A OR B AND C. D will print an error.
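Note that the predicate above only looks at the token immediately before the upcoming IDENTIFIER, so it rejects whitespace after the '.' but would still accept C .D. Since the question asks for no whitespace on either side, a possible variant is sketched below. It assumes the Java target and the same WS rule as in the question, and it has not been tested against the full grammar. The idea is that _input.LT(-1) should be the '.' token that was just matched, and getTokenIndex() gives its position in the full token buffer, hidden tokens included.

```antlr
variableExpression
    : IDENTIFIER
      ( '.'
        // LT(-1) is the '.' just consumed; the buffer entry before it must not be WS
        {_input.get(_input.LT(-1).getTokenIndex() - 1).getType() != WS}?
        // the buffer entry right before the upcoming IDENTIFIER must not be WS
        {_input.get(_input.index() - 1).getType() != WS}?
        IDENTIFIER
      )*
    ;
```

Both predicates sit after the '.' on purpose, so they act as plain checks while parsing rather than taking part in the decision whether to enter the loop. The index arithmetic is safe here because a '.' in this rule always follows an IDENTIFIER, so its token index is never 0.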