I have this lexer rule defined in my ANTLR v3 grammar file - it maths text in double quotes. I need to convert it to ANTLR v4. ANTLR compiler throws an error 'syntax error: mismatched input '@' expecting COLON while matching a lexer rule' (in @init line). Can lexer rule contain a @init block ? How this should be rewritten ?
DOUBLE_QUOTED_CHARACTERS
@init
{
int doubleQuoteMark = input.mark();
int semiColonPos = -1;
}
: ('"' WS* '"') => '"' WS* '"' { $channel = HIDDEN; }
{
RecognitionException re = new RecognitionException("Illegal empty quotes\"\"!", input);
reportError(re);
}
| '"' (options {greedy=false;}: ~('"'))+
('"'|';' { semiColonPos = input.index(); } ('\u0020'|'\t')* ('\n'|'\r'))
{
if (semiColonPos >= 0)
{
input.rewind(doubleQuoteMark);
RecognitionException re = new RecognitionException("Missing closing double quote!", input);
reportError(re);
input.consume();
}
else
{
setText(getText().substring(1, getText().length()-1));
}
}
;
Sample data:
I think this is the right way to do this.
DOUBLE_QUOTED_CHARACTERS
:
{
int doubleQuoteMark = input.mark();
int semiColonPos = -1;
}
(
('"' WS* '"') => '"' WS* '"' { $channel = HIDDEN; }
{
RecognitionException re = new RecognitionException("Illegal empty quotes\"\"!", input);
reportError(re);
}
| '"' (options {greedy=false;}: ~('"'))+
('"'|';' { semiColonPos = input.index(); } ('\u0020'|'\t')* ('\n'|'\r'))
{
if (semiColonPos >= 0)
{
input.rewind(doubleQuoteMark);
RecognitionException re = new RecognitionException("Missing closing double quote!", input);
reportError(re);
input.consume();
}
else
{
setText(getText().substring(1, getText().length()-1));
}
}
)
;
There are some other errors as well in above like WS .. => ... but I am not correcting them as part of this answer. Just to keep things simple. I took hint from here
Just to hedge against that link moving or becoming invalid after sometime, quoting the text as is:
Lexer actions can appear anywhere as of 4.2, not just at the end of the outermost alternative. The lexer executes the actions at the appropriate input position, according to the placement of the action within the rule. To execute a single action for a role that has multiple alternatives, you can enclose the alts in parentheses and put the action afterwards:
END : ('endif'|'end') {System.out.println("found an end");} ;
The action conforms to the syntax of the target language. ANTLR copies the action’s contents into the generated code verbatim; there is no translation of expressions like $x.y as there is in parser actions.
Only actions within the outermost token rule are executed. In other words, if STRING calls ESC_CHAR and ESC_CHAR has an action, that action is not executed when the lexer starts matching in STRING.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With