ANTLR4 using HIDDEN channel causes errors while using skip does not

In my grammar I use:

WS: [ \t\r\n]+ -> skip;

When I change this to use the HIDDEN channel:

WS: [ \t\r\n]+ -> channel(HIDDEN);

I receive errors (extraneous input ' '...) that I did not receive while using 'skip'. I thought that skipping a token and sending it to a channel made no difference to the content passed to the parser.

Below you can find a code excerpt in which the parser is executed:

    CharStream charStream = new ANTLRInputStream(formulaString);
    FormulaLexer lexer = new FormulaLexer(charStream);
    BufferedTokenStream tokens = new BufferedTokenStream(lexer);
    FormulaParser parser = new FormulaParser(tokens);
    ParseTree tree = parser.startRule();

    StartRuleVisitor startRuleVisitor = new StartRuleVisitor();
    startRuleVisitor.visit(tree);

    VariableVisitor variableVisitor = new VariableVisitor(tokens);
    variableVisitor.visit(tree);

And the grammar itself:

grammar Formula;


startRule
   : variable RELATION_OPERATOR integer
   ;

integer
   : DIGIT+
   ;

identifier
   : (LETTER | DIGIT) ( DIGIT | LETTER | '_' | '.')+
   ;

tableId
   : 'T_' (identifier | WILDCARD)
   ;

rowId
   : 'R_' (identifier | WILDCARD)
   ;

columnId
   : 'C_' (identifier | WILDCARD)
   ;

sheetId
   : 'S_' (identifier | WILDCARD)
   ;

variable
   : L_CURLY_BRACKET cellIdComponent (COMMA cellIdComponent)+ R_CURLY_BRACKET
   ;

cellIdComponent
   : tableId | rowId | columnId | sheetId
   ;

COMMA
   : ','
   ;

RELATION_OPERATOR
   : EQ
   ;

WILDCARD
   : 'NNN'
   ;

L_CURLY_BRACKET
   : '{'
   ;

R_CURLY_BRACKET
   : '}'
   ;


LETTER
   : ('a' .. 'z') | ('A' .. 'Z')
   ;

DIGIT
   : ('0' .. '9')
   ;


EQ
   : '='
   | 'EQ' | 'eq'
   ;


WS
   : [ \t\r\n]+ -> channel(HIDDEN)
   ;

The string I try to parse:

{T_C 00.01, R_010,   C_010} = 1

The output I get with channel(HIDDEN) used:

line 1:4 extraneous input ' ' expecting {'_', '.', LETTER, DIGIT}
line 1:11 extraneous input ' ' expecting {'T_', 'R_', 'C_', 'S_'}
line 1:18 extraneous input '   ' expecting {'T_', 'R_', 'C_', 'S_'}
line 1:27 extraneous input ' ' expecting RELATION_OPERATOR
line 1:29 extraneous input ' ' expecting DIGIT

But if I change channel(HIDDEN) to 'skip' there are no errors.

What is more, I have observed that for a more complex grammar than this one I get 'no viable alternative at input...' if I use channel(HIDDEN), and once again the error disappears with 'skip'.

Do you know what may be the cause of it?

r2mzes asked Oct 29 '22 12:10


1 Answer

You should use CommonTokenStream instead of BufferedTokenStream. See the BufferedTokenStream description on GitHub:

This token stream ignores the value of {@link Token#getChannel}. If your parser requires the token stream filter tokens to only those on a particular channel, such as {@link Token#DEFAULT_CHANNEL} or {@link Token#HIDDEN_CHANNEL}, use a filtering token stream such a {@link CommonTokenStream}.
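To make the difference concrete, here is a minimal sketch in plain Java that does not use ANTLR at all. It is a toy model, not the library's implementation: `ChannelFilterDemo`, `Tok`, `lex`, `unfiltered` and `filtered` are hypothetical names invented for illustration, and the toy lexer simply emits one token per non-whitespace character. It only shows why a stream that ignores channels still hands whitespace to the parser, while a stream that filters by channel does not.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (NOT ANTLR code): all names here are hypothetical.
class ChannelFilterDemo {
    // Channel numbers chosen to mirror the default/hidden split.
    static final int DEFAULT_CHANNEL = 0;
    static final int HIDDEN_CHANNEL = 1;

    /** A token is just its text plus the channel the lexer assigned. */
    static final class Tok {
        final String text;
        final int channel;
        Tok(String text, int channel) { this.text = text; this.channel = channel; }
    }

    /** Toy lexer: whitespace runs go to the hidden channel, every other
     *  character becomes a single-character token on the default channel. */
    static List<Tok> lex(String input) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            if (Character.isWhitespace(input.charAt(i))) {
                while (i < input.length() && Character.isWhitespace(input.charAt(i))) i++;
                out.add(new Tok(input.substring(start, i), HIDDEN_CHANNEL));
            } else {
                i++;
                out.add(new Tok(input.substring(start, i), DEFAULT_CHANNEL));
            }
        }
        return out;
    }

    /** What a channel-ignoring stream gives the parser: every token. */
    static List<String> unfiltered(List<Tok> tokens) {
        List<String> seen = new ArrayList<>();
        for (Tok t : tokens) seen.add(t.text);
        return seen;
    }

    /** What a channel-filtering stream gives the parser: one channel only. */
    static List<String> filtered(List<Tok> tokens, int channel) {
        List<String> seen = new ArrayList<>();
        for (Tok t : tokens) {
            if (t.channel == channel) seen.add(t.text);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Tok> tokens = lex("} = 1");
        // Five entries: the two whitespace tokens are still visible.
        System.out.println(unfiltered(tokens));
        // Three entries: the hidden-channel tokens are filtered out.
        System.out.println(filtered(tokens, DEFAULT_CHANNEL));
    }
}
```

In the question's code the corresponding change would be to construct the token stream with `new CommonTokenStream(lexer)` instead of `new BufferedTokenStream(lexer)`. With `-> skip` the lexer never emits the whitespace tokens in the first place, which would explain why the choice of token stream made no difference there.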

Ivan Kochurkin answered Nov 11 '22 14:11