How to define tokens that can appear in multiple lexical modes in ANTLR4?

Tags: lexer, antlr4

I am learning ANTLR4 and experimenting with lexical modes. How can I have the same token appear in multiple lexical modes? As a very simple example, say my grammar has two modes and I want to match white space and end-of-lines in both of them. How can I do that without ending up with separate tokens such as WS_MODE1 and WS_MODE2? Is there a way to reuse the same definition in both cases? I would like to get WS tokens in the output stream for all white space, irrespective of the mode. The same applies to EOL and to keywords that can appear in both modes.

asked Apr 04 '13 09:04

medhat

1 Answer

The rules have to have different names, but you can use the -> type(...) lexer command to give them the same type.

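// Default mode: this rule defines the WS token type.
WS : [ \t]+;

mode Mode1;

    // Different rule name, but emitted with token type WS.
    Mode1_WS : WS -> type(WS);

mode Mode2;

    // Same again for the second mode.
    Mode2_WS : WS -> type(WS);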

Even though Mode1_WS and Mode2_WS are not fragment rules, the code generator will see the type command and know that you reassigned their types, so it will not define tokens for them.
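Since the question also mentions EOL, here is a slightly fuller sketch of the same idea. It assumes a hypothetical lexer grammar named SharedTokens with one extra mode called Inside, entered on '<' and left on '>'. The names OPEN, CLOSE, TEXT, NAME and Inside are made up for illustration and are not from the question.

```antlr
lexer grammar SharedTokens;

// Default mode: these rules define the token types WS and EOL.
OPEN : '<' -> pushMode(Inside);   // hypothetical mode switch
WS   : [ \t]+ ;
EOL  : '\r'? '\n' ;
TEXT : ~[< \t\r\n]+ ;

mode Inside;

CLOSE      : '>' -> popMode;
// Rule names must be unique, so each gets a mode prefix,
// but the emitted token types are the shared WS and EOL.
Inside_WS  : WS  -> type(WS);
Inside_EOL : EOL -> type(EOL);
NAME       : [a-zA-Z]+ ;
```

With a grammar along these lines, a parser or a token listener should only need to refer to WS and EOL, whichever mode the lexer was in when the text was matched. A keyword shared between modes could presumably be handled the same way.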

answered Oct 24 '22 09:10

Sam Harwell
