Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to match any symbol in ANTLR parser (not lexer)?

How to match any symbol in ANTLR parser (not lexer)? Where is the complete language description for ANTLR4 parsers?

UPDATE

Is the answer is "impossible"?

like image 599
Suzan Cioc Avatar asked May 16 '13 22:05

Suzan Cioc


People also ask

What is ANTLR lexer?

A lexer (often called a scanner) breaks up an input stream of characters into vocabulary symbols for a parser, which applies a grammatical structure to that symbol stream.

Is ANTLR LL or LR?

In computer-based language recognition, ANTLR (pronounced antler), or ANother Tool for Language Recognition, is a parser generator that uses LL(*) for parsing. ANTLR is the successor to the Purdue Compiler Construction Tool Set (PCCTS), first developed in 1989, and is under active development.

How does ANTLR parser work?

ANTLR (ANother Tool for Language Recognition) is a tool for processing structured text. It does this by giving us access to language processing primitives like lexers, grammars, and parsers as well as the runtime to process text against them. It's often used to build tools and frameworks.

What is AST in ANTLR?

ANTLR helps you build intermediate form trees, or abstract syntax trees (ASTs), by providing grammar annotations that indicate what tokens are to be treated as subtree roots, which are to be leaves, and which are to be ignored with respect to tree construction.


2 Answers

You first need to understand the roles of each part in parsing:

The lexer: this is the object that tokenizes your input string. Tokenizing means to convert a stream of input characters to an abstract token symbol (usually just a number).

The parser: this is the object that only works with tokens to determine the structure of a language. A language (written as one or more grammar files) defines the token combinations that are valid.

As you can see, the parser doesn't even know what a letter is. It only knows tokens. So your question is already wrong.

Having said that it would probably help to know why you want to skip individual input letters in your parser. Looks like your base concept needs adjustments.

like image 132
Mike Lischke Avatar answered Nov 10 '22 11:11

Mike Lischke


It depends what you mean by "symbol". To match any token inside a parser rule, use the . (DOT) meta char. If you're trying to match any character inside a parser rule, then you're out of luck, there is a strict separation between parser- and lexer rules in ANTLR. It is not possible to match any character inside a parser rule.

like image 28
Bart Kiers Avatar answered Nov 10 '22 09:11

Bart Kiers