
How to split an ANTLR grammar file into multiple ones

I have a large grammar file that I plan to split into several smaller ones, so that I can reuse some of them in another grammar. I tried to do this but failed. Does ANTLR support this, and if so, can you point me to an example?

asked Oct 07 '15 06:10 by Nishanth Reddy



2 Answers

If you want to split the lexer and the parser into separate grammars:

Lexer:

lexer grammar HelloLexer;
Hello : 'hello' ;
ID : [a-z]+ ;             // match lower-case identifiers
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines

Parser:

parser grammar HelloParser;
options { tokenVocab=HelloLexer; }
r  : Hello ID ;      

Remember to name the files HelloLexer.g4 and HelloParser.g4, so that each file name matches its grammar name.
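
Since the goal is reuse, here is a minimal sketch of a second parser grammar that shares the same lexer through tokenVocab. The grammar name GreetingParser and its rules are hypothetical and only serve as an illustration; HelloLexer is the lexer grammar shown above.

```antlr
// File: GreetingParser.g4 (hypothetical second parser, for illustration only)
parser grammar GreetingParser;

// Reuse the token definitions generated from HelloLexer.g4
options { tokenVocab=HelloLexer; }

// One or more greetings, then end of input
greetings : greeting+ EOF ;

// Hello and ID are token names defined in HelloLexer
greeting  : Hello ID ;
```

Run ANTLR on HelloLexer.g4 before the parser grammars (or pass it first on the same command line), because tokenVocab reads the HelloLexer.tokens file that the lexer run produces.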

If you want to import a whole grammar, use the import keyword:

grammar Hello;

import OtherGrammar;

Hello : 'hello' ;
ID : [a-z]+ ;             // match lower-case identifiers
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines    
r  : Hello ID ;
answered Oct 01 '22 04:10 by XS_iceman


You did not mention the ANTLR version, so I am going to assume you are using the current one, 4.x. In ANTLR 4, grammars can be imported with the import keyword. Something like this:

File: CommonLexerRules.g4

lexer grammar CommonLexerRules;

ID  :   [a-zA-Z]+ ;
...

File: MyParser.g4

grammar MyParser;      
import CommonLexerRules; //includes all rules from lexer CommonLexerRules.g4
...

Rules in the “main grammar” override rules of the same name from imported grammars, which is how ANTLR implements grammar inheritance. See more details here: https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Grammar+Structure#GrammarStructure-GrammarImports
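
A minimal sketch of that override behaviour, using two hypothetical grammars (BaseTokens and Names are made-up names, not part of the question):

```antlr
// File: BaseTokens.g4 (hypothetical reusable lexer rules)
lexer grammar BaseTokens;

ID : [a-zA-Z]+ ;            // letters only
WS : [ \t\r\n]+ -> skip ;   // skip spaces, tabs, newlines
```

```antlr
// File: Names.g4 (hypothetical main grammar)
grammar Names;
import BaseTokens;          // pulls in ID and WS from BaseTokens.g4

names : ID+ EOF ;

// Same rule name as in BaseTokens, so this definition wins:
// identifiers may now also contain digits and underscores.
ID : [a-zA-Z_] [a-zA-Z_0-9]* ;
```

Here Names keeps the imported WS rule unchanged and replaces only ID. Note that a combined grammar can import lexer grammars only if they do not define modes.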

answered Oct 01 '22 06:10 by user3890638