
Is there a parser equivalent of 'fragment' marking in ANTLR4?

Tags:

antlr4

Is there a way to tell ANTLR4 to inline the parser rule?

It seems reasonable to have such a feature. After reading the book on ANTLR ("The Definitive ANTLR 4 Reference") I haven't found such a possibility, but changes might have been introduced in the 4 years since the book was released, so I guess it is better to ask here.

Consider the following piece of grammar:

file: ( item | class_decl )*;
class_decl: 'class' class_name '{' type_decl* data_decl* code_decl* '}';
type_decl: 'typedef' ('bool'|'int'|'real') type_name;
const_decl: 'const' type_name const_name;
var_decl: 'var' type_name var_name;
...
fragment item: type_decl | data_decl | code_decl;
fragment data_decl: const_decl | var_decl;
fragment code_decl: function_decl | procedure_decl;
fragment class_name: ID;
fragment type_name: ID;
fragment const_name: ID;
fragment var_name: ID;

The rules marked as fragment are there for clarity/documentation and reusability. From a syntax point of view, however, it is (for example) really a var_decl that is the actual direct element of file or class_decl, and I'd like to have that reflected in the contexts created by the parser. All the intermediate contexts created for item, data_decl, etc. are superfluous: they needlessly take up space and bind the visitor to the organizational structure of the grammar instead of its actual meaning.

To sum up, I'd expect ANTLR to turn the above grammar into the following before generating the parser:

file: ( type_decl | const_decl | var_decl | function_decl | procedure_decl | class_decl )*;
class_decl: 'class' ID '{' type_decl* ( const_decl | var_decl )* ( function_decl | procedure_decl )* '}';
type_decl: 'typedef' ('bool'|'int'|'real') ID;
const_decl: 'const' ID ID;
var_decl: 'var' ID ID;
...
ABW asked Jul 19 '17 12:07



1 Answer

No, there is no such thing for parser rules. You could raise an issue/RFE in ANTLR's GitHub repo for such a feature: https://github.com/antlr/antlr4/issues
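Until something like that exists, one possible workaround is to preprocess the grammar text before handing it to the ANTLR tool. The following is only a minimal sketch of that idea, not an ANTLR feature: the function name `inline_fragments` is hypothetical, and it assumes one rule per line, no recursion among the rules marked `fragment`, and no `:` or `;` inside string literals. Lines that do not look like a rule (such as `...`) are dropped.

```python
import re

# Assumed convention (not ANTLR syntax): a parser rule prefixed with
# 'fragment' should be inlined and removed from the output grammar.
RULE = re.compile(r"^(fragment\s+)?(\w+)\s*:\s*(.*?)\s*;$")
# Matches either a quoted literal (left untouched) or a rule/token name.
TOKEN = re.compile(r"'[^']*'|\w+")


def inline_fragments(grammar):
    """Return the grammar with every 'fragment' parser rule inlined."""
    rules = []       # (name, body) of ordinary rules, in source order
    fragments = {}   # name -> body of rules marked 'fragment'
    for line in grammar.splitlines():
        match = RULE.match(line.strip())
        if match is None:
            continue  # not a single-line rule; skipped in this sketch
        marker, name, body = match.groups()
        if marker:
            fragments[name] = body
        else:
            rules.append((name, body))

    def expand(body, seen=()):
        def substitute(m):
            word = m.group(0)
            if word not in fragments:
                return word  # literal, token, or ordinary rule reference
            if word in seen:
                raise ValueError(f"recursive fragment rule: {word}")
            inner = expand(fragments[word], seen + (word,))
            # Parenthesize alternatives so suffixes like '*' still apply
            # to the whole inlined body.
            return f"( {inner} )" if "|" in inner else inner
        return TOKEN.sub(substitute, body)

    return "\n".join(f"{name}: {expand(body)};" for name, body in rules)


sample = """\
file: ( item | class_decl )*;
var_decl: 'var' type_name var_name;
fragment item: type_decl | data_decl;
fragment data_decl: const_decl | var_decl;
fragment type_name: ID;
fragment var_name: ID;
"""
print(inline_fragments(sample))
```

With this sample, `var_decl` comes out as `'var' ID ID`, and `file` refers directly to `type_decl`, `const_decl`, `var_decl` and `class_decl`, with some redundant parentheses. The generated parser then has no contexts for `item` or `data_decl`, at the cost of maintaining a separate build step.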

Bart Kiers answered Oct 10 '22 16:10