Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to understand ATN graph generated for ANTLR grammar?

I have 2 simple lexer rules in my ANTLR4 grammar:

fragment Attrs : '.' ARCH; 
fragment ARCH : 'IA32' | 'X64' | 'IPF' | 'EBC' | 'common';

The generated ATN with ANTLR4.7 is like this (Visual Studio Code):

enter image description here

I searched some references about "ATN", such as this one.

It's beautiful but I don't understand it:

  • What does the number and label in the node mean?
  • What does the epsilon symbol on the arrow line mean?
  • What do the grey and red nodes mean?
like image 313
smwikipedia Avatar asked Aug 03 '17 01:08

smwikipedia


People also ask

What is ATN in Antlr?

Augmented Transition Networks, the description in the context of ANTLR could be found e.g. here.

How do you write an Antlr grammar?

Add the details like where you want the output like the files for Lexer and Parser, the location of the grammar file (. g4 file) . Leave the Grammar file encoding, if you have not used any encoding. Add the package name that you want to see in the Java file in which the lexer and parser files will be created.

What is an Antlr grammar?

ANTLR (ANother Tool for Language Recognition) is a tool for processing structured text. It does this by giving us access to language processing primitives like lexers, grammars, and parsers as well as the runtime to process text against them. It's often used to build tools and frameworks.


1 Answers

The ATN graph in the image represents a single rule, consisting of ATN states produced by the parser generator. These are mostly interesting for developers who want to write code that uses the ATN in some way (e.g. for code completion). Usually you don't need this information for your grammar work. It might also come in handy to see how the ATN graph changes when you change something in the grammar (for fine tuning of your grammar).

What you see in the image are circles for ATN states, with their unique ID (no 2 states share the same state number) along with a label indicating the state's type (rule start/end state, basic state etc.). You can get a bit more info by hovering with the mouse over a state until you get the tooltip. The rounded rectangle describes a rule called by this rule.

Most states are connected via transitions which describe the direction a parser has to walk when executing this state machine. A transition can be executed without consuming input (which is then called an epsilon transition, marked by that little epsilon symbol) or requires certain input to match (which is denoted as label in the ATN and also attached to the transition arrow in the graph image).

like image 76
Mike Lischke Avatar answered Sep 28 '22 12:09

Mike Lischke