Using ANTLR 3.3?

Tags:

I'm trying to get started with ANTLR and C# but I'm finding it extraordinarily difficult due to the lack of documentation/tutorials. I've found a couple half-hearted tutorials for older versions, but it seems there have been some major changes to the API since.

Can anyone give me a simple example of how to create a grammar and use it in a short program?

I've finally managed to get my grammar file compiling into a lexer and parser, and I can get those compiled and running in Visual Studio (after having to recompile the ANTLR source because the C# binaries seem to be out of date too! -- not to mention the source doesn't compile without some fixes), but I still have no idea what to do with my parser/lexer classes. Supposedly it can produce an AST given some input...and then I should be able to do something fancy with that.

870

asked Dec 09 '10 08:12

mpen

2 Answers

Let's say you want to parse simple expressions consisting of the following tokens:

- subtraction (also unary);
+ addition;
* multiplication;
/ division;
(...) grouping (sub) expressions;
integer and decimal numbers.

An ANTLR grammar could look like this:

Click to copy

grammar Expression;  options {   language=CSharp2; }  parse   :  exp EOF    ;  exp   :  addExp   ;  addExp   :  mulExp (('+' | '-') mulExp)*   ;  mulExp   :  unaryExp (('*' | '/') unaryExp)*   ;  unaryExp   :  '-' atom    |  atom   ;  atom   :  Number   |  '(' exp ')'    ;  Number   :  ('0'..'9')+ ('.' ('0'..'9')+)?   ;

Now to create a proper AST, you add output=AST; in your options { ... } section, and you mix some "tree operators" in your grammar defining which tokens should be the root of a tree. There are two ways to do this:

add ^ and ! after your tokens. The ^ causes the token to become a root and the ! excludes the token from the ast;
by using "rewrite rules": ... -> ^(Root Child Child ...).

Take the rule foo for example:

Click to copy

foo   :  TokenA TokenB TokenC TokenD   ;

and let's say you want TokenB to become the root and TokenA and TokenC to become its children, and you want to exclude TokenD from the tree. Here's how to do that using option 1:

Click to copy

foo   :  TokenA TokenB^ TokenC TokenD!   ;

and here's how to do that using option 2:

Click to copy

foo   :  TokenA TokenB TokenC TokenD -> ^(TokenB TokenA TokenC)   ;

So, here's the grammar with the tree operators in it:

Click to copy

grammar Expression;  options {   language=CSharp2;   output=AST; }  tokens {   ROOT;   UNARY_MIN; }  @parser::namespace { Demo.Antlr } @lexer::namespace { Demo.Antlr }  parse   :  exp EOF -> ^(ROOT exp)   ;  exp   :  addExp   ;  addExp   :  mulExp (('+' | '-')^ mulExp)*   ;  mulExp   :  unaryExp (('*' | '/')^ unaryExp)*   ;  unaryExp   :  '-' atom -> ^(UNARY_MIN atom)   |  atom   ;  atom   :  Number   |  '(' exp ')' -> exp   ;  Number   :  ('0'..'9')+ ('.' ('0'..'9')+)?   ;  Space    :  (' ' | '\t' | '\r' | '\n'){Skip();}   ;

I also added a Space rule to ignore any white spaces in the source file and added some extra tokens and namespaces for the lexer and parser. Note that the order is important (options { ... } first, then tokens { ... } and finally the @... {}-namespace declarations).

That's it.

Now generate a lexer and parser from your grammar file:

Click to copy

 java -cp antlr-3.2.jar org.antlr.Tool Expression.g

and put the .cs files in your project together with the C# runtime DLL's.

You can test it using the following class:

Click to copy

using System; using Antlr.Runtime; using Antlr.Runtime.Tree; using Antlr.StringTemplate;  namespace Demo.Antlr {   class MainClass   {     public static void Preorder(ITree Tree, int Depth)      {       if(Tree == null)       {         return;       }        for (int i = 0; i < Depth; i++)       {         Console.Write("  ");       }        Console.WriteLine(Tree);        Preorder(Tree.GetChild(0), Depth + 1);       Preorder(Tree.GetChild(1), Depth + 1);     }      public static void Main (string[] args)     {       ANTLRStringStream Input = new ANTLRStringStream("(12.5 + 56 / -7) * 0.5");        ExpressionLexer Lexer = new ExpressionLexer(Input);       CommonTokenStream Tokens = new CommonTokenStream(Lexer);       ExpressionParser Parser = new ExpressionParser(Tokens);       ExpressionParser.parse_return ParseReturn = Parser.parse();       CommonTree Tree = (CommonTree)ParseReturn.Tree;       Preorder(Tree, 0);     }   } }

which produces the following output:

Click to copy

 ROOT   *     +       12.5       /         56         UNARY_MIN           7     0.5

which corresponds to the following AST:

alt text

(diagram created using graph.gafol.net)

Note that ANTLR 3.3 has just been released and the CSharp target is "in beta". That's why I used ANTLR 3.2 in my example.

In case of rather simple languages (like my example above), you could also evaluate the result on the fly without creating an AST. You can do that by embedding plain C# code inside your grammar file, and letting your parser rules return a specific value.

Here's an example:

Click to copy

grammar Expression;  options {   language=CSharp2; }  @parser::namespace { Demo.Antlr } @lexer::namespace { Demo.Antlr }  parse returns [double value]   :  exp EOF {$value = $exp.value;}   ;  exp returns [double value]   :  addExp {$value = $addExp.value;}   ;  addExp returns [double value]   :  a=mulExp       {$value = $a.value;}      ( '+' b=mulExp {$value += $b.value;}      | '-' b=mulExp {$value -= $b.value;}      )*   ;  mulExp returns [double value]   :  a=unaryExp       {$value = $a.value;}      ( '*' b=unaryExp {$value *= $b.value;}      | '/' b=unaryExp {$value /= $b.value;}      )*   ;  unaryExp returns [double value]   :  '-' atom {$value = -1.0 * $atom.value;}   |  atom     {$value = $atom.value;}   ;  atom returns [double value]   :  Number      {$value = Double.Parse($Number.Text, CultureInfo.InvariantCulture);}   |  '(' exp ')' {$value = $exp.value;}   ;  Number   :  ('0'..'9')+ ('.' ('0'..'9')+)?   ;  Space    :  (' ' | '\t' | '\r' | '\n'){Skip();}   ;

which can be tested with the class:

Click to copy

using System; using Antlr.Runtime; using Antlr.Runtime.Tree; using Antlr.StringTemplate;  namespace Demo.Antlr {   class MainClass   {     public static void Main (string[] args)     {       string expression = "(12.5 + 56 / -7) * 0.5";       ANTLRStringStream Input = new ANTLRStringStream(expression);         ExpressionLexer Lexer = new ExpressionLexer(Input);       CommonTokenStream Tokens = new CommonTokenStream(Lexer);       ExpressionParser Parser = new ExpressionParser(Tokens);       Console.WriteLine(expression + " = " + Parser.parse());     }   } }

and produces the following output:

Click to copy

 (12.5 + 56 / -7) * 0.5 = 2.25

EDIT

In the comments, Ralph wrote:

Tip for those using Visual Studio: you can put something like java -cp "$(ProjectDir)antlr-3.2.jar" org.antlr.Tool "$(ProjectDir)Expression.g" in the pre-build events, then you can just modify your grammar and run the project without having to worry about rebuilding the lexer/parser.

149

answered Sep 17 '22 19:09

Bart Kiers

Have you looked at Irony.net? It's aimed at .Net and therefore works really well, has proper tooling, proper examples and just works. The only problem is that it is still a bit 'alpha-ish' so documentation and versions seem to change a bit, but if you just stick with a version, you can do nifty things.

p.s. sorry for the bad answer where you ask a problem about X and someone suggests something different using Y ;^)

answered Sep 20 '22 19:09

Toad

Related questions
                            
                                Can I get command line arguments of other processes from .NET/C#?
                            
                                Auto-increment on partial primary key with Entity Framework Core
                            
                                How do I scroll a RichTextBox to the bottom?
                            
                                WPF Window with transparent background containing opaque controls [duplicate]
                            
                                Performance of calling delegates vs methods
                            
                                Difference between string and StringBuilder in C#
                            
                                Convert Word doc and docx format to PDF in .NET Core without Microsoft.Office.Interop
                            
                                Which unit testing framework? [closed]
                            
                                Calling SignalR hub clients from elsewhere in system
                            
                                Convert 20121004 (yyyyMMdd) to a valid date time?
                            
                                How to pass a null variable to a SQL Stored Procedure from C#.net code
                            
                                How to compare dates in c#
                            
                                How can I get scrollbars on Picturebox
                            
                                How do I force full post-back from a button within an UpdatePanel?
                            
                                What is the best or most interesting use of Extension Methods you've seen? [closed]
                            
                                How can I pass a runtime parameter as part of the dependency resolution?
                            
                                String.Replace(char, char) method in C#
                            
                                LINQ to SQL Where Clause Optional Criteria
                            
                                Razor Views not seeing System.Web.Mvc.HtmlHelper
                            
                                Autofac: Resolve all instances of a Type

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Using ANTLR 3.3?

Tags:

c#

parsing

antlr

mpen

People also ask

2 Answers

EDIT

Bart Kiers

Toad

Recent Activity

Donate For Us