
C#, ANTLR, ECMAScript grammar troubles

Tags:

antlr3

I'm trying to parse JavaScript (ECMAScript) with C#.

I found the following instructions on how to create a new project: http://www.antlr.org/wiki/pages/viewpage.action?pageId=557075

So I've downloaded ANTLRWorks, ANTLR v3, unpacked ANTLR, created a VS2010 project (.NET4), added references, checked and generated the grammar.

Then I received a lot of compilation errors:

The type or namespace name 'AstParserRuleReturnScope' could not be found (are you missing a using directive or an assembly reference?)

The type or namespace name 'GrammarRule' could not be found (are you missing a using directive or an assembly reference?)

I searched Stack Overflow for them and found a solution: antlr c# errors when integrating into VS2008

So I downloaded the new runtime, overwrote the old one, recompiled the project, and got

The name 'HIDDEN' does not exist in the current context d:\Workspace.1\ScriptParser\ScriptParser\TestLexer.cs

OK, I changed HIDDEN to Hidden, as recommended in the following conversation: [antlr-interest] How viable is the Csharp3 target? (more specific questions)

Now I'm trying to parse the input. I found a few examples and wrote the following code:

using Antlr.Runtime;
namespace ScriptParser
{
    class Program
    {
        static void Main(string[] args)
        {
            var stream = new ANTLRStringStream("1+2");
            var lexer = new TestLexer(stream);
            var tokenStream = new CommonTokenStream(lexer);
            var parser = new TestParser(tokenStream);
            // what exactly should be here???
        }
    }
}

My goal is to parse a JavaScript file with ANTLR, but it seems it will not be as easy as I thought...

Update:

As suggested in Why are antlr3 c# parser methods private? I've modified the Test.g grammar by adding the "public" modifier before the expr rule:

public expr : mexpr (PLUS^ mexpr)* SEMI! 
; 

and then regenerated the code, replaced HIDDEN with Hidden (again), and modified the code as follows:

var stream = new ANTLRStringStream("1+2");
var lexer = new TestLexer(stream);
var tokenStream = new CommonTokenStream(lexer);
var parser = new TestParser(tokenStream);
var result = parser.expr();
var tree = (CommonTree)result.Tree;

And now it crashes on the line

root_0 = (object)adaptor.Nil(); 

in the following generated code

try { DebugEnterRule(GrammarFileName, "expr");
DebugLocation(7, 0);
try
{
    // d:\\Workspace.1\\ScriptParser\\ScriptParser\\Test.g:7:13: ( mexpr ( PLUS ^ mexpr )* SEMI !)
    DebugEnterAlt(1);
    // d:\\Workspace.1\\ScriptParser\\ScriptParser\\Test.g:7:15: mexpr ( PLUS ^ mexpr )* SEMI !
    {
    root_0 = (object)adaptor.Nil(); 

    DebugLocation(7, 15);
    PushFollow(Follow._mexpr_in_expr31);

with a NullReferenceException, because the adaptor is null.

I resolved it by setting the tree adaptor before calling parser.expr():

parser.TreeAdaptor = new CommonTreeAdaptor();

Update 2:

So, finally, I've started on my primary task: parsing JavaScript.

ANTLR highlights the ECMAScript grammar by Chris Lambrou.

So I generated the lexer/parser and ran it on some very simple JavaScript code:

var f = function () { };

and the parsing fails with the following output from tree.ToStringTree():

<error: var q = function () { };>
Alex Netkachov asked Feb 08 '12 17:02


1 Answer

Your grammar rule says that there should be a semicolon at the end of the expression, but the input in your Main function:

var stream = new ANTLRStringStream("1+2");

is missing a semicolon. Shouldn't it be "1+2;"?
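
For reference, here is a rough sketch of what the whole driver might look like with both fixes applied: the trailing semicolon in the input and the tree adaptor from the question's first update. It assumes TestLexer and TestParser were generated from the Test.g grammar in the question with expr made public, and that the ANTLR 3 C# runtime is referenced. I have not run it against that grammar, so treat it as a starting point rather than a verified solution.

```csharp
using System;
using Antlr.Runtime;
using Antlr.Runtime.Tree;

namespace ScriptParser
{
    class Program
    {
        static void Main(string[] args)
        {
            // The input now ends with ';' because the expr rule requires SEMI.
            var stream = new ANTLRStringStream("1+2;");
            var lexer = new TestLexer(stream);
            var tokenStream = new CommonTokenStream(lexer);
            var parser = new TestParser(tokenStream);

            // Taken from the question: without an adaptor, the generated
            // code fails with a NullReferenceException at adaptor.Nil().
            parser.TreeAdaptor = new CommonTreeAdaptor();

            // expr must be declared public in Test.g to be callable here.
            var result = parser.expr();
            var tree = (CommonTree)result.Tree;

            // Print the AST in LISP-like form to inspect the parse.
            Console.WriteLine(tree.ToStringTree());
        }
    }
}
```

If the printed tree still contains an error node, the input most likely does not match the rule being invoked, which is also worth checking for the ECMAScript grammar case.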

Isaac Betesh answered Sep 22 '22 01:09