 

Generating an Abstract Syntax Tree for Java source code using ANTLR

How can I generate an AST from Java source code using ANTLR? Any help would be appreciated.

Aboelnour asked Feb 05 '12 20:02


2 Answers

OK, here are the steps:

  1. Go to the ANTLR site and download ANTLR 3 (the .g grammars and the code below use the ANTLR 3 API).
  2. Download the Java.g and the JavaTreeParser.g files from here.
  3. Run the following commands, where antlrTool stands for the ANTLR jar you downloaded:

    java -jar antlrTool Java.g
    java -jar antlrTool JavaTreeParser.g
    
  4. Five files will be generated:

    1. Java.tokens
    2. JavaLexer.java
    3. JavaParser.java
    4. JavaTreeParser.java
    5. JavaTreeParser.tokens

Use this Java code to generate the Abstract Syntax Tree and print it:

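    import org.antlr.runtime.ANTLRStringStream;
    import org.antlr.runtime.CharStream;
    import org.antlr.runtime.CommonTokenStream;
    import org.antlr.runtime.RecognitionException;
    import org.antlr.runtime.RuleReturnScope;
    import org.antlr.runtime.tree.CommonTree;
    import org.antlr.runtime.tree.CommonTreeNodeStream;

    // Main is a placeholder name; the original snippet omitted the enclosing class.
    public class Main {

        public static void main(String[] args) throws RecognitionException {
            String input = "public class HelloWord {" +
                           "public void print(String r){" +
                           "for(int i = 0;true;i+=2)" +
                           "System.out.println(r);" +
                           "}" +
                           "}";

            // Lex the source text into a token stream.
            CharStream cs = new ANTLRStringStream(input);
            JavaLexer jl = new JavaLexer(cs);
            CommonTokenStream tokens = new CommonTokenStream();
            tokens.setTokenSource(jl);

            // Parse; the tree returned by the start rule is the AST.
            JavaParser jp = new JavaParser(tokens);
            RuleReturnScope result = jp.compilationUnit();
            CommonTree t = (CommonTree) result.getTree();

            // Set up a tree walker over the AST (created here but not run).
            CommonTreeNodeStream nodes = new CommonTreeNodeStream(t);
            nodes.setTokenStream(tokens);
            JavaTreeParser walker = new JavaTreeParser(nodes);

            System.out.println("\nWalk tree:\n");
            printTree(t, 0);

            System.out.println(tokens.toString());
        }

        // Prints every descendant of t, one per line, indented by depth.
        public static void printTree(CommonTree t, int indent) {
            if (t != null) {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < indent; i++)
                    sb.append("   ");
                for (int i = 0; i < t.getChildCount(); i++) {
                    System.out.println(sb.toString() + t.getChild(i).toString());
                    printTree((CommonTree) t.getChild(i), indent + 1);
                }
            }
        }
    }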
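
To see the shape of the output that printTree produces without generating the lexer and parser first, here is a minimal sketch of the same recursion on a hand-built tree. The Node class and the labels (CLASS, METHOD, BLOCK) are hypothetical stand-ins for CommonTree and for whatever token names the grammar actually emits; only the traversal logic is taken from the answer above.

```java
import java.util.ArrayList;
import java.util.List;

class IndentedTreeSketch {

    // Hypothetical stand-in for ANTLR 3's CommonTree.
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Same traversal as printTree: the root itself is not printed; each child
    // is written at the current indent and then visited with indent + 1.
    static void render(Node node, int indent, StringBuilder out) {
        if (node == null) {
            return;
        }
        StringBuilder pad = new StringBuilder();
        for (int i = 0; i < indent; i++) {
            pad.append("   ");
        }
        for (Node child : node.children) {
            out.append(pad).append(child.label).append('\n');
            render(child, indent + 1, out);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("CLASS")
                .add(new Node("HelloWord"))
                .add(new Node("METHOD")
                        .add(new Node("print"))
                        .add(new Node("BLOCK")));
        StringBuilder out = new StringBuilder();
        render(root, 0, out);
        System.out.print(out);
    }
}
```

Note that, as in printTree, the root label never appears in the output; only its descendants do.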

Aboelnour answered Oct 18 '22 22:10


The steps to generate an AST from Java source using ANTLR 4 are:

  1. Install ANTLR 4; you can use this link to do that.
  2. After installation, download the Java grammar from here.
  3. Now generate Java8Lexer and Java8Parser using the command:

    antlr4 -visitor Java8.g4

  4. This will generate several files, such as Java8BaseListener.java, Java8BaseVisitor.java, Java8Lexer.java, Java8Lexer.tokens, Java8Listener.java, Java8Parser.java, Java8.tokens and Java8Visitor.java.

Use this code to generate the AST:

import java.io.File;
import java.io.IOException;

import java.nio.charset.Charset;
import java.nio.file.Files;

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.RuleContext;
import org.antlr.v4.runtime.tree.ParseTree;

public class ASTGenerator {

    public static String readFile() throws IOException {
        File file = new File("path/to/the/test/file.java");
        byte[] encoded = Files.readAllBytes(file.toPath());
        return new String(encoded, Charset.forName("UTF-8"));
    }

    public static void main(String args[]) throws IOException {
        String inputString = readFile();
        ANTLRInputStream input = new ANTLRInputStream(inputString);
        Java8Lexer lexer = new Java8Lexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        Java8Parser parser = new Java8Parser(tokens);
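        // classDeclaration() expects the input to start at a class declaration;
        // for a whole file with package or import statements, the grammar's
        // compilationUnit() rule is the usual entry point.
        ParserRuleContext ctx = parser.classDeclaration();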

        printAST(ctx, false, 0);
    }

    private static void printAST(RuleContext ctx, boolean verbose, int indentation) {
        boolean toBeIgnored = !verbose && ctx.getChildCount() == 1 && ctx.getChild(0) instanceof ParserRuleContext;

        if (!toBeIgnored) {
            String ruleName = Java8Parser.ruleNames[ctx.getRuleIndex()];
            for (int i = 0; i < indentation; i++) {
                System.out.print("  ");
            }
            System.out.println(ruleName + " -> " + ctx.getText());
        }
        for (int i = 0; i < ctx.getChildCount(); i++) {
            ParseTree element = ctx.getChild(i);
            if (element instanceof RuleContext) {
                printAST((RuleContext) element, verbose, indentation + (toBeIgnored ? 0 : 1));
            }
        }
    }
}
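
The toBeIgnored flag in printAST is what keeps the output compact: a rule node whose only child is another rule node is not printed, and its child inherits the same indentation. The following is a minimal sketch of that logic on a hand-built tree, so it runs without the generated Java8 classes. The Node class and the rule names (statement, expression, assignment, literal) are hypothetical; real chains in the Java8 grammar are much longer.

```java
import java.util.ArrayList;
import java.util.List;

class ChainCollapseSketch {

    // Hypothetical stand-in for a parse tree node; isRule is false for tokens.
    static final class Node {
        final String name;
        final boolean isRule;
        final List<Node> children = new ArrayList<>();

        Node(String name, boolean isRule) {
            this.name = name;
            this.isRule = isRule;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Same decision as printAST: skip a node that only wraps one rule child,
    // and do not increase the indentation for nodes that were skipped.
    static void render(Node node, boolean verbose, int indent, StringBuilder out) {
        boolean skip = !verbose
                && node.children.size() == 1
                && node.children.get(0).isRule;
        if (!skip) {
            for (int i = 0; i < indent; i++) {
                out.append("  ");
            }
            out.append(node.name).append('\n');
        }
        for (Node child : node.children) {
            if (child.isRule) {
                render(child, verbose, indent + (skip ? 0 : 1), out);
            }
        }
    }

    // statement -> [expression -> assignment -> literal -> "42"], ";"
    static Node sample() {
        Node literal = new Node("literal", true).add(new Node("42", false));
        Node assignment = new Node("assignment", true).add(literal);
        Node expression = new Node("expression", true).add(assignment);
        return new Node("statement", true)
                .add(expression)
                .add(new Node(";", false));
    }

    public static void main(String[] args) {
        StringBuilder compact = new StringBuilder();
        render(sample(), false, 0, compact);
        System.out.print(compact);

        StringBuilder full = new StringBuilder();
        render(sample(), true, 0, full);
        System.out.print(full);
    }
}
```

With verbose set to false, the expression and assignment wrappers disappear and literal is printed directly under statement; with verbose set to true, every rule node gets its own line.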

Once the code is written, you can build the project with Gradle, or download antlr-4.7.1-complete.jar into your project directory and compile against it.

If you want the output in a DOT file so that you can visualise the AST, you can refer to this QnA post or directly to this repository, in which I have used Gradle to build the project.
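
For a rough idea of what DOT output involves, here is a minimal sketch that serialises a hand-built tree to Graphviz DOT text. It is not the code from the linked post or repository: the Node class, the toDot method and the graph name AST are all assumptions made for illustration, and in a real project the tree would come from the ANTLR parse result instead.

```java
import java.util.ArrayList;
import java.util.List;

class DotExportSketch {

    // Hypothetical stand-in for a tree node produced by the parser.
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Builds a DOT digraph with one numbered vertex per tree node
    // and one edge per parent/child pair.
    static String toDot(Node root) {
        StringBuilder sb = new StringBuilder("digraph AST {\n");
        emit(root, new int[] {0}, sb);
        sb.append("}\n");
        return sb.toString();
    }

    // Writes the node, then its subtrees in pre-order; returns the node's id.
    static int emit(Node node, int[] counter, StringBuilder sb) {
        int id = counter[0]++;
        sb.append("  n").append(id)
          .append(" [label=\"").append(escape(node.label)).append("\"];\n");
        for (Node child : node.children) {
            int childId = emit(child, counter, sb);
            sb.append("  n").append(id).append(" -> n").append(childId).append(";\n");
        }
        return id;
    }

    // Escapes backslashes and double quotes so labels stay valid DOT strings.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Node root = new Node("classDeclaration")
                .add(new Node("methodDeclaration"))
                .add(new Node("fieldDeclaration"));
        System.out.print(toDot(root));
    }
}
```

The resulting text can be saved to a .dot file and rendered with Graphviz, for example with dot -Tpng.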

Hope this helps. :)

Satnam Sandhu answered Oct 18 '22 20:10