Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generating java code parse trees and evaluating it for testing

We have a need to generate Java source code. We do this by modeling the abstract syntax tree and have a tree walker that generate the actual source code text. This far all good.

Since my AST code is a bit old, it does not have support for annotations and generics. So I'm looking around for open projects to use for future projects with code generation needs. And this is where the actual issue comes. We want to test that the code generated has the correct behavior.

Here is where I got the idea to actually evaluate the AST instead of generating the java source code, compile it, and run tests against that code. An evaluator would speed up the unit tests, and one could evaluate smaller pieces of generated code, such as only a method, making the "units" more reasonable.

So far i have found the com.sun.codemodel project that seems quite nice as for being a modern (support for java5 and 6 features) AST based code-generating solution.

Anyone know if there is another project that would allow me to evaluate pieces of AST directly (such as a single generated method)?

like image 654
Christian Avatar asked Sep 16 '09 09:09

Christian


People also ask

What is parse tree in Java?

A parse tree is a representation of the code closer to the concrete syntax. It shows many details of the implementation of the parser. For instance, usually rules correspond to the type of a node. They are usually transformed in AST by the user, with some help from the parser generator.

What is parse tree with example?

A parse tree is made up of nodes and branches. In the picture the parse tree is the entire structure, starting from S and ending in each of the leaf nodes (John, ball, the, hit). In a parse tree, each node is either a root node, a branch node, or a leaf node.

How do you construct a parse tree of a mathematical expression?

The first step in building a parse tree is to break up the expression string into a list of tokens. There are four different kinds of tokens to consider: left parentheses, right parentheses, operators, and operands.


1 Answers

To evaluate Java, you need all the semantic analysis that goes along with it ("what is the scope of this identifier? What type does it have?") as well as an interpreter.

To get that semantic analysis, you need more than just an AST: you need full name resolution (symbol table building) and type resolution (determination of expression types and validation that expressions are valid in the context in which they are found), as well as class lookup (which actual method does foo refer to?)

With that, you can consider building an interpreter by crawling over the trees in execution order. You'll also need to build a storage manager; you might not need to do a full garbage collector, but you'll need something. You'll also need an interpreter for .class files if you really want to run something, which means you need a parser (and name/type resolution for the class files, too).

I don't know if Eclipse has all this (at least the storage manager part you can get for free :). I'd sort of expect it to, given that its original design was to support Java development, but I've been sorely disappointed by lots of tools over the years.

The DMS Software Reengineering Toolkit is a program analysis/transformation too that handles many languages. It has a full Java front end including parsing, AST building, symbol table construction and name resolution, type resolution, builds call graphs (needed to resolve virtual function calls), and has a .class file reader to boot with name resolution. So it would be a good foundation for building an interpreter.

DMS can construct arbitrary ASTs, too, and then generate source code from them, so it would handle the code generation end, too, just fine.

[The reason DMS exists is the "sorely disappointed" part].

like image 118
Ira Baxter Avatar answered Sep 30 '22 12:09

Ira Baxter