 

Can I get an XML AST of C/C++/Java code without compiling it?

I want to create an XML file containing an AST representation of source code, but without compiling it. I haven't found a satisfactory solution so far. Here is what I tried:

  • The XML printer in clang (clang -cc1 -ast-print-xml): it would be nice, but it was removed from clang
  • The srcML toolkit, which works well in theory but has a poor parser (for Java, it's not even fully 1.5-compatible)

Are there any other alternatives?

Kao asked Aug 26 '14 09:08



1 Answer

For Java, see "What would an AST (abstract syntax tree) for an object-oriented programming language look like?"

For C, see "get human readable AST from c++ code".

Both of these are produced by one engine: our DMS Software Reengineering Toolkit. DMS also has a full C++11 parser that can produce similar XML. (EDIT Jan 2016: now full C++14 for GCC and Visual C++.)

I don't think XML is really a good idea: it is enormous and clunky, and what analysis tools can you bring to bear on it? XSLT? That's not very useful for analyzing programs. Read the XML into a DOM and climb over that? You'll find that you are missing lots of useful support (symbol tables, etc.); ASTs alone are just not enough. See my essay on Life After Parsing (check my bio or google).
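To make the DOM point concrete, here is a minimal sketch in Python. It assumes a hypothetical XML AST format: the element and attribute names (unit, var_decl, ident, and so on) are invented for illustration and are not the output of DMS, srcML, or clang. It loads the XML into a tree, walks it, and then tries to find the declaration behind a use of x. The tree alone offers two candidates, because nothing in it encodes scoping.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML AST for:  int x; void f() { int x; x = 1; }
# All element and attribute names are invented for this sketch.
XML_AST = """
<unit>
  <var_decl name="x" type="int"/>
  <function name="f">
    <block>
      <var_decl name="x" type="int"/>
      <assign>
        <ident name="x"/>
        <literal value="1"/>
      </assign>
    </block>
  </function>
</unit>
"""

def walk(node, depth=0):
    """Yield (depth, element) pairs in document order."""
    yield depth, node
    for child in node:
        yield from walk(child, depth + 1)

root = ET.fromstring(XML_AST)
nodes = list(walk(root))

# Purely structural facts are easy to recover from the tree.
decls = [n for _, n in nodes if n.tag == "var_decl"]
uses = [n for _, n in nodes if n.tag == "ident"]

for depth, node in nodes:
    print("  " * depth + node.tag, dict(node.attrib))

# The tree does not say which declaration the use of "x" refers to.
# Matching by name finds both declarations; choosing between them needs
# scope rules (a symbol table) that you would have to build yourself.
candidates = [d for d in decls if d.get("name") == uses[0].get("name")]
print(len(candidates), "candidate declarations for", uses[0].get("name"))
```

Resolving that ambiguity correctly for a real language (nested scopes, overloads, namespaces, imports) is the kind of support the XML by itself does not give you.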

You are better off using a set of integrated machinery that provides all kinds of consistent support for analyzing (multiple) programming languages (using the ASTs as a foundation). This is what DMS is designed to do.

Ira Baxter answered Oct 03 '22 06:10