
Can I get an XML AST dump of C/C++ code with clang without using the compiler?

I successfully compiled clang for Windows with CMake and Visual Studio 10. I would like to get an XML file as an AST representation of the source code. One option produces this result with clang and gcc under Linux (Ubuntu), but it doesn't work on the Windows box:

clang -cc1 -ast-print-xml source.c

However, this invokes the compilation stage, which I would like to avoid. Digging through the source code hasn't helped so far, as I am quite new to clang. I did manage to generate the binary version of the AST by using:

clang -emit-ast source.c

Unfortunately, this format cannot be parsed directly. Is there an existing way to make clang generate the XML tree directly instead of a binary one?

The goal is to use the XML representation in other tools in the .NET environment; without it, I would need to write a wrapper around the native clang library to access the binary AST. Maybe there is a third option, if someone has already written a parser for the binary clang AST for .NET?

Or am I missing something? For instance, is the AST generated by the clang front end perhaps not equivalent to the one generated in the compilation stage?

jdehaan asked Mar 15 '11 08:03


3 Answers

For your information, the XML printer was removed in version 2.9 by Douglas Gregor (who is responsible for the Clang front end).

The issue was that the XML printer was incomplete. A number of AST nodes had never been implemented in the printer, nor had a number of properties of some nodes, which led to an inaccurate representation of the source code.

Another point raised by Douglas was that the output should be suitable not for debugging Clang itself (which is what -emit-ast is about) but for consumption by external tools. This requires the output to be stable from one version to the next. Notably, it should not be a one-to-one mapping of Clang's internals, but should rather translate the source code into a standardized language.

Unless significant work is done on the printer (which requires volunteers), it will not be integrated back.

Matthieu M. answered Oct 19 '22 08:10


I've been working on my own version of extracting XML from Clang's AST. My code uses the Python bindings of libclang in order to traverse the AST.

My code is found at https://github.com/BentleyJOakes/PCX

Edit: I should add that it is quite incomplete in terms of producing the right source code tokens for each AST node. This unfortunately needs to be coded for each AST node type. However, the code should give a basis for anyone who wants to pursue this further.
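As a rough illustration of the approach described above (this is not the code from the PCX repository), here is a minimal sketch that walks a libclang AST and mirrors it as XML. It assumes the libclang Python bindings are importable as clang.cindex; the function names cursor_to_xml and parse_to_xml, and the choice of element and attribute names, are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET


def cursor_to_xml(cursor):
    """Recursively convert a libclang-style cursor into an XML element.

    Only duck-typed access is used (kind.name, spelling, location,
    get_children), so any object with that shape works.
    """
    # The element name is the cursor kind, e.g. FUNCTION_DECL.
    element = ET.Element(cursor.kind.name)
    if cursor.spelling:
        element.set("spelling", cursor.spelling)
    # Some cursors (such as the translation unit itself) have no file location.
    location = getattr(cursor, "location", None)
    if location is not None and getattr(location, "file", None) is not None:
        element.set("line", str(location.line))
        element.set("column", str(location.column))
    for child in cursor.get_children():
        element.append(cursor_to_xml(child))
    return element


def parse_to_xml(path, args=None):
    """Parse `path` with libclang and return its AST as an XML string."""
    # Imported lazily so cursor_to_xml stays usable without libclang installed.
    from clang.cindex import Index

    translation_unit = Index.create().parse(path, args=args or [])
    root = cursor_to_xml(translation_unit.cursor)
    return ET.tostring(root, encoding="unicode")
```

Calling parse_to_xml("source.c") should return one XML document for the whole translation unit, including declarations pulled in from headers. As noted above, this records only the kind, name and position of each node; recovering the source tokens for each node type is the part that needs per-node work.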

user3697676 answered Oct 19 '22 08:10


Using a custom ASTDumper would do the job without compiling any source file, since clang can be stopped after the front end. However, you would have to work with the C and C++ sources of LLVM to accomplish that.

issamux answered Oct 19 '22 07:10