Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to parse C++ source in Python? [closed]

We want to parse our huge C++ source tree to gain enough info to feed to another tool to make diagrams of class and object relations, discern the overall organization of things etc.

My best try so far is a Python script that scans all .cpp and .h files, runs regex searches to try to detect class declarations, methods, etc. We don't need a full-blown analyzer to capture every detail, or some heavy UML diagram generator - there's a lot of detail we'd like to ignore and we're inventing new types of diagrams. The script sorta works, but by gosh it's true: C++ is hard to parse!

So I wonder what tools exist for extracting the info we want from our sources? I'm not a language expert, and don't want something with a steep learning curve. Something we low-brow blue-collar programmer grunts can use :P

Python is preferred as one of the standard languages here, but it's not essential.

like image 914
DarenW Avatar asked Feb 02 '11 23:02

DarenW


1 Answers

I'll simply recommend Clang.

It's a C++ library-based compiler designed with ease of reuse in mind. It notably means that you can use it solely for parsing and generating an Abstract Syntax Tree. It takes care of all the tedious operator overloading resolution, template instantiation and so on.

Clang exports a C-based interface, which is extended with Python Bindings. The interface is normally quite rich, but I haven't use it. Anyway, contributions are welcome if you wish to help extending it.

like image 113
Matthieu M. Avatar answered Oct 14 '22 05:10

Matthieu M.