
How to parse/simple analyze C/C++ code from C# to get a list of methods

Tags: c++, parsing, clang

I need to go through a C/C++ file and extract the list of classes and methods, along with where each one is located in the file.

Is libclang the best option? Or is it "too much" for the task?

Would it be better to just look for pairing brackets?

If libclang is the right choice: is there a way to invoke it from C#?

Thanks!

asked Jan 11 '12 10:01 by pablo


2 Answers

You could consider ctags, which is available on many platforms. Its output is easy to parse and contains the information you need.

More info: to answer your question I had to look through the many options available, and after a while I found a combination that works. For example:

ctags -N -x --c-kinds=+p crowd.*

produces this output

CrowdSim         class        44 crowd.h          class CrowdSim
CrowdSim         function     47 crowd.h          CrowdSim( const std::string& contentDir ) : _contentDir( contentDir ) {}
Particle         function     35 crowd.h          Particle()
Particle         struct       25 crowd.h          struct Particle
_contentDir      member       56 crowd.h          std::string _contentDir;
_crowd_H_        macro        18 crowd.h          #define _crowd_H_
_particles       member       57 crowd.h          std::vector< Particle > _particles;
animTime         member       32 crowd.h          float animTime;
chooseDestination function     24 crowd.cpp        void CrowdSim::chooseDestination( Particle &p )
chooseDestination prototype    53 crowd.h          void chooseDestination( Particle &p );
dx               member       28 crowd.h          float dx, dz; // Destination position
dz               member       28 crowd.h          float dx, dz; // Destination position
fx               member       29 crowd.h          float fx, fz; // Force on particle
fz               member       29 crowd.h          float fx, fz; // Force on particle
init             function     35 crowd.cpp        void CrowdSim::init()
init             prototype    49 crowd.h          void init();
node             member       31 crowd.h          H3DNode node;
ox               member       30 crowd.h          float ox, oz; // Orientation vector
oz               member       30 crowd.h          float ox, oz; // Orientation vector
px               member       27 crowd.h          float px, pz; // Current postition
pz               member       27 crowd.h          float px, pz; // Current postition
update           function     68 crowd.cpp        void CrowdSim::update( float fps )
update           prototype    50 crowd.h          void update( float fps );

(Note: -x prints a human-readable cross-reference to standard output instead of writing a tags file; it is only there for easy inspection.)
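To show how easily that output can be consumed, here is a minimal sketch that splits each row into its five columns (name, kind, line number, file, source text). It is written in Python for brevity; the same split-on-whitespace approach carries over to C#. The names `Tag` and `parse_xref` are hypothetical, and the sketch assumes tag names contain no spaces, which may not hold for every C++ entity (operators, for instance).

```python
from typing import List, NamedTuple


class Tag(NamedTuple):
    """One row of `ctags -x` output (hypothetical record type)."""
    name: str
    kind: str
    line: int
    file: str
    text: str


def parse_xref(output: str) -> List[Tag]:
    """Parse the tabular text printed by `ctags -x` into Tag records."""
    tags = []
    for row in output.splitlines():
        # The first four columns are whitespace-separated; the remainder
        # is the source line and may itself contain spaces.
        parts = row.split(None, 4)
        if len(parts) < 4 or not parts[2].isdigit():
            continue  # skip blank or unrecognised rows
        name, kind, line, file = parts[:4]
        text = parts[4].rstrip() if len(parts) == 5 else ""
        tags.append(Tag(name, kind, int(line), file, text))
    return tags


# Three rows copied from the output above.
SAMPLE = """\
CrowdSim         class        44 crowd.h          class CrowdSim
chooseDestination function     24 crowd.cpp        void CrowdSim::chooseDestination( Particle &p )
init             prototype    49 crowd.h          void init();
"""

tags = parse_xref(SAMPLE)
methods = [t for t in tags if t.kind in ("function", "prototype")]
for t in methods:
    print(f"{t.file}:{t.line} {t.kind} {t.name}")
```

In a real program the text would come from running the ctags command above as a child process and reading its standard output.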

answered Sep 23 '22 09:09 by CapelliC


To do this well, you really need something that contains a full C++ parser.
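As a small illustration of why pairing brackets alone is not enough, the sketch below (Python, with a hypothetical helper `naive_body_end`; it is not part of any tool mentioned here) counts braces with no knowledge of string literals or comments. It reports the wrong end line as soon as a brace appears inside a string.

```python
def naive_body_end(lines, start):
    """Return the 1-based line where brace depth first returns to zero,
    counting every '{' and '}' regardless of strings or comments."""
    depth = 0
    seen_open = False
    for number in range(start, len(lines) + 1):
        for ch in lines[number - 1]:
            if ch == "{":
                depth += 1
                seen_open = True
            elif ch == "}":
                depth -= 1
            if seen_open and depth == 0:
                return number
    return None  # never found a balanced body


SAMPLE = [
    'void greet() {',
    '    const char* s = "}";  // brace inside a string literal',
    '    puts(s);',
    '}',
]

# The body really ends on line 4, but the naive matcher stops at line 2.
print(naive_body_end(SAMPLE, 1))  # prints 2
```

Handling strings, comments, preprocessor conditionals, and macros correctly is what pushes the task toward a real parser.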

Our DMS Software Reengineering Toolkit with its C++ Front End could be used for this. It can provide the precise entity declarations, including their types, together with their context (class/namespace/...) and exact file positions. DMS provides access to all this information as a set of ASTs and related symbol tables; you write custom code to navigate them and extract what you want.

Depending on your needs, you may find that the information you want is difficult to process using vanilla C#. The type information in its full glory is pretty complex, because C++ is a complex language. If you want to process that information, you'll want to "stay inside" DMS, where all the machinery to do that is present. If all you want is the names and type information as text strings, you can get DMS to pretty-print the data in that form; it has standard libraries supporting such activities. An intermediate option is to export the data as XML. DMS directly supports exporting arbitrary AST fragments; writing type information out as XML is only indirectly supported, though it wouldn't be hard to customize.

EDIT (in response to the OP's comment on another answer): DMS can provide precise information about both the method signature and the method body. It has full AST and type information for both.

answered Sep 23 '22 09:09 by Ira Baxter