
Picking out symbols from a code base with Python

Tags: python, parsing

Given a code base (say, a large C or Objective-C project), I would like to analyze the source files and pick out symbols of interest: class declarations, variable names or types, or method names. Is there a Python module that could help me with this?

The only approach I can see is to use regular expressions to gather these symbols, but I suspect this could get very ugly very quickly. I'm also not an expert in compilers or parsers, so something lighter-weight would be preferable.

Thanks for any suggestions.

------ update -----

Thanks for all of the suggestions so far; there are definitely some promising leads. One other avenue that may be possible: what if I were able to compile the project I am trying to analyze? Would the debugging symbols (dSYM) make this process any easier? I'm not looking for anything advanced, just a list of classes with their ivar and method names. At this point, looking into the suggested parsing tools seems like more work than I can afford to invest in this project right now.

D.C. asked Oct 13 '10 00:10


1 Answer

Regular expressions are not a good way to analyze programming-language source code. I would suggest choosing a parsing module from the links below. Each of these tools provides parsing facilities that you can build your own analysis on top of:

  • http://code.google.com/p/pycparser/
  • Many tools at: http://wiki.python.org/moin/LanguageParsing
  • http://www.boost.org/doc/libs/1_44_0/libs/python/pyste/doc/introduction.html
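To make the point about regular expressions concrete, here is a minimal sketch. The C snippet and the pattern are made up for illustration; the pattern is only a stand-in for the kind of "first attempt" regex one might write. It looks for `type name(args) {` and ends up reporting names that appear inside a comment and a string literal:

```python
import re

# Hypothetical C source used only for illustration.
SOURCE = '''
/* int old_api(int x) { is deprecated } */
int real_function(int x) { return x; }
const char *msg = "void fake(void) {";
'''

# Naive pattern: a type word, a name, a parenthesized argument list, then "{".
NAIVE_FUNC = re.compile(r"\b\w+\s+(\w+)\s*\([^)]*\)\s*\{")

def naive_function_names(source):
    """Return every name the naive pattern takes for a function definition."""
    return NAIVE_FUNC.findall(source)

print(naive_function_names(SOURCE))
```

Only `real_function` is an actual definition here. The other two names come from a comment and a string, which a regex cannot tell apart from code without effectively becoming a tokenizer.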

pygccxml generates an XML description of C++ source files. This might be closer to what you are trying to do:

  • http://www.language-binding.net/pygccxml/pygccxml.html

Also look at CppHeaderParser, which generates a navigable class tree representing the class structure:

  • http://sourceforge.net/projects/cppheaderparser/
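As a rough idea of what building on one of these parsers could look like, here is a small sketch using pycparser. It assumes pycparser is installed (`pip install pycparser`). Keep in mind that pycparser handles C only (not Objective-C or C++) and expects source that has already been run through the C preprocessor. The sample source, the `SymbolCollector` class, and the `collect_symbols` helper are hypothetical names invented for this example:

```python
from pycparser import c_parser, c_ast

# Hypothetical, already-preprocessed C source used only for illustration.
SOURCE = """
struct point { int x; int y; };
int counter;
int add(int a, int b) { return a + b; }
void reset(void) { counter = 0; }
"""

class SymbolCollector(c_ast.NodeVisitor):
    """Walk the AST and record function names and struct field names."""

    def __init__(self):
        self.functions = []
        self.structs = {}

    def visit_FuncDef(self, node):
        # node.decl is the declaration part of the function definition.
        # Function bodies are not visited further in this sketch.
        self.functions.append(node.decl.name)

    def visit_Struct(self, node):
        # node.decls is None for a bare reference such as "struct point p;".
        if node.decls:
            self.structs[node.name] = [d.name for d in node.decls]

def collect_symbols(source):
    """Parse C source text and return (function names, struct fields)."""
    ast = c_parser.CParser().parse(source, filename="<example>")
    collector = SymbolCollector()
    collector.visit(ast)
    return collector.functions, collector.structs

functions, structs = collect_symbols(SOURCE)
print(functions)
print(structs)
```

Because the visitor works on the parsed tree, comments and string literals never show up as symbols. For a real project you would feed it the preprocessor's output for each file instead of an inline string.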

pyfunc answered Oct 03 '22 20:10