
How can I parse macros in C++ code, using Clang as the parser and Python as the scripting language?

If I have the following macro in some C++ code:

_Foo(arg1, arg2)

I would like to use Python to find all the instances and extents of that macro, using Clang and the Python bindings provided with cindex.py. I do not want to run a regular expression from Python on the code directly, because that gets me 99% of the way there, but not 100%. It appears to me that to get to 100%, you need a real C++ parser like Clang to handle all the cases where people do silly things that are syntactically correct and compile, but that a regular expression cannot make sense of. I need to handle 100% of the cases, and since we use Clang as one of our compilers, it makes sense to use it as the parser for this task as well.

Given the following Python code, I am able to find what appear to be predefined types that the Clang Python bindings know about, but not macros:

import sys
import clang.cindex

def find_typerefs(node):
    ref_node = clang.cindex.Cursor_ref(node)
    if ref_node:
        print('Found %s Type %s DATA %s Extent %s [line=%s, col=%s]' % (
            ref_node.spelling, ref_node.kind, node.data, node.extent, node.location.line, node.location.column))

    # Recurse for children of this node
    for c in node.get_children():
        find_typerefs(c)

index = clang.cindex.Index.create()
tu = index.parse(sys.argv[1])
find_typerefs(tu.cursor)

What I think I am looking for is a way to search the raw AST for the name of my macro, _Foo(), but I am not sure. Can someone provide some code that will allow me to pass in the name of a macro and get back the extent or data from Clang?

warmbeach asked Apr 11 '12 20:04



1 Answer

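By default, the parser does not keep macro definitions and expansions in the translation unit. To get them, you need to pass the appropriate options flag to Index.parse: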

tu = index.parse(sys.argv[1], options=clang.cindex.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD)

The rest of the cursor visitor could look like this:

def visit(node):
    if node.kind in (clang.cindex.CursorKind.MACRO_INSTANTIATION, clang.cindex.CursorKind.MACRO_DEFINITION):
        print('Found %s Type %s DATA %s Extent %s [line=%s, col=%s]' % (node.displayname, node.kind, node.data, node.extent, node.location.line, node.location.column))
    for c in node.get_children():
        visit(c)
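
To get back only the uses of one named macro, as the question asks, the visitor can filter on the cursor's spelling and return the extents instead of printing them. The following is a minimal sketch, not a definitive implementation: `MacroUse`, `find_macro_uses`, and `parse_with_macros` are hypothetical names introduced here, and the walker compares `node.kind.name` against the string `'MACRO_INSTANTIATION'` (rather than the `CursorKind` object) purely so that it can be exercised without libclang installed. It assumes the translation unit was parsed with `PARSE_DETAILED_PROCESSING_RECORD`, as above.

```python
from collections import namedtuple

# Hypothetical record type for one macro use; line and column are
# reported exactly as libclang provides them in the cursor's extent.
MacroUse = namedtuple("MacroUse", "name start_line start_col end_line end_col")


def find_macro_uses(node, macro_name):
    """Collect a MacroUse for every expansion of macro_name at or below node."""
    found = []
    # The kind is matched by name so this function only needs objects that
    # look like cursors (kind.name, spelling, extent, get_children()).
    if node.kind.name == "MACRO_INSTANTIATION" and node.spelling == macro_name:
        ext = node.extent
        found.append(MacroUse(node.spelling,
                              ext.start.line, ext.start.column,
                              ext.end.line, ext.end.column))
    for child in node.get_children():
        found.extend(find_macro_uses(child, macro_name))
    return found


def parse_with_macros(path, args=()):
    """Parse path with macro records enabled (requires the clang bindings)."""
    import clang.cindex  # imported lazily; only this helper needs libclang
    index = clang.cindex.Index.create()
    options = clang.cindex.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD
    return index.parse(path, args=list(args), options=options)
```

With the bindings installed, something like `find_macro_uses(parse_with_macros(sys.argv[1]).cursor, '_Foo')` should return one entry per use of `_Foo`. Uses that come from included headers are reported too; checking `node.location.file` is one way to restrict the results to the main source file.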
thpani answered Nov 01 '22 13:11
