
Extracting preprocessor symbols from source

I'm looking for a way to extract all preprocessor symbols used in my code.
As an example, if my code looks like this:

#ifdef FOO
#endif

#if ( BAR == 1 && \
      defined (Z) )
#endif

I'd like to get the list [FOO,BAR,Z] as the output.

I found some posts suggesting gcc -E -dM, but that lists every symbol the preprocessor has defined while processing the code.
What I want, in contrast, is a list of only the symbols actually used in the code.

Any suggestions?

Michael Wahler asked Mar 23 '16 14:03


1 Answer

That's quite simple: you just have to parse the source code exactly the way a conforming preprocessor would, with support for the correct C or C++ version. OK, I'm joking. If you support only the latest version, your tool is likely to produce correct results on code written for older versions too, but even that should be checked thoroughly.

More seriously now: since you can ask the preprocessor for the list of all defined symbols, you can simply tokenize the source and pick out every token from that list that does not immediately follow a leading #define or #undef. This part should be reasonably feasible with lex+yacc.
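To make the tokenizing idea concrete, here is a rough sketch in Python rather than lex+yacc (my substitution, purely for brevity). The function names parse_dm_output and used_macros are hypothetical, and the comment and literal handling is a simplification, not a conforming preprocessor. It assumes you have already captured the text printed by gcc -E -dM for the file.

```python
import re

# Comments and string/character literals are blanked out before scanning.
# This is a simplification: trigraphs, raw strings, etc. are not handled.
SKIP_RE = re.compile(
    r'//[^\n]*'               # line comment
    r'|/\*.*?\*/'             # block comment
    r'|"(?:\\.|[^"\\\n])*"'   # string literal
    r"|'(?:\\.|[^'\\\n])*'",  # character literal
    re.DOTALL,
)
# A leading "#define NAME" or "#undef NAME": NAME here is not a use.
DEFINE_RE = re.compile(r'^\s*#\s*(?:define|undef)\s+[A-Za-z_][A-Za-z0-9_]*')
# \b keeps us from picking "x1F" out of a number such as 0x1F.
IDENT_RE = re.compile(r'\b[A-Za-z_][A-Za-z0-9_]*')
DM_LINE_RE = re.compile(r'^#define\s+([A-Za-z_][A-Za-z0-9_]*)', re.MULTILINE)


def parse_dm_output(dm_text):
    """Collect macro names from text in the '#define NAME ...' form of gcc -E -dM."""
    return set(DM_LINE_RE.findall(dm_text))


def used_macros(source, defined):
    """Return the sorted names from `defined` that the source actually refers to."""
    text = source.replace("\\\n", "")   # splice backslash-continued lines
    text = SKIP_RE.sub(" ", text)       # drop comments and literals
    used = set()
    for line in text.splitlines():
        m = DEFINE_RE.match(line)
        if m:
            line = line[m.end():]       # skip the name being defined/undefined
        used.update(t for t in IDENT_RE.findall(line) if t in defined)
    return sorted(used)


if __name__ == "__main__":
    example = "#ifdef FOO\n#endif\n\n#if ( BAR == 1 && \\\n      defined (Z) )\n#endif\n"
    known = parse_dm_output("#define FOO 1\n#define BAR 1\n#define Z\n")
    print(used_macros(example, known))
```

One limit of this approach to keep in mind: a symbol that is tested in the code but never defined anywhere (not in a header, not with -D) does not show up in the gcc -E -dM list, so it will not be reported either.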

The only alternative I can imagine is to take the code of a real compiler (Clang should be easier than gcc, but I'm not sure), discard all code generation, and consistently record every macro use.

TL;DR: whichever way you approach it, it will be hard work. If you can do without it, stay away from it...

Serge Ballesta answered Oct 14 '22 00:10