I'm looking for a way to extract all preprocessor symbols used in my code.
As an example, if my code looks like this:
#ifdef FOO
#endif
#if ( BAR == 1 && \
defined (Z) )
#endif
I'd like to get the list [FOO,BAR,Z] as the output.
I found some posts suggesting gcc -E -dM, but this displays all symbols that the preprocessor would apply to the code.
What I want, in contrast, is a list of all symbols actually used in the code.
Any suggestions?
That's quite simple: you just have to parse the source code exactly the way a conforming preprocessor would, with support for the correct C or C++ version. OK, I'm joking. If you support only the latest version, your code is likely to produce correct results on older versions too, but even that should be checked thoroughly.
More seriously now: since you can ask the preprocessor for the list of all defined symbols, you can simply tokenize the source and pick out every token that appears in that list and does not immediately follow a #define or #undef. This part should be reasonably feasible with lex + yacc.
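To give an idea of the tokenizing step, here is a minimal sketch in plain C. Note that it is a narrower variant of what is described above: instead of matching tokens against the gcc -E -dM list, it only collects identifiers that appear on #if, #ifdef, #ifndef and #elif lines, which is what the example in the question asks for. The names collect_symbols, add_symbol, MAX_SYMS and MAX_LEN are made up for this illustration. It assumes the whole file is already in memory with plain \n line endings, and it does not handle comments, string literals or trigraphs, so treat it as a starting point rather than a conforming scanner.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_SYMS 256   /* hypothetical limit on distinct symbols */
#define MAX_LEN  64    /* hypothetical limit on symbol length    */

/* Append name to syms unless it is already there; returns the new count. */
static int add_symbol(char syms[][MAX_LEN], int count, const char *name)
{
    for (int i = 0; i < count; i++)
        if (strcmp(syms[i], name) == 0)
            return count;
    if (count < MAX_SYMS) {
        snprintf(syms[count], MAX_LEN, "%s", name);
        count++;
    }
    return count;
}

/* Collect identifiers used on conditional directive lines of src.
   Returns the number of distinct symbols stored in syms. */
int collect_symbols(const char *src, char syms[][MAX_LEN])
{
    int count = 0;
    const char *p = src;

    while (*p) {
        int is_cond = 0;

        /* Start of a logical line: look for '#' and the directive name. */
        while (*p == ' ' || *p == '\t') p++;
        if (*p == '#') {
            char dir[16];
            size_t n = 0;
            p++;
            while (*p == ' ' || *p == '\t') p++;
            while (isalpha((unsigned char)*p) && n < sizeof dir - 1)
                dir[n++] = *p++;
            dir[n] = '\0';
            is_cond = !strcmp(dir, "if") || !strcmp(dir, "ifdef") ||
                      !strcmp(dir, "ifndef") || !strcmp(dir, "elif");
        }

        /* Walk to the end of the logical line, following backslash-newline. */
        while (*p && *p != '\n') {
            if (*p == '\\' && p[1] == '\n') { p += 2; continue; }
            if (is_cond && (isalpha((unsigned char)*p) || *p == '_')) {
                char name[MAX_LEN];
                size_t n = 0;
                while (isalnum((unsigned char)*p) || *p == '_') {
                    if (n < MAX_LEN - 1) name[n++] = *p;
                    p++;
                }
                name[n] = '\0';
                if (strcmp(name, "defined") != 0)   /* operator, not a symbol */
                    count = add_symbol(syms, count, name);
                continue;
            }
            if (is_cond && isdigit((unsigned char)*p)) {
                /* Skip a number and its suffix (1UL, 0x1F) as one token. */
                while (isalnum((unsigned char)*p)) p++;
                continue;
            }
            p++;
        }
        if (*p == '\n') p++;
    }
    return count;
}
```

Fed the snippet from the question, collect_symbols stores FOO, BAR and Z, in that order. Because it never consults the list of defined macros, it also reports names that are not defined anywhere, and it misses macros that are only expanded in ordinary code outside of conditionals.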
The only alternative I can imagine would be to take the code of a real compiler (Clang should be easier than gcc, but I'm not sure), discard all code generation, and consistently record every macro usage.
TL;DR: however you approach it, it will be hard work. If you can do without it, stay away from it...