Tool for tracing C preprocessor execution during macro expansion?

Is there a way to print, step by step, what the C preprocessor is doing as it expands a macro?

For example, I would give it some C language text (e.g. .h files) to preprocess. For the sake of demonstration, here's a simple example:

// somefile.h
#define q r
#define bar(x,z) x ## z
#define baz(y) qux ## y
#define foo(x,y) bar(x, baz(y))

So far, that's just to build a table of definitions.

Next comes the text to expand in detail. For this demonstration, I'm expecting the workflow/process/output to be something like this:

$ magical_cpp_revealer  somefile.h

Please enter some preprocessor text to analyse:
> foo(baz(p),q)

Here are the resulting preprocessor calculations:
,----.----.---------------------------.-----------------------------------------
|Step|Exp#|  Expression               |  Reason
|====|====|===========================|=========================================
| 00 | 00 |  foo(baz(p),q)            |  Original tokens.
| 01 |    |                           |  Definition found for 'foo': `foo(x,y)` = "bar(x, baz(y))"
| 02 | 01 |  bar(x, baz(y))           |  'foo' begins expansion. Original tokens shown.
| 03 |    |                           |  'foo' Stage 1: Raw parameter replacements elided: no # or ## operators present.
| 04 |    |                           |  'foo' Stage 2: Stringification elided: no # operators present.
| 05 |    |                           |  'foo' Stage 3: Concatenation elided: no ## operators present.
| 06 |    |                           |  'foo' Stage 4: Argument scan begins.
| 07 |    |                           |    Argument for parameter 'x' is "baz(p)"
| 08 | 02 |    baz(p)                 |    Scanning "baz(p)" for macros to expand.
| 09 |    |                           |    Definition found for 'baz': `baz(y)` = "qux ## y"
| 10 | 03 |    qux ## y               |    'baz' begins expansion. Original tokens shown.
| 11 | 04 |    qux ## p               |      'foo->baz' Stage 1: Raw parameter replacements performed
| 12 |    |                           |         using 'y' = "p".
| 13 |    |                           |      'foo->baz' Stage 2: Stringification elided: no # operators present.
| 14 | 05 |    quxp                   |      'foo->baz' Stage 3: Concatenation performed.
| 15 |    |                           |      'foo->baz' Stage 4: Argument scan elided: no parameters present.
| 16 |    |                           |      'foo->baz' Stage 5: Expansive parameter replacements elided: no parameters present.
| 17 |    |                           |      'foo->baz' Stage 6: Rescan begins
| 18 |    |                           |        No definition for 'quxp'
| 19 |    |                           |      'foo->baz' Stage 6: Rescan concludes.
| 20 | 06 |    quxp                   |    'baz' concludes expansion. Final result shown.
| 21 |    |                           |  'foo' Stage 4: Argument scan continues.
| 22 |    |                           |    Currently:
| 23 |    |                           |      'x' = "quxp"
| 24 |    |                           |      'y' = To Be Determined
| 25 |    |                           |    Argument for parameter 'y' is "q"
| 26 | 07 |    q                      |    Scanning "q" for macros to expand.
| 27 |    |                           |    Definition found for 'q': `q` = "r"
| 28 | 08 |    r                      |    'q' begins expansion. Original tokens shown.
| 29 |    |                           |      'foo->q': Stage 1: Concatenation elided: no ## operators present.
| 30 |    |                           |      'foo->q': Stage 2: Scan begins.
| 31 |    |                           |        No definition for 'r'
| 32 |    |                           |      'foo->q': Stage 2: Scan concludes.
| 33 | 09 |    r                      |    'q' concludes expansion. Final result shown.
| 34 |    |                           |  'foo' Stage 4: Argument scan concludes.
| 35 | 10 |  bar(x, baz(y))           |  'foo': Reminder of current token sequence.
| 36 | 11 |  bar(quxp, baz(r))        |  'foo' Stage 5: Expansive parameter replacements performed
| 37 |    |                           |     using 'x' = "quxp",
| 38 |    |                           |       and 'y' = "r".
| 39 |    |                           |  'foo' Stage 6: Rescan begins
| 40 |    |                           |    Definition found for 'bar': `bar(x,z)` = "x ## z"
| 41 | 12 |    x ## z                 |    'bar' begins expansion. Original tokens shown.
| 42 | 13 |    quxp ## baz(r)         |      'foo->bar' Stage 1: Raw parameter replacements performed
| 43 |    |                           |         using 'x' = "quxp",
| 44 |    |                           |           and 'z' = "baz(r)".
| 45 |    |                           |      'foo->bar' Stage 2: Stringification elided: no # operators present.
| 46 | 14 |    quxpbaz(r)             |      'foo->bar' Stage 3: Concatenation performed.
| 47 |    |                           |      'foo->bar' Stage 4: Argument scan elided: no parameters present.
| 48 |    |                           |      'foo->bar' Stage 5: Expansive parameter replacements elided: no parameters present.
| 49 |    |                           |      'foo->bar' Stage 6: Rescan begins
| 50 |    |                           |        No definition for 'quxpbaz'
| 51 |    |                           |        No definition for '('
| 52 |    |                           |        No definition for 'r'
| 53 |    |                           |        No definition for ')'
| 54 |    |                           |      'foo->bar' Stage 6: Rescan concludes.
| 55 | 15 |    quxpbaz(r)             |    'bar' concludes expansion. Final result shown.
| 56 |    |                           |  'foo' Stage 6: Rescan concludes
| 57 | 16 |  quxpbaz(r)               |  'foo' concludes expansion. Final result shown.
'----'----'---------------------------'-----------------------------------------

(Side note and caveat for future readers: I wrote the above trace by hand and it might not be 100% correct, at least in terms of representing how the preprocessor works.)
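The final row of the hand trace can at least be cross-checked from inside a C program, even though this shows none of the intermediate steps. The following is a minimal sketch: the four definitions are copied from somefile.h above, while `STR_RAW`, `STR`, and the three functions are hypothetical helper names introduced only for this illustration. It relies on the rule that an argument is fully macro-expanded before substitution unless the parameter is an operand of # or ##.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Definitions copied from somefile.h in the question. */
#define q r
#define bar(x,z) x ## z
#define baz(y) qux ## y
#define foo(x,y) bar(x, baz(y))

/* Hypothetical helpers (not part of the question).
 * STR_RAW stringifies its argument exactly as written, because the
 * operand of # is never macro-expanded.
 * STR adds one level of indirection so the argument is fully expanded
 * before STR_RAW stringifies it. */
#define STR_RAW(...) #__VA_ARGS__
#define STR(...) STR_RAW(__VA_ARGS__)

/* Spelling of the tokens before any expansion (trace step 00). */
const char *before_expansion(void) { return STR_RAW(foo(baz(p),q)); }

/* Spelling of the tokens after complete expansion (last trace step). */
const char *after_expansion(void) { return STR(foo(baz(p),q)); }

/* Prints both spellings and checks the final row of the hand trace. */
void show_expansion(void)
{
    printf("before: %s\n", before_expansion());
    printf("after:  %s\n", after_expansion());
    assert(strcmp(after_expansion(), "quxpbaz(r)") == 0);
}
```

Calling `show_expansion()` from `main` prints the two spellings. Like gcc -E, this only confirms the end result; it does not reveal why the preprocessor took each step.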

Note that I tried to illustrate not only the preprocessor's positive decisions about what to do (e.g. when it has found a definition and starts expanding), but also its negative decisions about what not to do (e.g. when a token has no definition, or when no # or ## operators are present). That might sound kinda specific, but it's important for understanding why the preprocessor didn't do something that I expected it to do, often with a mundane conclusion along the lines of "I misspelled the definition or the token" or "I forgot to #include that one file".
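One negative decision of this kind that is easy to overlook: an argument that is an operand of ## is substituted raw, without being macro-expanded first. This is why `baz(r)` is never expanded in the trace above. A small sketch of the effect, where `PASTE_RAW`, `PASTE`, `STR_RAW`, `STR`, the `x_` prefix, and the function names are all hypothetical and chosen only for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define q r                           /* same object-like macro as in somefile.h */

/* Hypothetical helpers for this illustration. */
#define PASTE_RAW(a, b) a ## b        /* operands of ## are not expanded first */
#define PASTE(a, b) PASTE_RAW(a, b)   /* extra level: a and b are expanded first */
#define STR_RAW(x) #x
#define STR(x) STR_RAW(x)

/* q is pasted as written, giving the single token x_q. */
const char *raw_paste(void) { return STR(PASTE_RAW(x_, q)); }

/* q is expanded to r before the paste, giving x_r. */
const char *expanded_paste(void) { return STR(PASTE(x_, q)); }

/* Prints both results and checks that they differ as described. */
void show_paste(void)
{
    printf("raw:      %s\n", raw_paste());
    printf("expanded: %s\n", expanded_paste());
    assert(strcmp(raw_paste(), "x_q") == 0);
    assert(strcmp(expanded_paste(), "x_r") == 0);
}
```

A tracing tool of the kind asked about would report this as "argument scan elided for operands of ##"; the sketch only makes the outcome observable.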

I'll be even more relieved if there's a way to reveal what MSVC's CL.EXE is thinking when it uses "traditional preprocessor" logic to expand my macros.

Here's an example of what does not answer the question:

$ gcc -E somefile.h
...
quxpbaz(r)

That is what I find in the answers to questions like "Any utility to test expand C/C++ #define macros?".

When someone asks to see the "expansion" of a macro, gcc -E seems like a valid answer. I'm looking for something with higher fidelity, and I already know about gcc -E.

I'm writing ISO C11 code, but am including the C++ tag in case there is a tool or technique in that ecosystem with relevance to this.

I'm hoping someone out there reading this is maybe a compiler writer that has done or seen similar work (compiler tracing options?), or has authored a tool like this, or is just far luckier with their search results than I have been. Or if you keep tabs on all of the C-language offerings out there and are relatively certain this doesn't exist, then I'd find a negative answer to be helpful too, though I'd be curious as to why the C preprocessor would have been around for decades, obtained infamy for its "pitfalls", and yet still never seen a tool (or process) for pulling back the curtain on the preprocessor. (I hope this actually exists. fingers crossed)

asked Oct 28 '20 09:10 by chadjoan


1 Answer

I would suggest finding a good-quality compiler or preprocessor and editing its preprocessor source to print the steps you want to see.

I would avoid GCC and clang, as they are too heavyweight IMO. I would have a look at cparser from libfirm and this file in particular: https://github.com/libfirm/cparser/blob/master/src/parser/preprocessor.c

Code from libfirm is super easy to read and edit, and it takes almost no time to build the project - in rough contrast to LLVM/clang or GCC.

It has eaten all C99 code I've thrown at it so far.

By the way I am not affiliated, I just think it rocks! I have just used the code with great results and received fantastic support, help and guidance on the IRC channel #firm @ freenode.

EDIT:

Sparse, as used by the kernel janitors team in Linux, is also easily hackable for such purposes. It includes a C preprocessor as well: https://github.com/chrisforbes/sparse

https://www.kernel.org/doc/html/v4.12/dev-tools/sparse.html

answered Oct 22 '22 13:10 by Morten Jensen