While I was working on a big project full of macro tricks and wizardry, I stumbled upon a bug in which a macro was not expanding properly. The resulting output was "EXPAND(0)", but EXPAND was defined as "#define EXPAND(X) X", so clearly the output should have been "0".
"No problem", I thought to myself. "It's probably some silly mistake, there are some nasty macros here, after all, plenty of places to go wrong". As I thought that, I isolated the misbehaving macros into their own project, about 200 lines, and started working on a MWE to pinpoint the problem. 200 lines became 150, which in turn became 100, then 20, 10... To my absolute shock, this was my final MWE:
    #define EXPAND(X) X
    #define PARENTHESIS() ()
    #define TEST() EXPAND(0)
    EXPAND(TEST PARENTHESIS()) // EXPAND(0)

4 lines.
To add insult to injury, almost any modification to the macros will make them work correctly:
    #define EXPAND(X) X
    #define PARENTHESIS() ()
    #define TEST() EXPAND(0)
    // Manually replaced PARENTHESIS()
    EXPAND(TEST ()) // 0

    #define EXPAND(X) X
    #define PARENTHESIS() ()
    #define TEST() EXPAND(0)
    // Manually replaced TEST()
    EXPAND(EXPAND(0)) // 0

    // Set EXPAND to 0 instead of X
    #define EXPAND(X) 0
    #define PARENTHESIS() ()
    #define TEST() EXPAND(0)
    EXPAND(TEST PARENTHESIS()) // 0

But most importantly, and most oddly, the code below fails in the exact same way:
    #define EXPAND(X) X
    #define PARENTHESIS() ()
    #define TEST() EXPAND(0)
    EXPAND(EXPAND(EXPAND(EXPAND(TEST PARENTHESIS())))) // EXPAND(0)

This means the preprocessor is perfectly capable of expanding EXPAND, but for some reason, it absolutely refuses to expand it again in the last step.
Now, how I'm going to solve this problem in my actual program is neither here nor there. Although a solution would be nice (i.e. a way to expand the token EXPAND(TEST PARENTHESIS()) to 0), the thing I'm most interested in is: why? Why did the C preprocessor come to the conclusion that "EXPAND(0)" was the correct expansion in the first case, but not in the other ones?
Although it's easy to find resources on what the C preprocessor does (and some magic that you can do with it), I've yet to find one that explains how it does it, and I want to take this opportunity to understand better how the preprocessor does its job and what rules it uses when expanding macros.
So in light of that: What is the reasoning behind the preprocessor's decision to expand the final macro to "EXPAND(0)" instead of "0"?
Edit: After reading Chris Dodd's very detailed, logical and well-put answer, I did what anybody would do in the same situation... try to come up with a counterexample :)
What I concocted was this different 4-liner:
    #define EXPAND(X) X
    #define GLUE(X,Y) X Y
    #define MACRO() GLUE(A,B)
    EXPAND(GLUE(MACRO, ())) // GLUE(A,B)

Now, knowing the fact that the C preprocessor is not Turing complete, there is no way the above will ever expand to A B. If that were the case, GLUE would expand MACRO and MACRO would expand GLUE. That would lead to the possibility of unlimited recursion, probably implying Turing Completeness for the Cpp. So sadly for the preprocessor wizards out there, the above macro not expanding is a guarantee.
It failing is not really the problem, the real problem is: Where? Where did the preprocessor decide to stop the expansion?
Analyzing the steps:
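- Step 1 sees that the macro is EXPAND and scans in argument list GLUE(MACRO, ()) for X
- Step 2 recognizes GLUE(MACRO, ()) as a macro:
    - Step 1 (nested) gets MACRO and () as arguments
    - Step 2 scans them but finds no macro to expand
    - Step 3 inserts them into the macro body, yielding MACRO ()
    - Step 4 suppresses GLUE and scans MACRO () for macros, finding MACRO
        - Step 1 (nested) gets an empty token sequence for the argument
        - Step 2 scans that empty sequence and does nothing
        - Step 3 inserts into the macro body, yielding GLUE(A,B)
        - Step 4 scans GLUE(A,B) for macros, finding GLUE. It is suppressed, however, so it is left as is.
- So the final value for X after step 2 is GLUE(A,B) (notice that since we are not in step 4 of GLUE, in theory, it is not suppressed anymore)
- Step 3 inserts that into the body, giving GLUE(A,B)
- Step 4 suppresses EXPAND and scans GLUE(A,B) for more macros, finding GLUE (uuh)
    - Step 1 gets A and B for the arguments (oh no)
    - Step 2 does nothing with them
    - Step 3 substitutes them into the body, giving A B (well...)
    - Step 4 scans A B for macros, but finds nothing
- The final result is then A B

Which would be our dream. Sadly, the macro expands to GLUE(A,B).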
So our question is: Why?
Macro expansion is a complex process that is really only understandable by understanding the steps that occur.
1. When a macro with arguments is recognized (macro name token followed by ( token), the following tokens up to the matching ) are scanned and split (on , tokens). No macro expansion happens while this is happening (so the ,s and ) must be present in the input stream directly and cannot be in other macros).
2. Each macro argument whose name appears in the macro body not preceded by # or ## or followed by ## is "prescanned" for macros to expand -- any macros entirely within the argument will be recursively expanded before substituting into the macro body.
3. The resulting macro argument token streams are substituted into the body of the macro. Arguments involved in # or ## operations are modified (stringized or pasted) and substituted based on the original parser tokens from step 1 (step 2 does not occur for these).
4. The resulting macro body token stream is scanned again for macros to expand, but ignoring the macro currently being expanded. At this point further tokens in the input (after what was scanned and parsed in step 1) may be included as part of any macros recognized.

The important thing is that there are TWO DIFFERENT recursive expansions that occur (step 2 and step 4 above) and ONLY the one in step 4 ignores recursive macro expansions of the same macro. The recursive expansion in step 2 DOES NOT ignore the current macro, so can expand it recursively.
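So for your example above, let's see what happens. For the input

    EXPAND(TEST PARENTHESIS())

- step 1 sees that the macro is EXPAND and scans in argument list TEST PARENTHESIS() for X
- step 2 does not recognize TEST as a macro (no following "("), but does recognize PARENTHESIS:
    - step 1 (nested) gets an empty token sequence for the argument
    - step 2 scans that empty sequence and does nothing
    - step 3 inserts into the macro body () yielding just that: ()
    - step 4 scans () for macros and doesn't find any
- so the final value for X after step 2 is TEST ()
- step 3 inserts that into the body, giving TEST ()
- step 4 suppresses EXPAND and scans the result of step 3 for more macros, finding TEST
    - step 1 gets an empty sequence for the argument
    - step 2 does nothing
    - step 3 substitutes into the body giving EXPAND(0)
    - step 4 recursively expands that, suppressing TEST. At this point, both EXPAND and TEST are suppressed (due to being in the step 4 expansion), so nothing happens

Your other example EXPAND(TEST()) is different

- step 1: EXPAND is recognized as a macro, and TEST() is parsed as the argument X
- step 2: this stream is recursively parsed. Note that since this is step 2, EXPAND is NOT SUPPRESSED
    - step 1: TEST is recognized as a macro with an empty sequence argument
    - step 2: nothing (no macros in an empty token sequence)
    - step 3: substituted into the body giving EXPAND(0)
    - step 4: TEST is suppressed and the result recursively expanded
        - step 1: EXPAND is recognized as a macro (remember, at this point only TEST is suppressed by step 4 recursion -- EXPAND is in the step 2 recursion so is not suppressed) with 0 as its argument
        - step 2: 0 is scanned and nothing happens to it
        - step 3: substitute into the body giving 0
        - step 4: 0 is scanned again for macros (and again nothing happens)
- step 3: the 0 is substituted as the argument X into the body of the first EXPAND
- step 4: 0 is scanned again for macros (and again nothing happens)

so the final result here is 0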
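For the purposes of this situation, there are three relevant steps in macro replacement:

1. Perform macro replacement on the arguments.
2. Replace the macro with its definition, with parameters replaced by the arguments.
3. Rescan the result for further replacement, but suppressing the name of the macro being replaced.

In EXPAND(TEST PARENTHESIS()):

- Step 1 is performed on the argument of EXPAND, TEST PARENTHESIS():
    - TEST is not followed by parentheses, so it is not interpreted as a macro invocation.
    - PARENTHESIS() is a macro invocation, so the three steps are performed: The arguments are empty, so there is no processing for them. Then PARENTHESIS() is replaced by (). Then () is rescanned and no macros are found.
    - Step 1 is done, and we have EXPAND(TEST ()). (TEST () is not rescanned because it was not the result of any macro replacement.)
- Step 2: EXPAND(TEST ()) is replaced with TEST ().
- Step 3: TEST () is rescanned while suppressing EXPAND:
    - Step 1: The arguments of TEST are empty, so there is no processing.
    - Step 2: TEST () is replaced by EXPAND(0).
    - Step 3: EXPAND(0) is rescanned, but EXPAND is suppressed.

In EXPAND(TEST ()):

- Step 1 is performed on the argument of EXPAND:
    - Step 1: The arguments of TEST are empty, so there is no processing.
    - Step 2: TEST () is replaced by EXPAND(0).
    - Step 3: On rescanning, EXPAND(0) is replaced by 0.
- Step 2: EXPAND(TEST ()) has become EXPAND(0), and EXPAND(0) is replaced by 0.
- Step 3: 0 is rescanned for further macros, but there are none.

The other examples in the question follow similarly. It comes down to: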
- In TEST PARENTHESIS(), the lack of parentheses after TEST results in it not being expanded while processing arguments to an enclosing macro invocation.
- PARENTHESIS is expanded, but this is after TEST was scanned, and it is not rescanned during processing of the argument.
- After the enclosing macro is replaced, TEST is rescanned and is replaced then, but, at this time, the enclosing macro's name is suppressed.