Is # permitted in an object-like macro, and if so, what happens?
The C standard only defines the behaviour of # in a macro for function-like macros.
Sample code:
#include <stdio.h>

#define A X#Y
#define B(X) #X
#define C(X) B(X)

int main(void)
{
    printf(C(A) "\n");
}
gcc outputs X#Y, suggesting that it permits # to be present and performs no special processing. However, since the definition of the # operator does not define the behaviour in this case, is it actually undefined behaviour?
As you noticed, # only has a defined effect in function-like macros (§ 6.10.3.2/1; all references to the standard are to the C11 draft, N1570). To see what happens in object-like macros, we must look elsewhere.
A preprocessing directive of the form
# define identifier replacement-list new-line
defines an object-like macro that causes each subsequent instance of the macro name to be replaced by the replacement list of preprocessing tokens that constitute the remainder of the directive. [...]
§ 6.10.3/9
Therefore, the only question is whether # is allowed in a replacement-list. If so, it takes part in replacement as usual.
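As a small illustration of "replaced by the replacement list of preprocessing tokens", here is a minimal sketch. The names PLUS_ONE, EMPTY and add_one are hypothetical and exist only for this example. It shows that a replacement list is a plain token sequence, possibly empty, that is pasted in place of the macro name.

```c
/* Hypothetical names, for illustration only. */
#define PLUS_ONE + 1   /* replacement list: the two tokens + and 1 */
#define EMPTY          /* replacement list may be empty */

int add_one(int n)
{
    /* After macro replacement this line reads: return n + 1; */
    return n EMPTY PLUS_ONE;
}
```

Calling add_one(2) from a main function should give 3, since the preprocessor only substitutes tokens and the compiler then sees an ordinary expression.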
We find the syntax in § 6.10/1:
replacement-list:
pp-tokens (opt.)
pp-tokens:
preprocessing-token
pp-tokens preprocessing-token
Now, is # a valid preprocessing-token? § 6.4/1 says:
preprocessing-token:
header-name
identifier
pp-number
character-constant
string-literal
punctuator
each non-white-space character that cannot be one of the above
It's certainly not a header-name (§ 6.4.7/1), it's not allowed in identifier tokens (§ 6.4.2.1/1), nor is it a pp-number (which is basically any number in an allowed format, § 6.4.8/1), nor a character-constant (such as u'c', § 6.4.4.4/1) or a string-literal (exactly what you'd expect, e.g. L"String", § 6.4.5/1).
However, it is listed as a punctuator in § 6.4.6/1. Therefore, it is allowed in the replacement-list of an object-like macro and will be copied verbatim. It is now subject to rescanning as described in § 6.10.3.4. Let us look at your example:
The argument A of C(A) is fully macro-expanded before it is substituted (§ 6.10.3.1), so C(A) behaves like C(X#Y). The # has no special effect here, because it is not in the replacement-list of C but in its argument. C(X#Y) therefore becomes B(X#Y). Then B's argument is turned into a string literal via the # operator in B's replacement-list, yielding "X#Y".
Therefore, you don't have undefined behavior.
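The expansion steps can be checked with a minimal sketch like the following. The macro HASH and the functions direct, indirect and hash_only are hypothetical additions for illustration; A, B and C are the macros from the question. The sketch contrasts stringifying with and without the extra level of indirection.

```c
#define A X#Y      /* object-like: # is an ordinary punctuator here */
#define HASH #     /* hypothetical: a replacement list consisting only of # */
#define B(X) #X    /* function-like: # stringifies the parameter X */
#define C(X) B(X)  /* extra level, so the argument is expanded before B sees it */

/* B(A): X is the operand of #, so A is not expanded and is stringified as-is. */
const char *direct(void) { return B(A); }

/* C(A): A is expanded to X#Y first, then B stringifies those tokens. */
const char *indirect(void) { return C(A); }

/* C(HASH): HASH expands to the single token #, which B then stringifies. */
const char *hash_only(void) { return C(HASH); }
```

Under these definitions, direct() should return "A", indirect() should return "X#Y", and hash_only() should return "#". In each case the # coming from the object-like macro is just a token that is carried along; only the # in B's replacement-list acts as the stringification operator.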