Valid preprocessor tokens in macro concatenation

I am trying to understand C macros that use the preprocessor concatenation operator ##, but I realized that I have a problem with tokens. I thought it would be easy, but in practice it is not.

As I understand it, ## concatenates two tokens to create a new token, for example concatenating ( and ), or int and *.

I tried

#define foo(x,y) x ## y
foo(x,y)

Whenever I give it some arguments, I always get an error saying that pasting the two arguments does not give a valid preprocessing token.

For instance, why does foo(1,aa) result in 1aa (what type of token is that, and why is it valid?), while foo(int,*) gives an error?

Is there a way to know which tokens are valid, or is there a good link that would help me get this clear in my mind? (I have already searched Google and SO.)

What am I missing?

I will be grateful.

Sabrina asked Jan 17 '17 07:01


2 Answers

Preprocessor token concatenation is for generating new tokens, but it cannot paste arbitrary language constructs together (see, for example, the GCC documentation):

However, two tokens that don't together form a valid token cannot be pasted together. For example, you cannot concatenate x with + in either order.

So an attempt at a macro that makes a pointer out of a type like

#define MAKEPTR(NAME)  NAME ## *
MAKEPTR(int) myIntPtr;

is invalid, because int* is two tokens, not one.

The example from the documentation linked above, however, shows how new tokens are generated:

 #define COMMAND(NAME)  { #NAME, NAME ## _command }

 struct command commands[] =
 {
   COMMAND (quit),
   COMMAND (help),
   ...
 };

yields:

 struct command commands[] =
 {
   { "quit", quit_command },
   { "help", help_command },
   ...
 };

The token quit_command did not exist before; it was generated through token concatenation.
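To make the table above concrete, here is a minimal sketch of how it could be compiled and used. The struct layout, the int return type, the two handler bodies, and the run_command helper are assumptions made for illustration; only the COMMAND macro itself comes from the quoted example.

```c
#include <string.h>

/* Hypothetical handlers. Their names must end in _command so that the
   identifiers produced by NAME ## _command refer to real functions. */
static int quit_command(void) { return 0; }
static int help_command(void) { return 1; }

struct command {
    const char *name;       /* filled in by #NAME (stringification) */
    int (*function)(void);  /* filled in by NAME ## _command        */
};

#define COMMAND(NAME) { #NAME, NAME ## _command }

static const struct command commands[] = {
    COMMAND(quit),   /* { "quit", quit_command } */
    COMMAND(help),   /* { "help", help_command } */
};

/* Look a command up by name and call it; returns -1 if it is unknown. */
int run_command(const char *name)
{
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (strcmp(commands[i].name, name) == 0)
            return commands[i].function();
    }
    return -1;
}
```

Pasting works here because quit followed by _command is itself a single identifier, which is a valid token.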

Note that a macro of the form

#define MAKEPTR(TYPE)  TYPE*
MAKEPTR(int) myIntPtr;

is valid and actually generates a pointer type out of TYPE, e.g. int* out of int.
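As a small sketch of that macro in use (the function name makeptr_demo and the values are made up for illustration):

```c
#define MAKEPTR(TYPE) TYPE*

/* Writes through a pointer declared with MAKEPTR and returns the
   value that ends up in the pointed-to variable. */
int makeptr_demo(void)
{
    int value = 42;
    MAKEPTR(int) p = &value;  /* expands to: int* p = &value; */
    MAKEPTR(int) q = p;       /* expands to: int* q = p;      */
    *q = 7;                   /* modifies value through q     */
    return value;
}
```

One caveat with this approach: MAKEPTR(int) a, b; expands to int* a, b;, which declares a as a pointer but b as a plain int. A typedef avoids that surprise.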

Stephan Lechner answered Oct 22 '22 07:10


Since it seems to be a point of confusion, the string 1aa is a valid preprocessing token; it is an instance of pp-number, whose definition is (§6.4.8 of the C11 standard):

     pp-number:
            digit
            . digit
            pp-number       digit
            pp-number       identifier-nondigit
            pp-number       e sign
            pp-number       E sign
            pp-number       p sign
            pp-number       P sign
            pp-number       .

In other words, a pp-number starts with a digit or a . followed by a digit, and after that it can contain any sequence of digits, periods, "identifier-nondigits" (that is, letters, underscores, and other things which can be part of an identifier) or the letters e or p (either upper or lower-case) followed by a plus or minus sign.
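One way to see the result of a paste without requiring it to be a valid C constant is to stringify it. The sketch below assumes helper macros named PASTE, STR and XSTR and three illustrative functions; none of these names come from the question.

```c
#define PASTE(x, y) x ## y
#define STR(x)  #x
#define XSTR(x) STR(x)   /* expands its argument first, then stringifies */

/* 1 ## aa forms the pp-number 1aa; it only ever appears inside a string
   literal here, so it never has to be a valid numeric constant. */
const char *pasted_number(void)     { return XSTR(PASTE(1, aa)); }

/* var ## 1 forms the identifier var1. */
const char *pasted_identifier(void) { return XSTR(PASTE(var, 1)); }

/* 4 ## 2 forms the pp-number 42, which is also a valid integer constant. */
int pasted_constant(void)           { return PASTE(4, 2); }
```

Here pasted_number() yields the string "1aa", pasted_identifier() yields "var1", and pasted_constant() yields 42.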

That means that, for example, 0x1e+2 is a valid pp-number, while 0x1f+1 is not (it is three tokens). In a valid program, every pp-number which survives the preprocessing phases must satisfy the syntax of some numeric constant representation, which means that a program which includes the text 0x1e+2 will be considered invalid. The moral, if there is one, is that you should use whitespace generously; it has no cost.
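A short sketch of the difference (the function names are illustrative only):

```c
/* 0x1f+1 is three tokens (0x1f, +, 1), because f is not an exponent
   letter, so this compiles without any spaces. */
int three_tokens(void) { return 0x1f+1; }

/* Written without spaces, 0x1e+2 would be a single malformed pp-number,
   so the spaces around + are required here. */
int spaced_out(void)   { return 0x1e + 2; }
```

Both functions return 32.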

The intention of pp-number is to include everything which might eventually be a number in some future version of C. (Remember that numbers can be followed by alphabetic suffixes indicating types and signedness, such as 27LU.)

However, int* is not a valid preprocessor token. It is two tokens (as is -3) and so it cannot be formed with the token concatenation operator.
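By contrast, pasting two punctuators does work when the result is itself a single punctuator, such as <<, += or ->. The following is a sketch of that rule; PASTE, struct point and the function names are assumptions made for illustration.

```c
#define PASTE(x, y) x ## y

struct point { int x; int y; };

/* < ## < forms the single token <<. */
int shift_left(int v) { return v PASTE(<, <) 2; }

/* + ## = forms the single token +=. */
int add_assign(int v) { v PASTE(+, =) 3; return v; }

/* - ## > forms the single token ->. */
int via_arrow(const struct point *p) { return p PASTE(-, >) y; }
```

A practical rule of thumb: a paste is valid only if the two spellings, written next to each other with no space, would be read by the compiler as exactly one token.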

Another odd consequence of the token-pasting rule is that it is impossible to generate the valid token ... through token concatenation, because .. is not a valid token. (a##b##c must be evaluated in some order, so even if all three macro arguments are ., there must be an attempt to create the token .., which will fail in most compilers, although I believe Visual Studio accepts it.)

Finally, comment symbols /* and // are not tokens; comments are replaced with whitespace before the separation of the program text into tokens. So you cannot produce a comment with token-pasting either (at least, not in a compliant compiler).

rici answered Oct 22 '22 06:10