
Expansion of function-like macro creates a separate token

Tags: c, macros, c99

I just found out that gcc seems to treat the result of the expansion of a function-like macro as a separate token. Here is a simple example showing the behavior of gcc:

#define f() foo
void f()_bar(void);
void f()bar(void);
void f()-bar(void);

When I execute gcc -E -P test.c (running just the preprocessor), I get the following output:

void foo _bar(void);
void foo bar(void);
void foo-bar(void);

It seems like, in the first two definitions, gcc inserts space after the expanded macro to ensure it is a separate token. Is that really what is happening here?

Is this mandated by any standard (I couldn't find documentation on the topic)?

I want to make _bar part of the same token. Is there any way to do this? I could use the token concatenation operator ## but it will require several levels of macros (since in the real code f() is more complex). I was wondering if there is a simple (and probably more readable) solution.

martinkunev asked Aug 25 '15 15:08


2 Answers

It seems like, in the first two definitions, gcc inserts space after the expanded macro to ensure it is a separate token. Is that really what is happening here?

Yes.

Is this mandated by any standard (I couldn't find documentation on the topic)?

Yes, although an implementation would be allowed to insert more than one whitespace character to separate the tokens.

f()_bar

here you have 4 tokens after lexical analysis (they are actually pre-processor tokens at this stage but let's call them tokens): f, (, ) and _bar.

The function-like macro replacement semantics (as defined in C11, 6.10.3) replace the three tokens f, ( and ) with the single token foo. The replacement is not allowed to touch other tokens, so it cannot alter the following _bar token. When the result is written out as text, the implementation therefore has to insert at least one whitespace character to keep _bar a separate token; otherwise the output would read foo_bar, which is a single token.

The gcc preprocessor documentation somewhat covers this:

Once the input file is broken into tokens, the token boundaries never change, except when the ‘##’ preprocessing operator is used to paste tokens together. See Concatenation. For example,

#define foo() bar
foo()baz
     ==> bar baz
not
     ==> barbaz

In the other case, f()-bar, there are 5 tokens: f, (, ), - and bar. (- is a punctuator token in C, whereas the _ in _bar is simply a character of the identifier token.) The implementation does not have to insert a token separator (such as whitespace) here, because after macro replacement - and bar are still lexed as two separate tokens.

The gcc preprocessor (cpp) does not insert whitespace here simply because it does not have to. The cpp documentation on token spacing says (about a different issue):

However, we would like to keep space insertion to a minimum, both for aesthetic reasons and because it causes problems for people who still try to abuse the preprocessor for things like Fortran source and Makefiles.
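To see the same token boundaries in compiled code rather than in -E output, here is a minimal sketch. The typedef, the variables, and the function names are hypothetical and exist only for this illustration; only the macro f() comes from the question.

```c
#include <assert.h>

#define f() foo   /* same macro as in the question */

/* f()_bar: the expansion `foo` and the identifier `_bar` stay two tokens,
   so the statement below is parsed as the declaration `foo _bar = 7;`. */
int two_tokens_demo(void) {
    typedef int foo;   /* hypothetical type named foo, local to this demo */
    f()_bar = 7;
    return _bar;
}

/* f()-bar: `-` is a punctuator, so no separator is needed; this is foo - bar. */
int minus_demo(void) {
    int foo = 10, bar = 4;   /* hypothetical variables */
    return f()-bar;
}

void check_demos(void) {
    assert(two_tokens_demo() == 7);
    assert(minus_demo() == 6);
}
```

If the preprocessor merged foo and _bar into one identifier, the first function would not compile, because foo_bar is never declared.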

I didn't address the solution to your issue in this answer, but I think you have to use the operator that exists specifically for concatenating tokens: the ## token-pasting operator.
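A minimal sketch of the ## approach follows, assuming two helper macros named PASTE and PASTE_RAW (hypothetical names, not from the question). The extra level of indirection is what lets f() expand to foo before the paste happens; the int return type and the value 42 are also assumptions, chosen only so the result can be checked.

```c
#include <assert.h>

#define f() foo   /* stand-in for the more complex macro in the question */

/* PASTE_RAW pastes its arguments exactly as written. PASTE adds one level of
   indirection so that arguments such as f() are macro-expanded first. */
#define PASTE_RAW(a, b) a##b
#define PASTE(a, b) PASTE_RAW(a, b)

/* Expands to: int foo_bar(void) { return 42; } */
int PASTE(f(), _bar)(void) {
    return 42;   /* arbitrary value so the result can be checked */
}

void check_paste(void) {
    assert(foo_bar() == 42);
}
```

Calling PASTE_RAW(f(), _bar) directly would not work, because operands of ## are not macro-expanded before pasting; that is why the wrapper macro is needed.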

ouah answered Oct 08 '22 22:10


The only way I can think of (if you cannot use the token concatenation operator ##) is to use traditional (pre-standard) C preprocessing:

gcc -E -P -traditional-cpp test.c

Output:

void foo_bar(void);
void foobar(void);
void foo-bar(void);


David Ranieri answered Oct 08 '22 23:10