
C++ Preprocessor Standard Behaviour

I'm studying the C++ standard to learn the exact behaviour of the preprocessor (I need to implement some sort of C++ preprocessor). From what I understand, the example below, which I made up to aid my understanding, should be valid:

#define dds(x) f(x,
#define f(a,b) a+b
dds(eoe)
su)

I expect the first function-like macro invocation, dds(eoe), to be replaced by f(eoe, (note the comma at the end of the replacement list), which is then treated as f(eoe,su) when the input is rescanned.

But a test with VC++ 2010 gave me this (I told VC++ to output the preprocessed file):

eoe+et_leoe+et_l
su)

This is counter-intuitive and is obviously incorrect. Is it a bug in VC++ 2010, or am I misunderstanding the C++ standard? In particular, is it incorrect to put a comma at the end of the replacement list like I did? My understanding of the C++ standard grammar is that any preprocessing tokens are allowed there.

EDIT:

I don't have GCC or other versions of VC++. Could someone help me verify this with those compilers?

JavaMan asked Mar 23 '14 11:03



1 Answer

My answer is valid for the C preprocessor, but according to "Is a C++ preprocessor identical to a C preprocessor?", the differences are not relevant for this case.

From C, A Reference Manual, 5th edition:

When a functionlike macro call is encountered, the entire macro call is replaced, after parameter processing, by a copy of the body. Parameter processing proceeds as follows. Actual argument token strings are associated with the corresponding formal parameter names. A copy of the body is then made in which every occurrence of a formal parameter name is replaced by a copy of the actual parameter token sequence associated with it. This copy of the body then replaces the macro call. [...] Once a macro call has been expanded, the scan for macro calls resumes at the beginning of the expansion so that names of macros may be recognized within the expansion for the purpose of further macro replacement.

Note the words "within the expansion". Read literally, they would make your example invalid. UPDATE: that reading is too narrow; read the comments below and the rest of this answer. Now, combine it with this:

[...] The macro is invoked by writing its name, a left parenthesis, then one actual argument token sequence for each formal parameter, then a right parenthesis. The actual argument token sequences are separated by commas.

Basically, it all boils down to whether the preprocessor will rescan for further macro invocations only within the previous expansion, or if it will keep reading tokens that show up even after the expansion.

This may be hard to think about, but I believe that what should happen with your example is that the macro name f is recognized during rescanning, and since processing the subsequent tokens reveals a complete invocation of f, your example is correct and should output what you expect. GCC and clang give the correct output. By the same reasoning, this would also be valid (and yield the same output):

#define dds f
#define f(a,b) a+b

dds(eoe,su)

And indeed, the preprocessing output is the same in both examples. As for the output you get with VC++, I'd say you found a bug.

This is consistent with C99 section 6.10.3.4, as well as C++ standard section 16.3.4, Rescanning and further replacement:

After all parameters in the replacement list have been substituted and # and ## processing has taken place, all placemarker preprocessing tokens are removed. Then, the resulting preprocessing token sequence is rescanned, along with all subsequent preprocessing tokens of the source file, for more macro names to replace.

Filipe Gonçalves answered Sep 29 '22 09:09