Why is the C/C++ preprocessor adding a space here?

I have a small problem with the preprocessor that puzzles me, and I cannot find an explanation for it in the documentation or in the preprocessor/language specification.

#define booboo() aaa
booboo()bbb
booboo().bbb

is preprocessed into:

aaa bbb   <--- why is a space added here
aaa.bbb

After handling trigraphs, continued lines and comments, the preprocessor divides the input into preprocessing tokens and whitespace and processes the preprocessor directives. The replacement list of booboo consists of one pp-token, the identifier 'aaa'. The text booboo()bbb is divided into the pp-tokens 'booboo', '(', ')', 'bbb'. The sequence 'booboo', '(', ')' is recognised as a function-like macro invocation and should be expanded to 'aaa', so IMHO the output should look like 'aaabbb'. I say "look like" because, to a human, it would appear to be one token, whereas the compiler would still get the 2 tokens 'aaa' and 'bbb', since no '##' operator (which allows pp-token concatenation) was used. What rule makes cpp (the C preprocessor) place an additional space between 'aaa' and 'bbb', when 'booboo().bbb' results in 'aaa.bbb' without a space?

Is this because cpp tries to make its output (which is mostly for humans) unambiguous? A human cannot tell that 'aaabbb' is composed of 2 tokens, because only the tokens' spelling is visible. Am I right? I have read the C99 documentation about the preprocessor and gcc's documentation for cpp, and I see nothing about it.

If I am right we have similar situation here:

#define baba() +
baba()+
baba()-

results in:

+ +
+-

Otherwise (if the output were '++') it would look to a human like a single '++' token, although there would be 2 tokens, '+' and '+'. Is this like the '##' operator, where cpp checks whether concatenation produces a valid token, except that in the cases shown it wants to keep a human from thinking that concatenation was performed? '+-' is not ambiguous, hence no space is added.

Artur asked Jun 24 '15 08:06


1 Answer

Preprocessing transforms the source file into a list of tokens. In your case, after tokenization, the list of tokens would look like:

....
booboo
(
)
bbb
....

and then after macro replacement:

....
aaa
bbb
....

Then the compiler translates the list of tokens into an executable.

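The whitespace you are seeing is just an implementation detail: it is how your compiler has chosen to lay out the preprocessing tokens when displaying an intermediate result to you. Without the space, the text 'aaabbb' would read back as one identifier and '++' as one operator, whereas 'aaa.bbb' and '+-' still split into the original tokens, so they need no separator. The standards say nothing about intermediate files, and they do not require a separate program for preprocessing either.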

M.M answered Sep 19 '22 18:09