
What does '\' actually do in C?

As far as I know, a \ at the end of a line in C joins the next line to it, as if there were no line break.

Consider the following code:

main(){\
return 0;
}

When I look at the preprocessed output (gcc -E), it shows

main(){return
       0;
}

and not

main(){return 0;
}

What is the reason for this kind of behaviour? Also, how can I get the code I expected?

banarun asked Jun 25 '13 06:06


3 Answers

Yes, your expected result is the one required by the C and C++ standards. The backslash simply escapes the newline, i.e. the backslash-newline sequence is deleted.

GCC 4.2.1 from my OS X installation gives the expected result, as does Clang. Furthermore, adding a #define to the beginning and testing with

#define main(){\
return 0;
}
main()

yields the correct result:

}
{return 0;

Perhaps gcc -E does some extra processing after preprocessing and before outputting it. In any case, the line break seen by the rest of the preprocessor seems to be in the right place. So it's a cosmetic bug.

UPDATE: According to the GCC FAQ, -E (or the default setting of the cpp command) attempts to put output tokens in roughly the same visual location as input tokens. To get "raw" output, specify -P as well. This fixes the observed issues.

Probably what happened:

  1. In preserving visual appearance, tokens not separated by spaces are kept together.
  2. Line splicing happens before spaces are identified for the above.
  3. The { and return tokens are grouped into the same visual block.
  4. 0 follows a space and its location on the next line is duly noted.

PLUG: If this is really important to you, I have implemented my own preprocessor with a correct implementation of both raw-preprocessed and whitespace-preserving "pretty" modes. Following this discussion, I added line splices to the preserved whitespace. It's not really intended as a standalone tool, though. It's a testbed for a compiler framework which happens to be a fully compliant C++11 preprocessor library, which in turn happens to have a miniature command-line driver. (The error messages are on par with GCC's, or with Clang's minus the color, though.)

Potatoswatter answered Jan 02 '23 00:01


From K&R section A.12 Preprocessing:

A.12.2 Line Splicing

Lines that end with the backslash character \ are folded by deleting the backslash and the following newline character. This occurs before division into tokens.

jpw answered Jan 02 '23 02:01


It doesn't matter :/ The tokenizer will not see any difference. 1

Update, in response to the comments:

There seems to be a fair amount of confusion as to what the expected output of the preprocessor should be. My point is that the expectation /seems/ reasonable at a glance but doesn't actually need to be specified in this way for the output to be valid. The amount of whitespace present in the output is simply irrelevant to the parser. What matters is that the preprocessor should treat the continued line as one line while interpreting it.

In other words: the preprocessor is not a text transformation tool, it's a token manipulation tool.

If it matters to you, you're probably

  • using the preprocessor for something other than C/C++
  • treating C++ code as text, which is a ... code smell. (If you need to process code, a real parser is the better tool; libclang and various less complete parser libraries come to mind.)

1 (The preprocessor is free to achieve the specified result in whichever way it sees fit. The result you are seeing is possibly the most efficient way the implementors have found to implement this particular transformation.)

sehe answered Jan 02 '23 02:01