What does '\' actually do in C?

Question

As far as I know, a \ at the end of a line in C just joins the next line onto it, as if there were no line break.

Consider the following code:

main(){\
return 0;
}

When I looked at the preprocessed output (gcc -E), it showed

main(){return
       0;
}

and not

main(){return 0;
}

What is the reason for this kind of behaviour? Also, how can I get the code I expected?

Potatoswatter · Accepted Answer

Yes, your expected result is the one required by the C and C++ standards. The backslash simply escapes the newline, i.e. the backslash-newline sequence is deleted.
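To make "the backslash-newline sequence is deleted" concrete, here is a small sketch in which the splice lands in the middle of a token and in the middle of a string literal. The function names are made up for illustration. Note that there must be nothing, not even a space, between the backslash and the end of the line.

```c
/* The backslash-newline sits in the middle of the number. After
   splicing, the compiler sees the single token 42. */
int spliced_number(void)
{
    return 4\
2;
}

/* The same deletion happens inside a string literal. The continuation
   line starts in column 1 because any leading spaces there would
   become part of the string. */
const char *spliced_string(void)
{
    return "hello, \
world";
}
```

Here spliced_number() returns 42 and spliced_string() returns "hello, world", because the splice is removed before the source is divided into tokens.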

GCC 4.2.1 from my OS X installation gives the expected result, as does Clang. Furthermore, adding a #define to the beginning and testing with

#define main(){\
return 0;
}
main()

yields the correct result

}
{return 0;

Perhaps gcc -E does some extra processing after preprocessing and before outputting it. In any case, the line break seen by the rest of the preprocessor seems to be in the right place. So it's a cosmetic bug.

UPDATE: According to the GCC FAQ, -E (or the default setting of the cpp command) attempts to put output tokens in roughly the same visual location as input tokens. To get "raw" output, specify -P as well. This fixes the observed issues.

Probably what happened:

In preserving visual appearance, tokens not separated by spaces are kept together.
Line splicing happens before spaces are identified for the above.
The { and return tokens are grouped into the same visual block.
0 follows a space and its location on the next line is duly noted.
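The guess above can be modelled with a toy printer. This is emphatically not GCC's code: struct token, print_with_layout and the rule "start a new output line only for a token that was preceded by whitespace and sits on a different physical line" are all assumptions made up to mirror the four steps.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token record: where the token sat in the original file
   and whether whitespace preceded it AFTER line splicing. */
struct token {
    const char *text;
    int line;          /* physical source line, 1-based */
    int col;           /* physical source column, 1-based */
    int space_before;  /* nonzero if whitespace preceded the token */
};

/* Appends s to out if it fits in cap bytes; returns the new length. */
static size_t append(char *out, size_t len, size_t cap, const char *s)
{
    size_t n = strlen(s);
    if (len + n < cap) {
        memcpy(out + len, s, n + 1);
        len += n;
    }
    return len;
}

/* Prints tokens while trying to keep their original visual position. */
size_t print_with_layout(const struct token *toks, size_t count,
                         char *out, size_t cap)
{
    size_t len = 0;
    int cur_line = count ? toks[0].line : 0;

    if (cap)
        out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        if (i > 0 && toks[i].space_before) {
            if (toks[i].line != cur_line) {
                /* Token moved to a new physical line: break the line
                   and pad out to its original column. */
                len = append(out, len, cap, "\n");
                for (int c = 1; c < toks[i].col; c++)
                    len = append(out, len, cap, " ");
                cur_line = toks[i].line;
            } else {
                len = append(out, len, cap, " ");
            }
        }
        /* Tokens with no preceding whitespace are glued to the
           previous one, wherever they were in the file. */
        len = append(out, len, cap, toks[i].text);
    }
    return len;
}
```

Feed it the eight tokens of the question's program, with return on line 2 but marked as not preceded by whitespace (the splice ate the newline) and 0 at line 2, column 8. The model then prints main(){return, followed by 0; indented by seven spaces on the next line, followed by } on its own line, which is the layout the question shows.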

PLUG: If this is really important to you, I have implemented my own preprocessor with correct implementation of both raw-preprocessed and whitespace-preserving "pretty" modes. Following this discussion I added line splices to the preserved whitespace. It's not really intended as a standalone tool, though. It's a testbed for a compiler framework which happens to be a fully compliant C++11 preprocessor library, which happens to have a miniature command-line driver. (The error messages are on par with GCC, or Clang, sans color, though.)

jpw · Answer

From K&R section A.12 Preprocessing:

A.12.2 Line Splicing

Lines that end with the backslash character \ are folded by deleting the backslash and the following newline character. This occurs before division into tokens.
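As a rough illustration of the rule quoted above, the folding step can be written as a simple filter over a buffer. This is only a sketch: splice_lines is a made-up name, and it ignores details a real preprocessor handles, such as trigraphs and \r\n line endings.

```c
#include <stddef.h>

/* Copies src to dst, deleting every backslash that is immediately
   followed by a newline, together with that newline. dst must have
   room for strlen(src) + 1 bytes. Returns the number of bytes
   written, not counting the terminating NUL. */
size_t splice_lines(const char *src, char *dst)
{
    size_t out = 0;

    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == '\n') {
            src += 2;              /* drop the backslash-newline pair */
        } else {
            dst[out++] = *src++;   /* everything else is kept as-is */
        }
    }
    dst[out] = '\0';
    return out;
}
```

Applied to the text of the program in the question, this leaves main(){return 0; on the first line and } on the second. A backslash that is not directly followed by a newline is copied through unchanged.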

sehe · Answer

It doesn't matter :/ The tokenizer will not see any difference. ¹

Update, in response to the comments:

There seems to be a fair amount of confusion as to what the expected output of the preprocessor should be. My point is that the expectation /seems/ reasonable at a glance but doesn't actually need to be specified in this way for the output to be valid. The amount of whitespace present in the output is simply irrelevant to the parser. What matters is that the preprocessor should treat the continued line as one line while interpreting it.

In other words: the preprocessor is not a text transformation tool, it's a token manipulation tool.
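One way to see this from inside the language is the # stringizing operator, which turns any run of whitespace between an argument's tokens into a single space. In the sketch below (STR and the two function names are made up for illustration), both layouts from the question are passed through it.

```c
/* Stringizes its argument; whitespace between the argument's tokens
   collapses to one space, and a newline counts as whitespace. */
#define STR(x) #x

/* The layout the asker expected. */
const char *one_line_layout(void)
{
    return STR(main(){return 0;});
}

/* The layout gcc -E printed. */
const char *two_line_layout(void)
{
    return STR(main(){return
       0;});
}
```

Both functions return the same string, "main(){return 0;}", so as far as the preprocessor's own token stream is concerned the two layouts are indistinguishable.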

If it matters to you, you're probably

using the preprocessor for something other than C/C++
treating C++ code as text, which is a ... code smell. (libclang and various less complete parser libraries come to mind).

¹ (The preprocessor is free to achieve the specified result in whichever way it sees fit. The result you are seeing is possibly the most efficient way the implementors have found to implement this particular transformation)

Tags:

c

c-preprocessor

gcc

banarun
