
Why does this C program compile without an error?

Tags:

c

I'm a beginner in C, and I was experimenting with it. I typed some C code like this:

#include <stdio.h>
int main()
{
    printf("hello world\n"); 
    \
    return 0;
}

Even though I used \ knowingly, the C compiler doesn't throw any error. What is this symbol used for in the C language?

Edit:

Even this works:

"\n";
Ant's asked Apr 06 '12 15:04



1 Answer

The sequence backslash-newline is removed from the code in a very early phase (phase 2) of the translation process. It used to be how you created long string literals before there was string concatenation, and is how you still extend macros over multiple lines.

See §5.1.1.2 Translation Phases of the C99 standard:

The precedence among the syntax rules of translation is specified by the following phases (footnote 5).

  1. Physical source file multibyte characters are mapped, in an implementation defined manner, to the source character set (introducing new-line characters for end-of-line indicators) if necessary. Trigraph sequences are replaced by corresponding single-character internal representations.
  2. Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character before any such splicing takes place.
  3. The source file is decomposed into preprocessing tokens (footnote 6) and sequences of white-space characters (including comments). A source file shall not end in a partial preprocessing token or in a partial comment. Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of white-space characters other than new-line is retained or replaced by one space character is implementation-defined.
  4. Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. If a character sequence that matches the syntax of a universal character name is produced by token concatenation (6.10.3.3), the behavior is undefined. A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively. All preprocessing directives are then deleted.
  5. Each source character set member and escape sequence in character constants and string literals is converted to the corresponding member of the execution character set; if there is no corresponding member, it is converted to an implementation-defined member other than the null (wide) character (footnote 7).
  6. Adjacent string literal tokens are concatenated.
  7. White-space characters separating tokens are no longer significant. Each preprocessing token is converted into a token. The resulting tokens are syntactically and semantically analyzed and translated as a translation unit.
  8. All external object and function references are resolved. Library components are linked to satisfy external references to functions and objects not defined in the current translation. All such translator output is collected into a program image which contains information needed for execution in its execution environment.

5) Implementations shall behave as if these separate phases occur, even though many are typically folded together in practice.

6) As described in 6.4, the process of dividing a source file’s characters into preprocessing tokens is context-dependent. For example, see the handling of < within a #include preprocessing directive.

7) An implementation need not convert all non-corresponding source characters to the same execution character.

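If you had a blank or any other character after your stray backslash, the pair would no longer form a line splice, and a lone backslash is not a valid token, so you would normally get a compilation error. (Some compilers, GCC among them, tolerate trailing white space after the backslash and only issue a warning.) Since your program compiles cleanly, the backslash must be the last character on its line.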


The other part of your question, about:

"\n";

is quite different. It is an expression statement whose expression has no side effects and therefore no effect on the program. The optimizer will completely discard it. When you write:

i = 1;

you have an expression with a value that is discarded; it is evaluated for its side-effect of modifying i.

Sometimes, you'll find code like:

*ptr++;

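With warnings enabled, the compiler will usually tell you that the result of the dereference is discarded. Because the statement is parsed as *(ptr++), only the increment has an effect, and the expression can be simplified to: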

ptr++;

and will achieve the same effect in the program.

Jonathan Leffler answered Oct 24 '22 06:10
