Is it mandatory to escape tabulator characters in C and C++?

Tags:

c++

c

In C and C++ (and several other languages), horizontal tabulators (ASCII code 9) in character and string constants are denoted in escaped form as '\t' and "\t". However, I regularly type the unescaped tabulator character in string literals, for example in "A B" (there is a TAB between A and B), and at least clang++ does not seem to mind: the string appears to be equivalent to "A\tB". I prefer the unescaped version because long, indented multi-line strings are more readable in the source code.

Now I am wondering whether this is generally legal in C and C++ or just supported by my compiler. How portable are unescaped tabulators in character and string constants?

Surprisingly, I could not find an answer to this seemingly simple question, either with Google or on Stack Overflow (I only found a vaguely related question).

tglas asked Mar 06 '15 14:03



2 Answers

Yes, you can include a tab character in a string or character literal, at least according to C++11. The allowed characters include:

any member of the source character set except the double-quote ", backslash \, or new-line character

(from C++11 standard, annex A.2)

and the source character set includes:

the space character, the control characters representing horizontal tab, vertical tab, form feed, and new-line, plus the following 91 graphical characters

(from C++11 standard, paragraph 2.3.1)

UPDATE: I've just noticed that you're asking about two different languages. For C99, the answer is also yes. The wording is different, but basically says the same thing:

In a character constant or string literal, members of the execution character set shall be represented by corresponding members of the source character set or [...]

where both the source and execution character sets include

control characters representing horizontal tab, vertical tab, and form feed.
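
To make the escaped form concrete, here is a minimal sketch. The helper name `count_tabs` is my own, purely for illustration. It only uses the `\t` escape, so it does not depend on a tab keystroke surviving copy and paste; per the wording quoted above, a literal typed with a raw tab between A and B should produce the same three characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the question): counts the horizontal
// tab characters contained in a string.
std::size_t count_tabs(const std::string& s) {
    std::size_t n = 0;
    for (char c : s) {
        if (c == '\t') {
            ++n;
        }
    }
    return n;
}

// The literal "A\tB" holds three characters: 'A', a horizontal tab, 'B'.
// A literal written with a space instead, "A B", holds no tab at all.
```

Calling `count_tabs("A\tB")` gives 1, while `count_tabs("A B")` gives 0, which is a quick way to check what actually ended up in a literal.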

Mike Seymour answered Oct 18 '22 23:10


It's completely legal to put a tab character directly into a character string or character literal. The C and C++ standards require the source character set to include a tab character, and string and character literals may contain any character in the source character set except backslash, quote or apostrophe (as appropriate) and newline.

So it's portable. But it is not a good idea, since there is no way a reader can distinguish between different kinds of whitespace. It is also quite common for text editors, mail programs, and the like to reformat tabs, so bugs may be introduced into the program in the course of such operations.
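
As an illustration of the readability problem, the following sketch spells out whitespace so that a tab and a run of spaces can be told apart. The function name `show_whitespace` and the `<TAB>`/`<SP>` markers are arbitrary choices of mine, not anything from the standards.

```cpp
#include <string>

// Hypothetical helper: returns a copy of `s` in which each horizontal
// tab is replaced by "<TAB>" and each space by "<SP>". All other
// characters are copied unchanged.
std::string show_whitespace(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '\t': out += "<TAB>"; break;
            case ' ':  out += "<SP>";  break;
            default:   out += c;       break;
        }
    }
    return out;
}
```

With this, `show_whitespace("A\tB")` yields `A<TAB>B`, whereas a literal in which an editor silently turned the tab into four spaces yields `A<SP><SP><SP><SP>B`. In the source file both literals can look the same.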

rici answered Oct 19 '22 00:10