
Carriage return + newline in raw string literals?

Consider a C++ file that has UNIX line endings (i.e. "\x0a" instead of "\x0d\x0a") and includes the following raw string literal:

const char foo[] = R"(hello^M
)";

(where ^M is the actual byte 0x0d, i.e. a carriage return).

What should be the result of the following string comparison (when taking the standard's definition of raw string literals into account)?

strcmp("hello\r\n", foo);

Should the strings compare equal or not (i.e. should the result be 0 or non-zero)?

With GCC 4.8 (on Fedora 19) they compare unequal.

Is this a bug or a feature in GCC?

maxschlepzig asked Apr 05 '14 07:04


1 Answer

As far as the standard is concerned, you can only use members of the basic source character set in string literals (and elsewhere in the program). How the physical representation of the program is mapped to the basic source character set is implementation-defined.

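g++ apparently treats ASCII \x0A, ASCII \x0D, and ASCII \x0D\x0A as equally valid representations of the member of the basic source character set called "newline". In your example the ^M and the line feed that follows it therefore collapse into a single newline, so foo holds "hello\n" and the comparison with "hello\r\n" is non-zero. That is totally reasonable, given that it is desirable for source code transferred between Windows, Unix and classic Mac OS machines to keep its meaning.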
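To make the consequence concrete, here is a minimal sketch, assuming a C++11 compiler that maps the source file's line endings to a single newline (as g++ is described as doing above). The names raw_with_source_newline, explicit_crlf, raw_plus_crlf and the two helper functions are made up for this illustration. It shows that a line break inside a raw string literal comes out as "\n", and that a carriage return you actually need is more safely spelled with an escape sequence in an ordinary literal.

```cpp
#include <cassert>
#include <cstring>

// Raw string literal whose line break is a real line ending in the source
// file. The mapping of that line ending to a new-line character is
// implementation-defined; mainstream compilers yield a single '\n'.
const char raw_with_source_newline[] = R"(hello
)";

// An explicit CR LF written with escape sequences in an ordinary literal.
// This does not depend on how the source file itself is stored.
const char explicit_crlf[] = "hello\r\n";

// Raw part for the text, ordinary part for the line ending. Adjacent string
// literals are concatenated, and raw and ordinary literals may be mixed.
const char raw_plus_crlf[] = R"(hello)" "\r\n";

// True if the raw literal's line break became exactly one '\n'.
bool raw_newline_is_lf() {
    return std::strcmp(raw_with_source_newline, "hello\n") == 0;
}

// True if the concatenated literal really contains "\r\n".
bool concatenation_keeps_cr() {
    return std::strcmp(raw_plus_crlf, explicit_crlf) == 0;
}

// Convenience check that can be called from main() or a test harness.
void check_line_ending_handling() {
    assert(raw_newline_is_lf());
    assert(concatenation_keeps_cr());
    assert(std::strcmp(raw_with_source_newline, explicit_crlf) != 0);
}
```

Calling check_line_ending_handling() from main() should pass silently on such a compiler. If the exact bytes matter (for example when building network protocol messages), the escape-sequence spelling avoids relying on editor settings or on how version control converts line endings.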

n. 1.8e9-where's-my-share m. answered Sep 19 '22 09:09