I created a new project with the following code segment:
char* strange = "(Strange??)";
cout << strange << endl;
resulting in the following output:
(Strange]
Thus translating '??)' -> ']'
Debugging it shows that my char* string literal is actually that value and it's not a stream translation. This is obviously not a meta-character sequence I've ever seen. Some sort of Unicode or wide char sequence perhaps? I don't think so however... I've tried disabling all related project settings to no avail.
Anyone have an explanation?
C and C++ have character literals and string literals, each with Unicode-prefixed variants; string literals also have a raw form. There is also a multi-character literal that contains more than one c-char. In C++, a single c-char literal has type char; a multi-character literal is conditionally supported, has type int, and has an implementation-defined value.
A "string literal" is a sequence of characters from the source character set enclosed in double quotation marks (" "). String literals are used to represent a sequence of characters which, taken together, form a null-terminated string. You must always prefix wide-string literals with the letter L.
char *amessage = "This is a string literal.";
This question (about the closely related digraphs) has the answer. It boils down to the fact that the ISO 646 character set doesn't have all the characters of the C syntax, so there are some systems with keyboards and displays that can't deal with the characters (though I imagine that these are quite rare nowadays).
What you're seeing is called a trigraph.
In written language by grown-ups, one question mark is sufficient for any situation. Don't use more than one at a time and you'll never see this again.
GCC ignores trigraphs by default because hardly anyone uses them intentionally. Enable them with the -trigraphs option, or tell the compiler to warn you about them with the -Wtrigraphs option.
Visual C++ 2010 also disables them by default and offers /Zc:trigraphs to enable them. I can't find anything about ways to enable or disable them in prior versions.
Easy way to avoid the trigraph surprise: split a "??" string literal in two:
char* strange = "(Strange??)";
char* strange2 = "(Strange?" "?)";
/*                         ^^^ no punctuation */
Edit
gcc has an option to warn about trigraphs: -Wtrigraphs (also enabled with -Wall)
end edit
Quotes from the Standard
5.2.1.1 Trigraph sequences
1 Before any other processing takes place, each occurrence of one of the following sequences of three characters (called trigraph sequences) is replaced with the corresponding single character.
??= #    ??) ]    ??! |
??( [    ??' ^    ??> }
??/ \    ??< {    ??- ~
No other trigraph sequences exist. Each ? that does not begin one of the trigraphs listed above is not changed.
5.1.1.2 Translation phases
1 The precedence among the syntax rules of translation is specified by the following phases.
1. Physical source file multibyte characters are mapped, in an implementation-defined manner, to the source character set (introducing new-line characters for end-of-line indicators) if necessary. Trigraph sequences are replaced by corresponding single-character internal representations.
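To make the quoted rule concrete, here is a rough sketch of what that first translation phase does to the source text. This is not how any real compiler implements it; the function names trigraph_replacement and replace_trigraphs are made up for illustration. The comments deliberately avoid spelling out any trigraph, since a compiler with trigraphs enabled would translate them there too.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: given the third character of a candidate trigraph
// (the character after two question marks), return the single character it
// stands for, or '\0' if the three characters do not form a trigraph.
char trigraph_replacement(char third) {
    switch (third) {
        case '=':  return '#';
        case '(':  return '[';
        case '/':  return '\\';
        case ')':  return ']';
        case '\'': return '^';
        case '<':  return '{';
        case '!':  return '|';
        case '>':  return '}';
        case '-':  return '~';
        default:   return '\0';
    }
}

// Hypothetical sketch of the replacement pass: scan left to right and
// replace each of the nine trigraph sequences with its single character.
// A question mark that does not begin a trigraph is copied unchanged.
std::string replace_trigraphs(const std::string& src) {
    std::string out;
    out.reserve(src.size());
    std::size_t i = 0;
    while (i < src.size()) {
        if (i + 2 < src.size() && src[i] == '?' && src[i + 1] == '?') {
            const char r = trigraph_replacement(src[i + 2]);
            if (r != '\0') {
                out += r;   // three source characters become one
                i += 3;
                continue;
            }
        }
        out += src[i];
        ++i;
    }
    return out;
}
```

Fed the text of the literal from the question, this turns the two question marks and the closing parenthesis into a single ], which matches the output the asker saw. Because the pass runs before tokenization, it does not care that the characters sit inside a string literal.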
It's a Trigraph!
??) is a trigraph.
That's trigraph support. You can prevent trigraph interpretation by escaping one of the question marks (\? is a standard escape sequence for ?):
char* strange = "(Strange?\?)";
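As a quick sanity check on the two workarounds mentioned in this thread (escaping a question mark, and splitting the literal in two), the sketch below compares the resulting strings. The function names escaped, split, and same_text are only for illustration, and const char* is used because modern C++ does not allow binding a string literal to a plain char*.

```cpp
#include <cstring>

// Workaround 1: escape one question mark. The escape \? denotes a plain '?',
// so the source never contains two adjacent question marks.
const char* escaped() { return "(Strange?\?)"; }

// Workaround 2: split the literal. Adjacent string literals are concatenated
// in a later translation phase, after trigraph replacement has already run.
const char* split() { return "(Strange?" "?)"; }

// Both spellings should produce the same 11-character string.
bool same_text() { return std::strcmp(escaped(), split()) == 0; }
```

Either form should give the same bytes whether or not the compiler has trigraph support turned on, so the choice between them is mostly a matter of taste.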