It's pretty common to use macros and token concatenation to switch between wide and narrow strings at compile time.
#define _T(x) L##x
const wchar_t *wide1 = _T("hello");
const wchar_t *wide2 = L"hello";
And in C++11 it should be valid to concoct a similar thing with raw strings:
#define RAW(x) R##x
const char *raw1 = RAW("(Hello)");
const char *raw2 = R"(Hello)";
Since macro expansion and token concatenation happen before escape sequence substitution, this should prevent escape sequences from being replaced in the quoted string.
But how does this apply to trigraphs? Are raw strings formed through concatenation with normal strings still subject to having their trigraph substitutions reverted?
const char *trigraph = RAW("(??=)"); // Is this "#" or "??="?
No, the trigraph is not reverted in your example.
[lex.phases]p1 identifies three phases of translation relevant to your question:
1. Trigraph sequences are replaced by corresponding single-character internal representations.
3. The source file is decomposed into preprocessing tokens.
4. Macro invocations are expanded.
Phase 1 is defined by [lex.trigraph]p1. At this stage, your code is translated to const char *trigraph = RAW("(#)").
Phase 3 is defined by [lex.pptoken]. This is the stage where trigraphs are reverted in raw string literals. Paragraph 3 says:
If the next character begins a sequence of characters that could be the prefix and initial double quote of a raw string literal, such as R", the next preprocessing token shall be a raw string literal. Between the initial and final double quote characters of the raw string, any transformations performed in phases 1 and 2 (trigraphs, universal-character-names, and line splicing) are reverted.
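That is not the case in your example: when phase 3 reaches this point, the next characters are RAW( rather than R", so no raw string literal is recognized and the trigraph is not reverted. Your code is transformed into the preprocessing-token sequence const char * trigraph = RAW ( "(#)" )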
Finally, in phase 4, the RAW macro is expanded and the token-paste occurs, resulting in the following sequence of preprocessing-tokens: const char * trigraph = R"(#)". The r-char-sequence of the string literal comprises a #. Phase 3 has already occurred, and there is no other point at which reversion of trigraphs occurs.
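If you want to check this on your own compiler, here is a minimal sketch that compares a raw string written directly with one built by the RAW macro from the question. The helper literal_length and the three constant names are hypothetical, introduced only for this illustration. It also assumes your compiler accepts pasting R onto an ordinary string literal. The result for the pasted trigraph depends on whether trigraph support is enabled at all (it was removed from the language in C++17, and some compilers disable it by default), so that value is left unasserted.

```cpp
#include <cstddef>

// RAW is the macro from the question: it pastes R onto an ordinary string token.
#define RAW(x) R##x

// Hypothetical helper: length of a string literal without its terminating null.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) { return N - 1; }

// Written directly: phase 3 recognizes R" and reverts any trigraph replacement,
// so the contents are the three characters ??= whether or not trigraphs are enabled.
constexpr std::size_t direct_len = literal_length(R"(??=)");

// Built by pasting: phase 3 sees an ordinary string literal, so nothing is reverted.
// Expect 1 (a single #) when trigraphs are enabled, 3 when they are disabled.
constexpr std::size_t pasted_len = literal_length(RAW("(??=)"));

// Escape sequences are handled after phase 4, so a pasted raw string keeps
// the backslash and the n as two separate characters: a, \, n, b.
constexpr std::size_t escape_len = literal_length(RAW("(a\nb)"));

static_assert(direct_len == 3, "directly written raw string keeps ??=");
static_assert(escape_len == 4, "escape sequence is not substituted after pasting");
```

Compiling this is the test: the two static_assert lines should hold in any mode, and you can print pasted_len (or add your own static_assert on it) to see which behaviour your compiler and flags give for the trigraph case.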