
What is the proper format of writing raw strings with '$' in C++?

Tags:

c++

string

I'm learning about raw strings in C++ from a cplusplus.com tutorial on constants. Based on the definition on that site, a raw string should start with R"sequence( and end with )sequence where sequence can be any sequence of characters.

One of the examples on the website is the following:

R"&%$(string with \backslash)&%$"

However, when I try to compile the code that contains the above raw string, I get a compilation error.

test.cpp:5:28: error: invalid character '$' in raw string delimiter
    5 |     std::string str = R"&%$(string with \backslash)&%$";
      |                       ^
test.cpp:5:23: error: stray 'R' in program

I tried it with g++ and clang++ on both Windows and Linux. None of these combinations worked.

388

asked Feb 27 '21 16:02

Amirreza A.

3 Answers

From C++ reference:

delimiter: A character sequence made of any source character but parentheses, backslash and spaces (can be empty, and at most 16 characters long)

Note the "any source character" part here.

Let us look at what the standard says:

From [gram.lex]:

raw-string:
" d-char-sequence_opt ( r-char-sequence_opt ) d-char-sequence_opt "

...

d-char-sequence:
d-char
d-char-sequence d-char

d-char:
any member of the basic source character set except: space, the left parenthesis (, the right parenthesis ), the backslash \, and the control characters representing horizontal tab, vertical tab, form feed, and newline.

Well, what is the basic source character set? From [lex.charset]:

The basic source character set consists of 96 characters: the space character, the control characters representing horizontal tab, vertical tab, form feed, and new-line, plus the following 91 graphical characters:

a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '

... which does not include $; so the conclusion is that the dollar sign $ cannot be part of the delimiter sequence.

148

answered Oct 18 '22 03:10

ph3rin

For the basic source character set, see lex.charset 5.3 (1): that set does not contain the $ character. For the characters allowed in the delimiter of a raw string literal, see lex.string 5.13.5: "/…/ any member of the basic source character set except: space, the left parenthesis (, the right parenthesis ), the backslash \, and the control characters representing horizontal tab, vertical tab, form feed, and newline."

answered Oct 18 '22 03:10

heap underrun

Just remove the $ from the delimiter, as in the code below:

string string3 = R"&%(string with \backslash)&%";

The $ causes an error because the basic source character set does not include it, as noted in the comments.

The individual bytes of the source code file are mapped (in implementation-defined manner) to the characters of the basic source character set. In particular, OS-dependent end-of-line indicators are replaced by newline characters. The basic source character set consists of 96 characters:

a) 5 whitespace characters (space, horizontal tab, vertical tab, form feed, new-line)

b) 10 digit characters from '0' to '9'

c) 52 letters from 'a' to 'z' and from 'A' to 'Z'

d) 29 punctuation characters: _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '

2) Any source file character that cannot be mapped to a character in the basic source character set is replaced by its universal character name (escaped with \u or \U) or by some implementation-defined form that is handled equivalently.


answered Oct 18 '22 02:10

Rohith V
