
Do std::strings end in '\0' when initialized with a string literal?

I know std::string objects aren't null-terminated, so why does this work?

std::string S("Hey");
for(int i = 0; S[i] != '\0'; ++i)
   std::cout << S[i];

So the constructor copies the null terminator as well, but does not increment the length? Why does it bother?

nek28 asked Nov 21 '16 09:11


2 Answers

Since C++11, std::string stores its data internally as a null-terminated C-string, but the terminator is not part of the string's contents: size() does not count it, and iterating over the string never reaches it.

For example, if I assign the value "Hello, World!" to a string, the internal buffer will look like this:

std::string myString("Hello, World!");

// Internal Buffer...
// [ H | e | l | l | o | , |   | W | o | r | l | d | ! | \0 ]
//                                                       ^ Null terminator.

In this example, the null terminator was NOT copied from the end of the string literal, but added internally by std::string.

As @songyuanyao mentions in his answer, the result of this is that myString[myString.size()] returns '\0'.

So why does std::string assign a null terminator to the end of the string? It certainly doesn't have to support one, because you can add '\0' to a string and it is included in the string:

std::string myString;
myString.size();              // 0
myString.push_back('\0');
myString.size();              // 1

The reason for this behavior is to support the std::string::c_str() function. The c_str() function is required to return a null-terminated const char *. The most efficient way to do this is to simply return a pointer to the internal buffer, but in order to do that the internal buffer must include a null terminator character at the end of the string. Since C++11, strings are required to include the null terminator to support this.
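To make the c_str() point concrete, here is a minimal sketch, assuming a C++11 or later compiler. The helper names has_terminator_after_contents and c_length are made up for illustration and are not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: true if the buffer returned by c_str() has a '\0'
// immediately after the last character of the string.
bool has_terminator_after_contents(const std::string& s) {
    const char* p = s.c_str();
    // Since C++11, data() and c_str() refer to the same buffer.
    assert(p == s.data());
    return p[s.size()] == '\0';
}

// Hypothetical helper: the length a C API would see, since strlen stops
// at the first '\0' it finds.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For std::string s("Hey"), has_terminator_after_contents(s) is true and c_length(s) equals s.size(), which is 3. After s.push_back('\0') and s.append("Jude"), s.size() is 8 while c_length(s) is still 3, because strlen stops at the embedded null.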

P.S. While not strictly part of your question, it should be pointed out that the loop from your question will not print the full string if the string contains embedded null characters:

std::string S("Hey");
S.push_back('\0');
S.append("Jude");

for(int i = 0; S[i] != '\0'; ++i)
    std::cout << S[i];

// Only "Hey" is printed!
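If the string may contain embedded nulls, iterating over its size() rather than looking for a '\0' sentinel visits every character. A small sketch of the difference, assuming C++11 or later; count_until_null and escape_nulls are hypothetical helper names chosen for this example.

```cpp
#include <cassert>   // used by the checks in the usage note
#include <cstddef>
#include <string>

// Hypothetical helper: counts characters the way the loop above does,
// stopping at the first '\0'. Since C++11, s[s.size()] is '\0', so the
// loop always terminates.
std::size_t count_until_null(const std::string& s) {
    std::size_t i = 0;
    while (s[i] != '\0') {
        ++i;
    }
    return i;
}

// Hypothetical helper: walks every character in [0, size()) with a
// range-based for loop and renders each embedded '\0' as the two
// characters backslash and zero so it is visible when printed.
std::string escape_nulls(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '\0') {
            out += "\\0";
        } else {
            out += c;
        }
    }
    return out;
}
```

With S built as above ("Hey", then '\0', then "Jude"), count_until_null(S) gives 3, S.size() gives 8, and escape_nulls(S) gives the text Hey\0Jude.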
Karl Nicoll answered Sep 19 '22 23:09


So the constructor copies the null terminator as well, but does not increment the length?

As you already know, the contents of a std::string do not include the null character, and the constructor does not copy the null character from the literal here.

The point is that you're using std::basic_string::operator[]. Since C++11, std::basic_string::operator[] returns a null character when the specified index is equal to size().

If pos == size(), a reference to the character with value CharT() (the null character) is returned.

For the first (non-const) version, the behavior is undefined if this character is modified to any value other than CharT().
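As a rough illustration of the rule quoted above, the sketch below contrasts operator[] with at() for the index size(). It assumes C++11 or later, and read_one_past_end and at_throws_at_size are hypothetical helper names. Unlike operator[], at() is bounds-checked and treats size() as out of range.

```cpp
#include <cassert>   // used by the checks in the usage note
#include <stdexcept>
#include <string>

// Hypothetical helper: reads index size() through the const operator[],
// which is defined to return a reference to CharT(), i.e. '\0' for char.
char read_one_past_end(const std::string& s) {
    return s[s.size()];
}

// Hypothetical helper: true if at(size()) throws std::out_of_range,
// which it is required to do because at() only accepts pos < size().
bool at_throws_at_size(const std::string& s) {
    try {
        (void)s.at(s.size());
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

For std::string S("Hey"), read_one_past_end(S) is '\0' and at_throws_at_size(S) is true, which is why the loop in the question stops cleanly when written with S[i] but would throw if written with S.at(i).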

songyuanyao answered Sep 20 '22 23:09