Use of null character in strings (C++)

I am brushing up on my C++ and stumbled across a curious behavior involving strings, character arrays, and the null character ('\0'). The following code:

#include <iostream>
#include <string>
using namespace std;

int main() {
    cout << "hello\0there"[6] << endl;

    char word [] = "hello\0there";
    cout << word[6] << endl;

    string word2 = "hello\0there";
    cout << word2[6] << endl;

    return 0;
}

produces the output:

> t
> t
>

What is going on behind the scenes? Why do the string literal and the declared char array hold the 't' at index 6 (after the embedded '\0'), while the declared string does not?

ewok asked Jul 20 '12 15:07



People also ask

What is the use of null in character array in C?

In C, a string is a one-dimensional array of characters terminated by a null character '\0'. A null-terminated string therefore consists of the characters that make up the string, followed by a null character.

What is null string in C programming?

"Null string" usually means an empty string: a char array whose first element is the null character '\0', so its length is zero. The array still exists in memory, so this is not the same as a NULL pointer, which does not point to any string at all.

Why do we use null character in string not in array?

Most string-manipulating functions rely on the null character ('\0', not the NULL pointer) to know where the string ends. Given a char array without a terminator, they keep going past the end of the array until they find a '\0' somewhere in memory, reading or corrupting whatever lies in between.
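
To make the point concrete, here is a minimal sketch of a bounded check for a terminator before a buffer is handed to a C-string function. The helper name has_terminator is made up for this illustration; std::memchr is the standard library call doing the work.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if the first n bytes of buf contain a '\0'.
// Unlike std::strlen, std::memchr never reads beyond the n bytes it is given.
bool has_terminator(const char* buf, std::size_t n) {
    return std::memchr(buf, '\0', n) != nullptr;
}
```

An array such as `const char raw[] = {'a', 'b', 'c'};` has no terminator, so `has_terminator(raw, sizeof raw)` returns false, while `const char ok[] = "abc";` includes one and returns true.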


3 Answers

From what I remember, the first two are in essence just arrays. Printing a C string means printing characters until a \0 is encountered, but that is not what happens in the first two examples: there you index a single character at offset 6 (the seventh character), which is t, so the \0 at offset 5 never comes into play.

What happens with the string class is that it copies the characters into its own internal buffer, starting at the beginning of the array and stopping at the first \0 it finds. Thus the t is not stored because it comes after the first \0.
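
The difference can be checked by comparing sizes. This is only a sketch; the three function names are made up for illustration, and the literal is the one from the question.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Chars stored in the array, counting both the embedded and the trailing '\0'.
std::size_t array_size() {
    char word[] = "hello\0there";
    return sizeof(word);
}

// Length as seen by C-string functions, which stop at the first '\0'.
std::size_t c_string_length() {
    char word[] = "hello\0there";
    return std::strlen(word);
}

// Length of a std::string built from a const char*: it also stops at the first '\0'.
std::size_t std_string_length() {
    std::string word2 = "hello\0there";
    return word2.size();
}
```

The array holds 12 chars (5 + 1 + 5 + 1), while both std::strlen and the std::string report a length of 5.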

sean answered Oct 20 '22 17:10



Because the std::string constructor that takes a const char* treats its argument as a C-style string. It simply copies from it until it hits a null-terminator, then stops copying.

So your last example is actually invoking undefined behaviour; word2[6] goes past the end of the string.
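
If you want an out-of-range index to be reported instead of silently invoking undefined behaviour, one option is the bounds-checked at() member, which throws std::out_of_range. A minimal sketch; the helper name index_is_out_of_range is made up for illustration:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true if reading index pos of s via at() throws.
// at() throws std::out_of_range whenever pos >= s.size().
bool index_is_out_of_range(const std::string& s, std::size_t pos) {
    try {
        (void)s.at(pos);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With `std::string word2 = "hello\0there";` the string has size 5, so index 4 is in range and index 6 is not.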

Oliver Charlesworth answered Oct 20 '22 16:10



You are constructing a string from a char* (or something that decayed to one). This means that the convention for C-strings applies: they are '\0'-terminated. That's why word2 only contains "hello".
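
If the embedded '\0' should survive, the length has to be passed explicitly so that nothing relies on the terminator. A minimal sketch using the std::string constructor that takes a pointer and a count; the helper name from_array is made up for illustration:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a std::string from a char array, keeping embedded '\0' chars.
// N counts the trailing terminator, so N - 1 characters are copied.
template <std::size_t N>
std::string from_array(const char (&arr)[N]) {
    return std::string(arr, N - 1);
}
```

`from_array("hello\0there")` yields a string of size 11 whose element at index 6 is 't'. In C++14 and later, the `s` literal suffix from `std::string_literals` is another way to get the same result.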

RedX answered Oct 20 '22 17:10
