
How does the std::string constructor handle char[] of fixed size?

Tags: c++, string, c++11

How does the string constructor handle char[] of a fixed size when the actual sequence of characters in that char[] could be smaller than the maximum size?

char foo[64];//can hold up to 64
const char* bar = "0123456789"; //Much less than 64 chars, terminated with '\0'
strcpy(foo,bar); //Copy shorter into longer
std::string banz(foo);//Make a large string

In this example, will the size of the string held by banz be based on the length of the original char* string or on the size of the char[] that it is copied into?

hehe3301 asked Dec 14 '22 14:12


1 Answer

First you have to remember (or know) that char strings in C++ are really called null-terminated byte strings. The null terminator is a special character ('\0') that marks the end of the string.

The second thing you have to remember (or know) is that arrays naturally decay to pointers to the array's first element. In the case of foo from your example, when you use foo the compiler really does &foo[0].
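The decay described above can be checked with a small sketch. The helper name decays_to_first_element is hypothetical and only serves the illustration:

```cpp
#include <type_traits>

// At the type level, char[64] decays to char*.
static_assert(std::is_same<std::decay<char[64]>::type, char*>::value,
              "char[64] decays to char*");

// Hypothetical helper: checks that using the array name in a pointer
// context yields the address of its first element.
inline bool decays_to_first_element() {
    char foo[64] = {};
    const char* p = foo;   // implicit array-to-pointer conversion
    return p == &foo[0];   // same address as the first element
}
```

Once the array has decayed, the pointer carries no information about the 64-element size, which is why the constructor cannot use it.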

Finally, if we look at e.g. this std::string constructor reference you will see that there is an overload (number 5) that accepts a const CharT* (with CharT being a char for normal char strings).

Putting it all together, with

std::string banz(foo);

you pass a pointer to the first character of foo, and the std::string constructor will treat it as a null-terminated byte string. It determines the length of the string by searching for the null terminator. The actual size of the array is irrelevant and not used.
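A minimal sketch of this, reusing the names from the question (the wrapper function length_from_terminator is a hypothetical helper added for illustration):

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: builds the string as in the question and
// returns the length the constructor worked out.
inline std::size_t length_from_terminator() {
    char foo[64];                    // can hold up to 64 chars
    const char* bar = "0123456789";  // 10 chars plus '\0'
    std::strcpy(foo, bar);           // copies the 10 chars and the terminator
    std::string banz(foo);           // reads only up to the first '\0'
    return banz.size();
}
```

Calling length_from_terminator() should return 10, not 64, because only the bytes up to the terminator are read.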

If you want to set the size of the std::string object, you need to explicitly do it by passing a length argument (variant 4 in the constructor reference):

std::string banz(foo, sizeof foo);

This will ignore the null terminator and set the length of banz to the size of the array. Note that the null terminator will still be stored in the string, so if you pass a pointer (as retrieved by e.g. the c_str function) to a function which expects a null-terminated string, the string will seem shorter than its size. Also note that the data after the null terminator will be uninitialized and have indeterminate contents. You must initialize that data before you use it, otherwise you will have undefined behavior (in C++, even reading indeterminate data is UB).


As mentioned in a comment from MSalters, the UB from reading uninitialized and indeterminate data also goes for the construction of the banz object using an explicit size. It will typically work and not lead to any problems, but it does break the rules set out in the C++ specification.

Fixing it is easy though:

char foo[64] = { 0 };//can hold up to 64

The above will initialize all of the array to zero. The following strcpy call will not touch the data of the array beyond the terminator, and as such the remainder of the array will stay zero-initialized.
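As a rough sketch of the combined result, assuming the zero-initialized array and the explicit-size constructor from above (the function name make_padded is hypothetical):

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: zero-initializes the array, copies the short
// string into it, then builds a std::string from the whole array.
inline std::string make_padded() {
    char foo[64] = { 0 };                 // every element starts as '\0'
    std::strcpy(foo, "0123456789");       // touches only the first 11 bytes
    return std::string(foo, sizeof foo);  // explicit length: all 64 bytes
}
```

With this, make_padded().size() should be 64, while std::strlen on its c_str() still reports 10, since the first embedded '\0' ends the C-style view of the string.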

Some programmer dude answered Dec 25 '22 22:12