
Will C++11 std::string::operator[] return a null-terminated buffer?

Tags: c++, c++11

I have an object of the std::string class that I need to pass to a C function. The function operates on a char* buffer by iterating over it and searching for the null terminator.

So, I have something like this:

// C function
void foo(char* buf);

// C++ code
std::string str("str");
foo(&str[0]);

Suppose that we use C++11, so we have a guarantee that std::string stores its characters contiguously.

But I wonder: is there any guarantee that &str[0] will point to a buffer that ends with \0? Yes, there's the c_str member function, but I'm asking about operator[].

Can somebody quote the standard please?

asked Aug 10 '16 12:08 by FrozenHeart




1 Answer

In practice, yes. There are exactly zero implementations of std::string that are standards-conforming that do not store a NUL character at the end of the buffer.

So unless you are wondering for wondering's sake, you are done.
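As a minimal sketch of that practical case: count_until_nul below is a hypothetical stand-in for the C function foo from the question (it is not from the original post), and length_seen_by_c is an illustrative helper that hands it the string's own buffer through operator[]. The empty-string guard is an extra precaution, not something the answer requires.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the question's C function `foo`:
// walks the buffer until it reaches the terminating '\0'
// and returns how many characters it saw before it.
std::size_t count_until_nul(char* buf) {
    std::size_t n = 0;
    while (buf[n] != '\0') {
        ++n;
    }
    return n;
}

// Illustrative helper: passes the string's own storage to the
// C-style function via &s[0], as in the question.
std::size_t length_seen_by_c(std::string& s) {
    if (s.empty()) {
        return 0;  // precaution: do not depend on s[0] for an empty string
    }
    return count_until_nul(&s[0]);
}
```

On the mainstream implementations the answer refers to, calling length_seen_by_c on a string holding "str" should report the same length as size(), provided the string has no embedded NUL characters.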

However, if you are wondering about the standard being abstruse:


In C++14, yes. There is a clear requirement that [] refer to a contiguous set of elements, that [size()] return a NUL character, and that const methods not modify state. So *((&str[0])+str.size()) must be the same as str[str.size()], and str[str.size()] must be a NUL, thus game over.
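The chain of reasoning above can be written down as a small check. This is only a sketch: the function name terminator_reachable_from_first_element is made up for illustration, and the check describes what the argument predicts on a conforming implementation rather than proving anything about the standard.

```cpp
#include <cassert>
#include <string>

// Returns true if stepping size() elements from &s[0] lands on the
// same object that s[s.size()] refers to, and that object is NUL.
bool terminator_reachable_from_first_element(const std::string& s) {
    const char* first = &s[0];              // start of the contiguous storage
    const char* past_last = first + s.size();
    return past_last == &s[s.size()]        // same address as [size()]
        && *past_last == '\0';              // and it holds the terminator
}
```

If the reading above is right, this returns true for any std::string, including an empty one and one long enough to bypass a small-string buffer.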


In C++11, almost certainly. There are rules that const methods may not modify state. There are guarantees that data() and c_str() return a null-terminated buffer that agrees with [] at each point.

A convoluted reading of the C++11 standard would state that, prior to any call of data() or c_str(), [size()] doesn't return the NUL terminator at the end of the buffer but rather a static const CharT that is stored separately, and the buffer has an uninitialized value (or even a trap value) where the NUL should be. Due to the requirement that const methods not modify state, I believe this reading is incorrect.

That reading requires &str[str.size()] to change across a call to .data(), which is an observable change in the string's state over a const call, and I would read that as illegal.

An alternative way to get around the standard might be to not initialize str[str.size()] until you legally access it via calling .data(), .c_str() or actually passing str.size() to operator[]. As there are no defined ways to access that element other than those 3 in the standard, you could stretch things and say lazy initialization of the NUL is legal.

I'd question this. The definition of .data() implies that the elements returned by [] are contiguous, so &str[0] is the same address as .data(). Since .data()+.size() is guaranteed to point to a NUL CharT, so must (&str[0])+.size(); and with no non-const methods called in between, the state of the std::string may not change between the calls.

But what if the compiler can see that you never call .data() or .c_str()? Does the requirement of contiguity still hold if it can be proven that you never call them?
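A small sketch of the agreement between [] and .data() that this argument leans on. The helper name buffer_agrees_with_data is hypothetical, and it uses only const member functions so that, under the argument above, the string's state cannot change between the calls.

```cpp
#include <cassert>
#include <string>

// Checks, using const access only, that operator[] and data() expose
// the same buffer and that the buffer ends in a NUL either way.
bool buffer_agrees_with_data(const std::string& s) {
    const char* via_index = &s[0];       // address obtained through operator[]
    const char* via_data = s.data();     // address obtained through data()
    return via_index == via_data
        && via_data[s.size()] == '\0'    // guaranteed for data() since C++11
        && via_index[s.size()] == '\0';  // what the question asks about
}
```

Note that the first access goes through operator[] before data() is ever called, which is the ordering the lazy-initialization reading would need to exploit.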

At which point I'd throw my hands up and shoot the hostile compiler.


The standard is very passively voiced about this. So there may be a way to make an arguably standards-conforming std::string that doesn't follow these rules. But because the guarantees get closer and closer to explicitly requiring that NUL terminator there, the odds of a new compiler showing up that uses a tortured reading of C++ to claim this is standards compliant are low.

answered Sep 23 '22 16:09 by Yakk - Adam Nevraumont