
Does std::string need to store its characters in a contiguous piece of memory?

I know that in C++98, neither std::basic_string<> nor std::vector<> were required to use contiguous storage. This was seen as an oversight for std::vector<> as soon as it was pointed out, and, if I remember correctly, got fixed with C++03.

I seem to remember reading about discussions on requiring std::basic_string<> to use contiguous storage back when C++11 was still called C++0x, but I didn't follow them closely at the time and am still restricted to C++03 at work, so I am not sure what became of it.

So is std::basic_string<> required to use contiguous storage? (If so, then which version of the standard required it first?)

In case you wonder: This is important if you have code passing the result of &str[0] to a function expecting a contiguous piece of memory to write to. (I know about str.data(), but for obvious reasons old code doesn't use it.)

sbi asked Oct 14 '15 11:10




3 Answers

From the C++11 standard, 21.4.1/5 [string.require]:

The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().
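As a rough illustration of what this guarantee permits, the sketch below checks the quoted identity and then lets a C-style function write into a string through &s[0]. The names fill_buffer, is_contiguous, and make_filled are made up for this example and are not part of any library; the code assumes a C++11 or later implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical legacy-style function: writes `len` bytes into a raw buffer.
void fill_buffer(char* dst, std::size_t len, char value) {
    std::memset(dst, value, len);
}

// Returns true if the identity from 21.4.1/5 holds for every 0 <= n < s.size().
bool is_contiguous(const std::string& s) {
    typedef std::string::difference_type diff_t;
    for (std::size_t n = 0; n < s.size(); ++n) {
        if (&*(s.begin() + static_cast<diff_t>(n)) != &*s.begin() + n) {
            return false;
        }
    }
    return true;
}

// Sizes the string first, then lets the C-style function write into it.
std::string make_filled(std::size_t len, char value) {
    std::string s(len, '\0');
    if (!s.empty()) {
        // Relies on the C++11 contiguity guarantee; only size() bytes are written.
        fill_buffer(&s[0], s.size(), value);
    }
    assert(s.size() == len);
    return s;
}
```

The string is resized before the call because the callee may only write to the first size() characters; writing past that range would still be undefined behavior.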

Rahul Tripathi answered Oct 16 '22 13:10



In C++03 there was no guarantee that the elements of the string are stored contiguously. [basic.string] read:

  1. For a char-like type charT, the class template basic_string describes objects that can store a sequence consisting of a varying number of arbitrary char-like objects (clause 21). The first element of the sequence is at position zero. Such a sequence is also called a “string” if the given char-like type is clear from context. In the rest of this clause, charT denotes such a given char-like type. Storage for the string is allocated and freed as necessary by the member functions of class basic_string, via the Allocator class passed as template parameter. Allocator::value_type shall be the same as charT.
  2. The class template basic_string conforms to the requirements of a Sequence, as specified in (23.1.1). Additionally, because the iterators supported by basic_string are random access iterators (24.1.5), basic_string conforms to the requirements of a Reversible Container, as specified in (23.1).
  3. In all cases, size() <= capacity().

And then in C++17 they changed it to:

  1. The class template basic_string describes objects that can store a sequence consisting of a varying number of arbitrary char-like objects with the first element of the sequence at position zero. Such a sequence is also called a “string” if the type of the char-like objects that it holds is clear from context. In the rest of this Clause, the type of the char-like objects held in a basic_string object is designated by charT.
  2. The member functions of basic_string use an object of the Allocator class passed as a template parameter to allocate and free storage for the contained char-like objects.
  3. A basic_string is a contiguous container (23.2.1).
  4. In all cases, size() <= capacity().

(Emphasis mine, on paragraph 3.)

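So in C++03 it was not guaranteed. C++11 already required the characters to be stored contiguously (21.4.1/5), and C++17 states it directly by making basic_string a contiguous container.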

Given the constraints that std::string::data imposes, the missing guarantee in C++03 is almost moot: calling std::string::data gives you a contiguous array of the characters in the string. So unless the implementation builds that array on demand and in constant time, the string has to be stored contiguously anyway.
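To make the data() argument concrete, here is a small sketch that treats the result of data() as one contiguous array and checks that it agrees with operator[]. The helper names copy_out and check_data_matches_index are invented for this example, and the code assumes C++11 or later (it uses std::vector::data).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Copies the characters out through data(), treating them as one array.
std::vector<char> copy_out(const std::string& s) {
    std::vector<char> out(s.size());
    if (!s.empty()) {
        std::memcpy(out.data(), s.data(), s.size());
    }
    return out;
}

// Asserts that data() + i and &s[i] name the same object for every index.
void check_data_matches_index(const std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        assert(s.data() + i == &s[i]);
    }
}
```

If the check holds on a given implementation, reading through data() and reading through &s[0] are interchangeable there.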


In case you wonder: This is important if you have code passing the result of &str[0] to a function expecting a contiguous piece of memory to write to. (I know about str.data(), but for obvious reasons old code doesn't use it.)

The behavior of operator[] has changed as well. In C++03 we had

Returns: If pos < size(), returns data()[pos]. Otherwise, if pos == size(), the const version returns charT(). Otherwise, the behavior is undefined.

So only the const version was guaranteed to have defined behavior if you tried &s[0] when s is empty. In C++11 they changed it to:

Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior.

So now both the const and non-const versions have defined behavior if you take &s[0] when s is empty.
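For code that still has to build as C++03, one possible way to sidestep the empty-string case is to guard the call, as in the sketch below. The helpers writable_buffer and to_upper_ascii are hypothetical names for this example only, and the approach still assumes the implementation stores the characters contiguously.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns a writable pointer to the characters, or a null pointer when the
// string is empty, so the non-const s[0] is never evaluated at pos == size().
char* writable_buffer(std::string& s) {
    if (s.empty()) {
        return NULL;
    }
    return &s[0];
}

// Hypothetical caller: upper-cases ASCII letters in place via the raw pointer.
void to_upper_ascii(std::string& s) {
    char* p = writable_buffer(s);
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (p[i] >= 'a' && p[i] <= 'z') {
            p[i] = static_cast<char>(p[i] - 'a' + 'A');
        }
    }
}
```

When the string is empty the loop body never runs, so the null pointer is never dereferenced.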

NathanOliver answered Oct 16 '22 12:10



According to the draft standard N4527 21.4/3 Class template basic_string [basic.string] :

A basic_string is a contiguous container (23.2.1).

101010 answered Oct 16 '22 12:10
