
Why Doesn't string::data() Provide a Mutable char*?

In C++11, array, string, and vector all got the data method, which:

Returns pointer to the underlying array serving as element storage. The pointer is such that range [data(); data() + size()) is always a valid range, even if the container is empty. [Source]

This method is provided in a mutable and const version for all applicable containers, for example:

T* vector<T>::data();
const T* vector<T>::data() const;

All applicable containers, that is, except string, which only provides the const version:

const char* string::data() const; 

What happened here? Why did string get shortchanged, when char* string::data() would be so helpful?
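To make the asymmetry concrete, here is a minimal sketch that compiles under C++11 and later. The helper names `set_at` and `count_char` are hypothetical and exist only for illustration. The vector helper writes through the mutable pointer, while the string helper can only read; before C++17, assigning `s.data()` to a plain `char*` would not compile even for a non-const string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: writes through the mutable pointer from vector<T>::data().
inline void set_at(std::vector<char>& v, std::size_t i, char c) {
    assert(i < v.size());
    char* p = v.data();  // mutable overload, available since C++11
    p[i] = c;
}

// Hypothetical helper: reads through string::data(), which yields const char* here.
inline std::size_t count_char(const std::string& s, char c) {
    const char* p = s.data();  // read-only view of the characters
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (p[i] == c) ++n;
    }
    return n;
}
```

For example, `count_char("banana", 'a')` returns 3, and `set_at(v, 1, 'z')` overwrites the second element of a `std::vector<char>` in place.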

Jonathan Mee asked Dec 08 '15 12:12




1 Answer

The short answer is that C++17 does provide the char* string::data() method. This is vital for the free function data, which is also new in C++17. To gain mutable access to the underlying C-string I can now do this:

auto foo = "lorem ipsum"s;
for(auto i = data(foo); *i != '\0'; ++i) ++(*i);
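A self-contained sketch of the same idea follows. It assumes a C++17 compiler, and the helper name `shift_each_char` is hypothetical. It bounds the loop by size() rather than by the terminator, so a string containing embedded '\0' characters is processed in full.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>     // std::data (C++17)
#include <string>
#include <type_traits>
#include <utility>      // std::declval

// Requires C++17: on a non-const std::string, std::data yields a mutable char*.
static_assert(std::is_same_v<decltype(std::data(std::declval<std::string&>())), char*>,
              "std::data(std::string&) should return char* in C++17");

// Hypothetical helper: increments every character in place through the
// mutable pointer, bounded by size() so embedded '\0' characters are handled.
inline std::string shift_each_char(std::string s) {
    char* p = std::data(s);
    for (std::size_t i = 0; i < s.size(); ++i) {
        ++p[i];
    }
    assert(p[s.size()] == '\0');  // the terminator itself is left untouched
    return s;
}
```

With ASCII input, `shift_each_char("lorem ipsum")` yields `"mpsfn!jqtvn"`, the same result the loop above produces for `foo`.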

For historical purposes it's worth chronicling string's development, which C++17 builds upon. In C++11, access to string's underlying buffer is made possible by a new requirement that its elements are stored contiguously, such that for any given string s:

&*(s.begin() + n) == &*s.begin() + n for any n in [0, s.size()), or, equivalently, a pointer to s[0] can be passed to functions that expect a pointer to the first element of a CharT[] array.
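The quoted identity can be checked directly. This is only a sketch: the helper names `element_at_offset_matches` and `is_contiguous` are hypothetical, and on any conforming C++11 implementation the check should always succeed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: checks the quoted identity for one offset n in [0, s.size()).
inline bool element_at_offset_matches(const std::string& s, std::size_t n) {
    assert(n < s.size());
    const auto off = static_cast<std::string::difference_type>(n);
    return &*(s.begin() + off) == &*s.begin() + off;
}

// Hypothetical helper: true if the identity holds for every valid offset.
// An empty string has no valid offsets, so no iterator is dereferenced.
inline bool is_contiguous(const std::string& s) {
    for (std::size_t n = 0; n < s.size(); ++n) {
        if (!element_at_offset_matches(s, n)) return false;
    }
    return true;
}
```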

Mutable access to this newly required underlying C-string was obtainable by various methods, for example: &s.front(), &s[0], or &*s.begin(). But back to the original question, whose answer would avoid the burden of using one of these options: why hasn't access to string's underlying buffer been provided in the form of char* string::data()?
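As an illustration of the &s[0] workaround, here is a minimal sketch that lets a C formatting function write straight into a string's buffer under C++11. The helper name `format_int` and the 32-byte capacity are assumptions made for the example, not anything taken from the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats an int by letting snprintf write into the
// string's own storage via &buf[0], the pre-C++17 route to a mutable char*.
inline std::string format_int(int value) {
    std::string buf(32, '\0');  // assumed ample for any int in decimal
    const int n = std::snprintf(&buf[0], buf.size(), "%d", value);
    if (n < 0) return std::string();  // encoding error reported by snprintf
    assert(static_cast<std::size_t>(n) < buf.size());
    buf.resize(static_cast<std::size_t>(n));  // drop the unused tail
    return buf;
}
```

Because snprintf writes at most buf.size() bytes including its terminator, every write stays inside [0, size()), and `format_int(42)` returns `"42"`.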

To answer that, it is important to note that T* array<T>::data() and T* vector<T>::data() were additions required by C++11. C++11 imposed no such requirement on other containers such as deque, whose storage is not contiguous. And there certainly wasn't an additional requirement for string; in fact, the requirement that string be contiguous was itself new to C++11. Before this, const char* string::data() had existed. It was explicitly not guaranteed to point to any underlying buffer; alongside c_str(), it was simply a way to obtain a const char* from a string, and before C++11:

The returned array is not required to be null-terminated.

This means that string was not "shortchanged" in C++11's transition to data accessors; it simply was not included, so only the const data accessor that string previously possessed persisted. There are naturally occurring examples in C++11 code that necessitate writing directly to the underlying buffer of a string.
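One such situation, offered here only as an illustrative sketch and not necessarily the case the answer had in mind, is reading a block of bytes from a stream straight into a string. The helper name `read_up_to` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>  // only needed for the usage shown below
#include <string>

// Hypothetical helper: reads at most `count` bytes from `in` directly into the
// string's buffer, then trims the string to the number of bytes actually read.
inline std::string read_up_to(std::istream& in, std::size_t count) {
    std::string out(count, '\0');
    if (count == 0) return out;
    in.read(&out[0], static_cast<std::streamsize>(count));  // C++11-style mutable access
    const std::size_t got = static_cast<std::size_t>(in.gcount());
    assert(got <= count);
    out.resize(got);
    return out;
}
```

Given `std::istringstream in("hello world")`, a call to `read_up_to(in, 5)` returns `"hello"`, and a following `read_up_to(in, 100)` returns the remaining `" world"`.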

Jonathan Mee answered Sep 20 '22 15:09