In C++11, array, string, and vector all got the data method which:
Returns pointer to the underlying array serving as element storage. The pointer is such that range [data(); data() + size()) is always a valid range, even if the container is empty. [Source]
This method is provided in a mutable and a const version for all applicable containers, for example:
T* vector<T>::data(); const T* vector<T>::data() const;
All applicable containers, that is, except string, which only provides the const version:
const char* string::data() const;
What happened here? Why did string get shortchanged, when char* string::data() would be so helpful?
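To make the asymmetry concrete, here is a minimal sketch. The helper name fill_through_data is made up for illustration. The static_asserts only state what holds in every standard from C++11 on: a non-const vector hands out a mutable pointer, while a const string hands out a const one. Under C++11 and C++14 a non-const string also only yields const char*, which is the complaint above.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// A non-const vector<char> exposes a mutable char* through data().
static_assert(
    std::is_same<decltype(std::declval<std::vector<char>&>().data()), char*>::value,
    "vector<char>::data() should return char*");

// A const string exposes const char* through data() in every standard.
static_assert(
    std::is_same<decltype(std::declval<const std::string&>().data()), const char*>::value,
    "const string::data() should return const char*");

// Hypothetical helper: overwrite every element by writing through data().
void fill_through_data(std::vector<char>& v, char c) {
    char* p = v.data();
    for (std::size_t i = 0; i < v.size(); ++i) {
        p[i] = c;
    }
}
```

Doing the same with a string in C++11 would require one of the workarounds discussed further down, since data() alone does not give write access there.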
Strings in C++ are mutable, but with great power comes great responsibility: you will get undefined behavior if you read from or store to string memory that is out of bounds.
The std::string type is the main string datatype in standard C++ since 1998, but it was not always part of C++. From C, C++ inherited the convention of using null-terminated strings that are handled by a pointer to their first element, and a library of functions that manipulate such strings.
The short answer is that C++17 does provide the char* string::data() method. It is vital for the std::data free function, also new in C++17, so to gain mutable access to the underlying C-string I can now do this:
auto foo = "lorem ipsum"s; for(auto i = data(foo); *i != '\0'; ++i) ++(*i);
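For anyone who wants to compile that one-liner, a self-contained version might look like the sketch below. It assumes a C++17 compiler, and the wrapper function shift_chars is a name invented here, not part of any library. Note that the loop stops at the first '\0', so a string with embedded null characters would only be partly processed.

```cpp
#include <iterator>  // std::data (C++17)
#include <string>

// Hypothetical helper: increments every character of a copy of the input,
// writing through the mutable pointer that std::data returns in C++17.
std::string shift_chars(std::string s) {
    // std::data(s) calls s.data(), which is char* for a non-const string in C++17.
    for (char* p = std::data(s); *p != '\0'; ++p) {
        ++(*p);
    }
    return s;
}
```

Calling shift_chars("lorem ipsum") returns "mpsfn!jqtvn", each character moved up by one.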
For historical purposes it's worth chronicling string's development, which C++17 builds upon. In C++11, access to string's underlying buffer is made possible by a new requirement that its elements are stored contiguously, such that for any given string s:
&*(s.begin() + n) == &*s.begin() + n
for any n in [0, s.size()), or, equivalently, a pointer to s[0] can be passed to functions that expect a pointer to the first element of a CharT[] array.
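The contiguity identity can be checked directly, and the "equivalently" part can be exercised by handing a pointer to s[0] to a C function. This is only a sketch: is_contiguous and c_length are hypothetical helper names, and on a conforming C++11 implementation the check should always succeed.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: verifies &*(s.begin() + n) == &*s.begin() + n
// for every n in [0, s.size()).
bool is_contiguous(const std::string& s) {
    for (std::size_t n = 0; n < s.size(); ++n) {
        const auto offset = static_cast<std::string::difference_type>(n);
        if (&*(s.begin() + offset) != &*s.begin() + n) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: passes a pointer to s[0] to a function that expects
// a pointer to the first element of a char array (here std::strlen).
std::size_t c_length(const std::string& s) {
    // For an empty string, s[0] refers to the terminating null character.
    return std::strlen(&s[0]);
}
```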
Mutable access to this newly required underlying C-string was obtainable by various methods, for example: &s.front(), &s[0], or &*s.begin() (the first and last of these require a non-empty string).
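A short sketch of those C++11-era workarounds, using &s.front(), &s[0], and an iterator dereference &*s.begin(). The helper names same_first_element and overwrite_first are made up for this example. The empty-string guard is there because front() and dereferencing begin() are undefined on an empty string.

```cpp
#include <string>

// Hypothetical helper: checks that the three expressions yield the same
// mutable pointer to the first character.
bool same_first_element(std::string& s) {
    if (s.empty()) {
        return true;  // front() and *begin() must not be used on an empty string
    }
    char* a = &s.front();
    char* b = &s[0];
    char* c = &*s.begin();
    return a == b && b == c;
}

// Hypothetical helper: writes through the pointer obtained from &s[0].
void overwrite_first(std::string& s, char c) {
    if (!s.empty()) {
        char* p = &s[0];
        *p = c;
    }
}
```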
But back to the original question, which would avoid the burden of using one of these options: why hasn't access to string's underlying buffer been provided in the form of char* string::data()?
To answer that, it is important to note that T* array<T>::data() and T* vector<T>::data() were additions required by C++11. No such requirement was placed on other sequence containers such as deque, which is not contiguous and has no data member. And there certainly wasn't an additional requirement for string; in fact, the requirement that string be contiguous was itself new to C++11. Before this, const char* string::data() had already existed. Though it was explicitly not guaranteed to point to any underlying buffer, it was, alongside c_str(), a way to obtain a const char* from a string, and unlike c_str() it came with this caveat:
The returned array is not required to be null-terminated.
This means that string was not "shortchanged" in C++11's transition to data accessors; it simply was not included, so only the const data accessor that string previously possessed persisted. There are naturally occurring examples in C++11 code which necessitate writing directly to the underlying buffer of a string.
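One such case, sketched under assumptions: a C-style API that fills a caller-supplied char buffer. std::snprintf is used here as a stand-in for that kind of API, and format_int is a hypothetical helper, not something from the standard library. The buffer size passed to snprintf is buf.size(), so the terminator it writes stays inside [0, size()) and nothing is written to s[size()].

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: lets snprintf write straight into a string's buffer,
// obtaining the mutable pointer via &buf[0] as C++11 requires.
std::string format_int(int value) {
    std::string buf(32, '\0');  // comfortably larger than any formatted int
    const int written = std::snprintf(&buf[0], buf.size(), "%d", value);
    if (written < 0) {
        return std::string();  // encoding error reported by snprintf
    }
    buf.resize(static_cast<std::size_t>(written));  // drop the unused tail
    return buf;
}
```

With C++17 the &buf[0] could be replaced by buf.data(), which is exactly the convenience the question was asking for.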