After C++11, I thought of c_str() and data() as equivalent.
C++17 introduces an overload for the latter that returns a non-const pointer (reference; I am not sure whether it is fully up to date with C++17):
const CharT* data() const; (1)
CharT* data(); (2) (since C++17)
c_str() still returns only a const pointer:
const CharT* c_str() const;
Why were these two methods differentiated in C++17, especially when C++11 was the one that made them homogeneous? In other words, why did only one method get an overload while the other didn't?
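The difference between the two signatures can be checked at compile time. This is only a minimal sketch, assuming a C++17 compiler; the helper name signatures_differ is made up for illustration.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: true if, on a non-const std::string, data() yields
// char* while c_str() still yields const char*, and data() on a const
// std::string yields const char* (the C++17 overload set).
constexpr bool signatures_differ() {
    using S = std::string;
    return std::is_same_v<decltype(std::declval<S&>().data()), char*> &&
           std::is_same_v<decltype(std::declval<S&>().c_str()), const char*> &&
           std::is_same_v<decltype(std::declval<const S&>().data()), const char*>;
}

// Fails to compile before C++17, where data() had only the const overload.
static_assert(signatures_differ(), "expected the C++17 overload set");
```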
The function c_str() returns a const pointer to a null-terminated C string with the same contents as the current string. Because the pointer is to const data, the characters cannot be modified through it.
The c_str() method exposes the string's contents as an array of characters with a null character at the end. It takes no parameters and returns a pointer to this character array (also called a C string).
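As a rough illustration of the description above, the following sketch checks that c_str() and the const data() refer to the same null-terminated buffer, which has been guaranteed since C++11. The helper name same_buffer is hypothetical.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: true if c_str() and data() point at the same
// buffer and that buffer is terminated by '\0' at index size().
bool same_buffer(const std::string& s) {
    const char* p = s.c_str();
    return p == s.data()
        && p[s.size()] == '\0'
        // strlen stops at the first '\0', so it can be shorter than size()
        // only when the string contains embedded null characters.
        && std::strlen(p) <= s.size();
}
```

Calling same_buffer on an empty string also works, because c_str() always returns a valid pointer to at least the terminator.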
The new overload was added by P0272R1 for C++17. Neither the paper itself nor the links therein discuss why only data was given a new overload but c_str was not. We can only speculate at this point (unless people involved in the discussion chime in), but I'd like to offer the following points for consideration:
Even just adding the overload to data broke some code; keeping this change conservative was a way to minimize negative impact.
The c_str function had so far been entirely identical to data and is effectively a "legacy" facility for interfacing with code that takes a "C string", i.e. an immutable, null-terminated char array. Since you can always replace c_str with data, there's no particular reason to add to this legacy interface.
I realize that the very motivation for P0272R1 was that there do exist legacy APIs that, erroneously or for C reasons, take only mutable pointers even though they don't mutate. All the same, I suppose we don't want to add more to string's already massive API than absolutely necessary.
One more point: as of C++17 you are now allowed to write to the null terminator, as long as you write the value zero. (Previously, it used to be UB to write anything to the null terminator.) A mutable c_str would create yet another entry point into this particular subtlety, and the fewer subtleties we have, the better.
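To make the last two points concrete, here is a small sketch, assuming C++17, that modifies a string through the non-const data() pointer and then writes zero to the terminator via operator[]. The function name upper_in_place is invented for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: upper-cases ASCII letters in place through the
// mutable pointer returned by the non-const data() overload (C++17).
void upper_in_place(std::string& s) {
    char* p = s.data();  // char* only since C++17; c_str() stays const char*
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (p[i] >= 'a' && p[i] <= 'z') {
            p[i] = static_cast<char>(p[i] - 'a' + 'A');
        }
    }
    // Permitted as of C++17: writing the value zero to the terminator.
    // Writing any other value here would be undefined behavior.
    s[s.size()] = '\0';
}
```

The write to the terminator is a no-op here and only shows where the subtlety lives; ordinary code has no need for it.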
The reason why the data() member got an overload is explained in this paper at open-std.org.
TL;DR of the paper: The non-const .data() member function for std::string was added to improve uniformity in the standard library and to help C++ developers write correct code. It is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters.
Some relevant passages from the paper:
Abstract
Is std::string's lack of a non-const .data() member function an oversight or an intentional design based on pre-C++11 std::string semantics? In either case, this lack of functionality tempts developers to use unsafe alternatives in several legitimate scenarios. This paper argues for the addition of a non-const .data() member function for std::string to improve uniformity in the standard library and to help C++ developers write correct code.
Use Cases
C libraries occasionally include routines that have char * parameters. One example is the lpCommandLine parameter of the CreateProcess function in the Windows API. Because the data() member of std::string is const, it cannot be used to make std::string objects work with the lpCommandLine parameter. Developers are tempted to use .front() instead, as in the following example.

    std::string programName;
    // ...
    if( CreateProcess( NULL, &programName.front(), /* etc. */ ) ) {
        // etc.
    } else {
        // handle error
    }

Note that when programName is empty, the programName.front() expression causes undefined behavior. A temporary empty C-string fixes the bug.

    std::string programName;
    // ...
    if( !programName.empty() ) {
        char emptyString[] = {'\0'};
        if( CreateProcess( NULL, programName.empty() ? emptyString : &programName.front(), /* etc. */ ) ) {
            // etc.
        } else {
            // handle error
        }
    }

If there were a non-const .data() member, as there is with std::vector, the correct code would be straightforward.

    std::string programName;
    // ...
    if( !programName.empty() ) {
        char emptyString[] = {'\0'};
        if( CreateProcess( NULL, programName.data(), /* etc. */ ) ) {
            // etc.
        } else {
            // handle error
        }
    }

A non-const .data() std::string member function is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters. This is common in older codes and those that need to be portable with older C compilers.
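The use case from the paper can be sketched without the Windows API. In the example below, legacy_length is a hypothetical stand-in for a C-style function whose parameter lacks const qualification even though it only reads the string; call_legacy is likewise invented for illustration. C++17 is assumed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a legacy C-style API that takes a non-const
// char* even though it never writes through it.
std::size_t legacy_length(char* text) {
    std::size_t n = 0;
    while (text[n] != '\0') ++n;
    return n;
}

// Passes a std::string to the legacy function with no const_cast.
// Unlike &s.front(), s.data() is valid even when the string is empty.
std::size_t call_legacy(std::string& s) {
    return legacy_length(s.data());
}
```

Before C++17 the same call would need const_cast<char*>(s.c_str()) or &s[0], both of which are easier to get wrong.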