
Modifying underlying char array of a c++ string object

Tags: c++, arrays, string

My code is like this:

string s = "abc";
char* pc = const_cast<char*>( s.c_str() );
pc[ 1 ] = 'x';
cout << s << endl;

When I compiled the snippet above using GCC, I got the result "axc" as expected. My question is: is it safe and portable to modify the underlying char array of a C++ string in this way? Or are there alternative approaches to manipulating a string's data directly?

FYI, my intention is to write some pure C functions that can be called from both C and C++, so they can only accept char* arguments. I know that going from char* to string involves copying, and I would like to avoid that penalty. Could anybody suggest how to deal with this sort of situation?

Need4Steed asked Apr 20 '11 11:04


2 Answers

The obvious answer is no: writing through the pointer returned by c_str() is undefined behavior. On the other hand, if you do:

char* pc = &s[0];

you can access the underlying data. This works in practice today and is guaranteed in C++11.
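As a minimal sketch of the &s[0] approach, assuming a C++11 compiler: the helper name replace_via_pointer is hypothetical and not part of the original question. It writes one character through a pointer taken from the non-const operator[] instead of from c_str().

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): overwrite one character
// through a pointer obtained from &s[0] rather than from c_str().
std::string replace_via_pointer(std::string s, std::size_t index, char c) {
    if (index < s.size()) {   // bounds check; also rejects the empty string
        char* pc = &s[0];     // non-const operator[] yields a mutable char&
        pc[index] = c;        // no const_cast needed
    }
    return s;                 // the modified copy
}
```

Calling replace_via_pointer("abc", 1, 'x') yields "axc", the same result as the snippet in the question, but without casting away const.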

James Kanze answered Oct 16 '22 15:10


To the first part, c_str() returns const char* and it means what it says. All the const_cast achieves in this case is that your undefined behavior compiles.

To the second part, in C++0x std::string is guaranteed to have contiguous storage, just like std::vector in C++03. Therefore you could use &s[0] to get a char* to pass to your functions, as long as the string isn't empty. In practice, all string implementations currently in active development already have contiguous storage: there was a straw poll at a standards committee meeting and nobody offered a counter-example. So you can use this feature now if you like.
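A rough sketch of passing &s[0] to a C-callable function, with the empty-string guard mentioned above. Both upcase_ascii and upcase_in_place are hypothetical names introduced for illustration; in a real project the C function would probably live in its own .c file and take a size_t length.

```cpp
#include <cstddef>
#include <string>

// Hypothetical C-callable function (assumed for illustration): upper-cases
// ASCII letters in a buffer of known length, in place.
extern "C" void upcase_ascii(char* buf, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        if (buf[i] >= 'a' && buf[i] <= 'z') {
            buf[i] = static_cast<char>(buf[i] - 'a' + 'A');
        }
    }
}

// C++ side: hand the string's own storage to the C function, no copy made.
void upcase_in_place(std::string& s) {
    if (s.empty()) {
        return;                     // skip &s[0] on an empty string
    }
    upcase_ascii(&s[0], s.size());  // contiguous storage: pointer plus length
}
```

Passing the length explicitly keeps the C function from having to search for a terminator, and the function never changes the string's size.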

However, std::string uses a fundamentally different string format from C-style strings: it's data+length rather than nul-terminated. If you modify the string data from your C functions, you can't change the length of the string, and you can't be sure there's a nul byte at the end without c_str(). And std::string can contain embedded nuls which are part of the data, so even if you did find a nul, without knowing the length you still don't know that you've found the end of the string. You're very limited in what you can do in functions that must operate correctly on both kinds of data.
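To illustrate the embedded-nul point, here is a small sketch. The helper names make_with_embedded_nul and c_style_length are made up for this example. It builds a five-character string with a nul in the middle and measures it the way a C function relying on nul termination would.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: a five-character std::string whose third character is
// a nul. The explicit length stops the constructor from ending at the nul.
std::string make_with_embedded_nul() {
    return std::string("ab\0cd", 5);
}

// Length as seen by C code that relies on nul termination.
std::size_t c_style_length(const std::string& s) {
    return std::strlen(s.c_str());  // stops at the first nul it meets
}
```

For this string, size() reports 5 while c_style_length reports 2, so a C function that trusts the terminator would silently ignore the trailing "cd".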

Steve Jessop answered Oct 16 '22 15:10