Converting std::string ** to char *** and it happens to work. How?

Tags:

c++

Consider the following code:

std::vector<std::string> foo{{"blee"}, {"bleck"}, {"blah0000000000000000000000000000000000000000000000000000000000000000000000000000000000"}};
std::string *temp = foo.data();
char*** bar = reinterpret_cast<char***>(&temp);

for (size_t i = 0; i < foo.size(); ++i){
    std::cout << (*bar)[i] << std::endl;
}

Clearly this is sketchy code, but it happens to work.

http://ideone.com/2XAJYR

I would like to know why it works. Are there some strange rules of C++ I don't know about? Or is it just bad code and undefined behaviour?

I made one of the strings huge in case there was some small-string optimization going on.

Adapted from: Cast a vector of std::string to char***

asked Mar 13 '15 12:03

Neil Kirk

3 Answers

It is very much undefined behaviour.

It will appear to "work" if the string implementation happens to contain a pointer to the string data as its only data member, so that an array of string has the same memory layout as an array of char*. That is the case for at least one popular implementation (GNU), but is certainly not something you can rely on.
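For contrast, here is a minimal sketch of one way to get an array of C-string pointers without relying on the layout of string. The helper name as_c_strings is hypothetical, not something from the question or the standard library; it only uses std::string::c_str, which is guaranteed to return a '\0'-terminated buffer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): collect one
// NUL-terminated pointer per string via c_str().
// The returned pointers are only valid while `strings` is neither
// modified nor destroyed.
std::vector<const char*> as_c_strings(const std::vector<std::string>& strings) {
    std::vector<const char*> out;
    out.reserve(strings.size());
    for (const std::string& s : strings) {
        out.push_back(s.c_str());
    }
    assert(out.size() == strings.size());
    return out;
}
```

Calling data() on the returned vector yields a const char* const*, which can be indexed like the (*bar)[i] in the question, but with defined behaviour on any implementation.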

answered Oct 23 '22 02:10

Mike Seymour

The behaviour depends on your STL implementation (just look at the std::vector and std::string source code). Occasionally, you have a string implementation that stores (as other participants mentioned) a pointer to the character buffer as a member.

It's no secret that one shouldn't rely on encapsulated implementation details, because doing so leads to undefined behaviour.

answered Oct 23 '22 00:10

Michael Grigoriev

After Neil Kirk mentioned this in a comment on the answer that originally sparked all this, I looked it up.

string is a specialization of basic_string on all implementations.

I only have access to the Visual Studio 2013 version of xstring.h (where Microsoft implements basic_string), so this may be different for other versions or compilers. In xstring.h, basic_string inherits from _String_alloc, which inherits from _String_val.

_String_val is actually the first class in the inheritance chain that has any member variables. Its first member variable, _Bx, is a union which will translate to a char* for string (not for wstring).

So when a string is cast to a char* on Visual Studio 2013, the result is a char* that points at the member variable _Bx. Since _Bx is actually a '\0'-terminated char*, you can cout it and it behaves properly.

What I didn't know, and what all this research taught me, is that _String_val also contains a size variable, _Mysize, and a reserved size, _Myres. If either of those had been declared in _String_val before _Bx, this would have output gibberish at the start of each line of cout's output.
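The effect of member order can be sketched with two made-up structs. PointerFirst, SizeFirst and starts_with_data_pointer are hypothetical names for illustration only and are not the real library types. The check compares raw bytes with memcmp, so the sketch itself stays well defined.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical layouts for illustration, NOT the real library classes.
struct PointerFirst {
    char*       data;      // pointer to the characters is the first member
    std::size_t size;
    std::size_t capacity;
};

struct SizeFirst {
    std::size_t size;      // bookkeeping members come first
    std::size_t capacity;
    char*       data;
};

// Returns true if the object's first bytes are exactly the bytes of its
// `data` pointer, i.e. if reading the start of the object as a char*
// would happen to find the character buffer.
template <typename T>
bool starts_with_data_pointer(const T& object) {
    return std::memcmp(&object, &object.data, sizeof object.data) == 0;
}
```

With the pointer declared first, offsetof(PointerFirst, data) is 0 and the check succeeds. With the size declared first, the leading bytes hold the size instead, so treating them as a char* would point somewhere unrelated.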

I'd conclude by conceding that, as the other answers mention, this behavior is implementation dependent and may not work across different versions or platforms.

answered Oct 23 '22 02:10

Jonathan Mee
