I'm using a C library in C++ and wrote a wrapper. At one point I need to convert a std::string to a C-style string. There is a class with a member function that returns a string. Casting the result of c_str() on the returned string works if the string is short, but not otherwise. Here is a reduced example illustrating the issue:
#include <iostream>
#include <string>

class StringBox {
public:
    std::string getString() const { return text_; }
    StringBox(std::string text) : text_(text) {}
private:
    std::string text_;
};

int main(int argc, char **argv) {
    const unsigned char *castString = NULL;
    std::string someString = "I am a loooooooooooooooooong string"; // Won't work
    // std::string someString = "hello"; // This one works
    StringBox box(someString);
    castString = (const unsigned char *)box.getString().c_str();
    std::cout << "castString: " << castString << std::endl;
    return 0;
}
Executing the file above prints this to the console:
castString:
whereas if I swap the commenting on someString, it correctly prints
castString: hello
How is this possible?
You are invoking c_str() on a temporary string object returned by the getString() member function. The pointer returned by c_str() is only valid as long as the original string object exists, and the temporary is destroyed at the end of the full expression that assigns castString, so castString ends up being a dangling pointer. Reading through it is undefined behavior.
So why does this work for short strings? I suspect that you're seeing the effects of the Short String Optimization, an optimization where for strings less than a certain length the character data is stored inside the bytes of the string object itself rather than in the heap. It's possible that the temporary string that was returned was stored on the stack, so when it was cleaned up no deallocations occurred and the pointer to the expired string object still holds your old string bytes. This seems consistent with what you're seeing, but it still doesn't mean what you're doing is a good idea. :-)
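To make the Short String Optimization guess a bit more concrete, here is a minimal sketch that checks whether a string's character buffer lies inside the bytes of the string object itself. The helper name storedInline is hypothetical, and the result for short strings depends on the standard library implementation, so treat it as a diagnostic only, not as something to rely on.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: returns true if the character buffer of `s`
// lies within the bytes of the std::string object itself.
// The answer for short strings is implementation-specific.
bool storedInline(const std::string& s) {
    const auto obj = reinterpret_cast<std::uintptr_t>(&s);
    const auto buf = reinterpret_cast<std::uintptr_t>(s.data());
    return buf >= obj && buf < obj + sizeof(std::string);
}
```

On an implementation that uses this optimization, storedInline(std::string("hello")) would be expected to return true, while a string far longer than sizeof(std::string) has to keep its characters elsewhere. Either way, the pointer taken from a destroyed temporary is still dangling.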
box.getString() returns an anonymous temporary. The pointer returned by c_str() is only valid for the lifetime of that string object.
So in your case, the pointer is invalidated by the time you get to the std::cout statement. The behaviour of reading the pointer contents is undefined.
(Interestingly, the behaviour with your short string is possibly different because std::string may store short strings in a different way.)
As you return by value, box.getString() is a temporary, so the pointer from box.getString().c_str() is valid only until the end of the full expression; after that it is a dangling pointer.
You may fix that by returning a const reference instead:
const std::string& getString() const { return text_; }
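As a rough sketch of how the fix plays out, the snippet below shows two ways to keep the pointer valid: returning a const reference to the member, or storing a by-value result in a named local before calling c_str(). The names getStringCopy, cApiLength, lengthViaReference and lengthViaNamedCopy are made up for illustration, and cApiLength merely stands in for whatever C function the wrapper actually calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

class StringBox {
public:
    explicit StringBox(std::string text) : text_(std::move(text)) {}
    // Reference to the member: its buffer lives as long as the box does.
    const std::string& getString() const { return text_; }
    // Hypothetical by-value getter, like the one in the question.
    std::string getStringCopy() const { return text_; }
private:
    std::string text_;
};

// Hypothetical stand-in for a C function taking a C-style string.
std::size_t cApiLength(const unsigned char* s) {
    return std::strlen(reinterpret_cast<const char*>(s));
}

// Option 1: the pointer refers to box's own member, so it stays valid
// for as long as box exists and its string is not modified.
std::size_t lengthViaReference(const StringBox& box) {
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(box.getString().c_str());
    return cApiLength(p);
}

// Option 2: keep the returned copy in a named variable so that it
// outlives every use of the pointer.
std::size_t lengthViaNamedCopy(const StringBox& box) {
    const std::string kept = box.getStringCopy();
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(kept.c_str());
    return cApiLength(p);
}
```

With the reference-returning getter, the pointer is only safe while the StringBox is alive and its string is unchanged. If the C library keeps the pointer for longer than that, a named copy owned by the caller is the more defensive choice.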