In our C++ code, we have our own string class (for legacy reasons). It supports a method c_str()
much like std::string
. What I noticed is that many developers are using it incorrectly. I have reduced the problem to the following line:
const char* x = std::string("abc").c_str();
This seemingly innocent code is quite dangerous in the sense that the destructor on std::string
gets invoked immediately after the call to c_str()
. As a result, you are holding a pointer to a de-allocated memory location.
Here is another example:
std::string x("abc");
const char* y = x.substr(0,1).c_str();
Here too, we are using a pointer to de-allocated location.
These problems are not easy to find during testing as the memory location still contains valid data (although the memory location itself is invalid).
I am wondering if you have any suggestions on how I can modify class/method definition such that developers can never make such a mistake.
The modern part of the code should not deal with raw pointers like that.
Call c_str
only when providing an argument to a legacy function that takes const char*
. Like:
legacy_print(x.substr(0,1).c_str())
Why would you want to create a local variable of type const char*
? Even if you write a copying version c_str_copy()
you will just get more headache because now the client code is responsible for deleting the resulting pointer.
And if you need to keep the data around for a longer time (e.g. because you want to pass the data to multiple legacy functions) then just keep the data wrapped in a string instance the whole time.
For the basic case, you can add a ref qualifier on the "this" object, to make sure that .c_str() is never immediately called on a temporary. Of course, this can't stop them from storing in a variable that leaves scope before the pointer does.
const char *c_str() & { return ...; }
But the bigger-picture solution is to replace all functions from taking a "const char *" in your codebase with functions that take one of your string classes (at the very least, you need two: an owning string and a borrowed slice) - and make sure that none of your string class does cannot be implicitly constructed from a "const char *".
The simplest solution would be to change your destructor to write a null at the beginning of the string at destruction time. (Alternatively, fill the entire string with an error message or 0's; you can have a flag to disable this for release code.)
While it doesn't directly prevent programmers from making the mistake of using invalid pointers, it will definitely draw attention to the problem when the code doesn't do what it should do. This should help you flush out the problem in your code.
(As you mentioned, at the moment the errors go unnoticed because for the most part the code will happily run with the invalid memory.)
Consider using Valgrind or Electric Fence to test your code. Either of these tools should trivially and immediately find these errors.
I am not sure that there is much you can do about people using your library incorrectly if you warn them about it. Consider the actual stl string
library. If i do this:
const char * lala = std::string("lala").c_str();
std::cout << lala << std::endl;
const char * lala2 = std::string("lalb").c_str();
std::cout << lala << std::endl;
std::cout << lala2 << std::endl;
I am basically creating undefined behavior. In the case where i run it on ideone.com i get the following output:
lala
lalb
lalb
So clearly the memory of the original lala
has been overwritten. I would just make it very clear to the user in the documentation that this sort of coding is bad practice.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With