I've been having really freaky stuff happening in my code. I believe I have tracked it down to the part labeled "here" (code is simplified, of course):
    std::string func()
    {
        char c;
        // Do stuff that will assign to c
        return "" + c; // Here
    }
All sorts of stuff happens when I try to cout the result of this function. I think I've even managed to print pieces of the underlying C++ documentation, along with many a segmentation fault. It's clear to me that this doesn't work in C++ (I've resorted to using stringstream to do conversions to string now), but I would like to know why. After using lots of C# for quite a while and no C++, this has caused me a lot of pain.
""
is a string literal. String literals have the type array of N const char. This particular string literal is an array of 1 const char, the one element being the null terminator.
Arrays easily decay into pointers to their first element, e.g. in expressions where a pointer is required.
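The type and the decay can be checked at compile time. This is only a small sketch; the helper name first_char_of_empty_literal is made up for illustration and is not part of the question's code.

```cpp
#include <cassert>
#include <type_traits>

// "" is an lvalue of type `const char[1]`, so decltype yields a reference to that array.
static_assert(std::is_same<decltype(""), const char (&)[1]>::value,
              "the empty string literal is an array of 1 const char");

// The single element is the null terminator, so the array occupies one byte.
static_assert(sizeof("") == 1, "only the terminator is stored");

// Array-to-pointer decay turns the array type into `const char*`.
static_assert(std::is_same<std::decay<decltype("")>::type, const char*>::value,
              "the array decays to a pointer to its first element");

// Hypothetical helper: returns the character the decayed pointer refers to.
char first_char_of_empty_literal() {
    const char* p = "";  // implicit array-to-pointer conversion
    assert(p != nullptr);
    return *p;           // reads the null terminator, which is a valid element
}
```

If the static_asserts compile, the literal really is a one-element array, and reading its first element gives '\0'.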
lhs + rhs is not defined for arrays as the lhs and integers as the rhs. But it is defined for pointers as the lhs and integers as the rhs, with the usual pointer arithmetic.
char is an integral data type in (i.e., treated as an integer by) the C++ core language.
==> string literal + character is therefore interpreted as pointer + integer.
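To see the pointer + integer interpretation without triggering Undefined Behaviour, here is a rough sketch using a longer literal, so that small offsets stay inside the array. The function skip_chars and the literal "hello" are assumptions chosen for the demonstration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: skips `offset` characters of a fixed literal.
// Only offsets 0..5 are valid, because "hello" is an array of 6 const char
// (five letters plus the null terminator).
std::string skip_chars(char offset) {
    const char* base = "hello";           // array decays to a pointer to 'h'
    int n = offset;                       // char is an integral type
    assert(n >= 0 && n <= 5);             // stay inside the array
    const char* shifted = base + offset;  // pointer arithmetic, not concatenation
    return std::string(shifted);          // copies up to the null terminator
}
```

With this, skip_chars(2) yields "llo" rather than "hello" followed by a character, which is exactly what "" + c tries to do with an array that has no room for any offset.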
The expression "" + c is roughly equivalent to:
    static char const lit[1] = {'\0'};
    char const* p = &lit[0];
    p + c // "" + c is roughly equivalent to this expression
You return a std::string. The expression "" + c yields a pointer to const char. The constructor of std::string that expects a const char* expects it to be a pointer to a null-terminated character array.
If c != 0, then the expression "" + c leads to Undefined Behaviour:
For c > 1, the pointer arithmetic itself produces Undefined Behaviour. Pointer arithmetic is only defined within an array: the result must point to an element of the same array or to one past its last element.
If char is signed, then c < 0 produces Undefined Behaviour for the same reason.
For c == 1, the pointer arithmetic does not produce Undefined Behaviour. That's a special case; pointing to one element past the last element of an array is allowed (it is not allowed to use what it points to, though). It still leads to Undefined Behaviour since the std::string constructor called here requires its argument to be a pointer to a valid array (and a null-terminated string). The one-past-the-last element is not part of the array itself. Violating this requirement also leads to UB.
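The c == 1 case can be sketched with a one-element array like the one behind "". The function name distance_to_one_past_end is hypothetical; the point is only that forming and comparing the one-past-the-end pointer is fine, while dereferencing it is not.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper using a one-element array, like the one behind "".
std::ptrdiff_t distance_to_one_past_end() {
    static const char lit[1] = {'\0'};
    const char* first = &lit[0];
    const char* one_past = first + 1;  // allowed: one past the last element
    // Comparing and subtracting the pointers is well-defined.
    // Reading *one_past would be Undefined Behaviour, so it is not done here.
    assert(one_past != first);
    return one_past - first;
}
```

The function returns 1. Passing one_past to the std::string constructor would make the constructor read through it, which is the step that goes wrong.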
What probably happens now is that the constructor of std::string tries to determine the size of the null-terminated string you passed it by searching for the first character in the array that is equal to '\0':
    string(char const* p)
    {
        // simplified
        char const* end = p;
        while(*end != '\0') ++end;
        //...
    }
This will either produce an access violation, or the string it creates will contain "garbage". It is also possible that the compiler assumes this Undefined Behaviour will never happen, and performs some funny optimizations that result in weird behaviour.
By the way, clang++3.5 emits a nice warning for this snippet:
    warning: adding 'char' to a string does not append to the string [-Wstring-plus-int]
        return "" + c; // Here
               ~~~^~~
    note: use array indexing to silence this warning
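For completeness, here are a few well-defined ways to get a std::string out of a single char without going through stringstream. This is a sketch rather than the one true way; the function names are invented for illustration, while the std::string operations they use (the count/character constructor, operator+= and operator+ with a char) are standard.

```cpp
#include <cassert>
#include <string>

// Fill constructor: a string made of 1 copy of c.
std::string from_fill_ctor(char c) {
    return std::string(1, c);
}

// Append the character to an initially empty string.
std::string from_append(char c) {
    std::string s;
    s += c;
    return s;
}

// Make the left operand a std::string, so + means concatenation
// instead of pointer arithmetic.
std::string from_operator_plus(char c) {
    return std::string() + c;
}

// Hypothetical self-check: all three approaches agree for a given character.
void check_all_agree(char c) {
    assert(from_fill_ctor(c) == from_append(c));
    assert(from_append(c) == from_operator_plus(c));
}
```

In each case at least one operand is already a std::string, so there is no pointer for the char to be added to.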