Consider the following code snippet.
#include <stdio.h>

typedef struct s {
    int _;
    char str[];
} s;

s first = { 0, "abcd" };

int main(int argc, const char **argv) {
    s second = first;
    printf("%s\n%s\n", first.str, second.str);
}
When I compile this with GCC 7.2, I get:
$ gcc-7 -o tmp tmp.c && ./tmp
abcd
abcd
But when I compile this with Clang (Apple LLVM version 8.0.0 (clang-800.0.42.1)), I get the following:
$ clang -o tmp tmp.c && ./tmp
abcd
# Nothing here
Why does the output differ between the compilers? I would expect the string not to be copied, as it's a flexible array member (similar to this question). Why does GCC actually copy it?
Edit
Some comments and an answer suggested this might be due to optimization. GCC may make second an alias of first, so updating second should prevent GCC from performing that optimization. I added the line:
second._ = 1;
But this doesn't change the output.
Here's the real answer of what's going on with gcc. second is allocated on the stack, just as you'd expect. It is not an alias for first. This is easily verified by printing their addresses.
Additionally, the declaration s second = first; corrupts the stack, because (a) gcc allocates only the minimum amount of storage for second, but (b) it copies all of first, including the string, into second.
Here is a modified version of the original code which shows this:
#include <stdio.h>

typedef struct s {
    int _;
    char str[];
} s;

s first = { 0, "abcdefgh" };

int main(int argc, const char **argv) {
    char v[] = "xxxxxxxx";
    s second = first;
    printf("%p %p %p\n", (void *) v, (void *) &first, (void *) &second);
    printf("<%s> <%s> <%s>\n", v, first.str, second.str);
}
On my 32-bit Linux machine, with gcc, I get the following output:
0xbf89a303 0x804a020 0xbf89a2fc
<defgh> <abcdefgh> <abcdefgh>
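As you can see from the addresses, v and second are on the stack, and first is in the data section. It is also clear that the initialization of second has overwritten v on the stack: instead of the expected <xxxxxxxx>, it shows <defgh>. The addresses bear this out. second starts at 0xbf89a2fc, so, assuming a 4-byte int, second.str begins at 0xbf89a300. The bytes "abc" fill 0xbf89a300 through 0xbf89a302, and "defgh" plus the terminating NUL land on v, which starts at 0xbf89a303.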
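To make the size mismatch concrete, here is a minimal sketch that compares the storage a plain object of type s reserves for str with what the string actually needs. The helper names bytes_reserved_for_str and bytes_needed_for_str are hypothetical and exist only for this illustration. The sketch assumes nothing beyond standard C: sizeof(s) does not count the flexible array member, apart from possible trailing padding.

```c
#include <stddef.h>
#include <string.h>

/* Same layout as the struct in the question. */
typedef struct s {
    int _;
    char str[];   /* flexible array member */
} s;

/* Hypothetical helper: bytes that an object declared as plain `s`
 * (for example a local variable) has available for str. sizeof(s)
 * ignores the flexible array member, apart from possible trailing
 * padding, so on common ABIs this is expected to be 0. */
size_t bytes_reserved_for_str(void) {
    return sizeof(s) - offsetof(s, str);
}

/* Hypothetical helper: bytes a given string needs inside str,
 * including the terminating NUL. */
size_t bytes_needed_for_str(const char *text) {
    return strlen(text) + 1;
}
```

For "abcdefgh" the second helper returns 9, while the first returns only whatever padding the ABI adds after the int. Any copy that transfers the string into an automatic s therefore has to write past the end of the object.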
This seems like a gcc bug to me. At the very least, it should warn that the initialization of second will corrupt the stack, since it clearly has enough information to know this at compile time.
Edit: I tested this some more, and obtained essentially equivalent results by splitting the declaration of second into:
s second;
second = first;
The real problem is the assignment. It copies all of first rather than only the minimal common part of the structure type, which is what I believe it should copy. In fact, if you move the static initialization of first into a separate file, the assignment does what it should do: v prints correctly, and second.str is undefined garbage. This is the behavior gcc should be producing, regardless of whether the initialization of first is visible in the same compilation unit or not.
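If the goal is to copy the string as well, one option that does not depend on how a compiler treats struct assignment is to allocate the storage explicitly. The following is only a sketch under that assumption. s_create and s_clone are hypothetical helper names, not anything from the question, and the caller is assumed to release the result with free.

```c
#include <stdlib.h>
#include <string.h>

typedef struct s {
    int _;
    char str[];   /* flexible array member */
} s;

/* Hypothetical helper: heap-allocate an s with enough room after the
 * fixed part to hold a copy of text. Returns NULL if malloc fails. */
s *s_create(int value, const char *text) {
    size_t len = strlen(text) + 1;        /* include the terminating NUL */
    s *obj = malloc(sizeof *obj + len);   /* fixed part plus array storage */
    if (obj == NULL)
        return NULL;
    obj->_ = value;
    memcpy(obj->str, text, len);
    return obj;
}

/* Hypothetical helper: duplicate src, including the string stored in
 * its flexible array member. The caller frees the result. */
s *s_clone(const s *src) {
    return s_create(src->_, src->str);
}
```

With these helpers, s *second = s_clone(&first); would give second its own storage for the string, so nothing adjacent on the stack can be overwritten.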