Why is the output of this program 4?
#include <iostream>

int main()
{
    short A[] = {1, 2, 3, 4, 5, 6};
    std::cout << *(short*)((char*)A + 7) << std::endl;
    return 0;
}
From my understanding, on an x86 little-endian system, where char is 1 byte and short is 2 bytes, the output should be 0x0500, because the data in array A is as follows in hex:
01 00 02 00 03 00 04 00 05 00 06 00
We move 7 bytes forward from the beginning, and then read 2 bytes. What am I missing?
You are violating strict aliasing rules here. You can't just read half-way into an object and pretend it's an object all on its own. You can't invent hypothetical objects using byte offsets like this. GCC is perfectly within its rights to do crazy sh!t like going back in time and murdering Elvis Presley, when you hand it your program.
What you are allowed to do is inspect and manipulate the bytes that make up an arbitrary object, using a char*. Using that privilege:
#include <iostream>
#include <algorithm>

int main()
{
    short A[] = {1, 2, 3, 4, 5, 6};
    short B;

    // Copy the two bytes starting at offset 7 into a real short object.
    std::copy(
        (char*)A + 7,
        (char*)A + 7 + sizeof(short),
        (char*)&B
    );

    std::cout << std::showbase << std::hex << B << std::endl;
}
// Output: 0x500
But you can't just "make up" a non-existent object in the original collection.
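The same byte-wise copy is commonly written with std::memcpy. Below is a minimal sketch of that idea; the helper name read_short_at and its parameters are made up for illustration and are not part of the original code. It assumes the caller passes the real size of the source object, so the read can be checked against its bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (illustration only): returns the short whose bytes
// start at byte `offset` inside the object of `size` bytes at `base`.
// The bytes are copied into a genuine short object, so no short is ever
// read through a misaligned or made-up pointer.
short read_short_at(const void* base, std::size_t size, std::size_t offset)
{
    // Refuse to read past the end of the source object.
    assert(offset + sizeof(short) <= size);

    short value;
    std::memcpy(&value, static_cast<const char*>(base) + offset, sizeof value);
    return value;
}
```

With the array from the question, read_short_at(A, sizeof A, 7) should give 0x500 on a little-endian system with 2-byte short, matching the std::copy version above.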
Furthermore, even if you have a compiler that can be told to ignore this problem (e.g. with GCC's -fno-strict-aliasing switch), the made-up object is not correctly aligned for any current mainstream architecture. A short cannot legally live at that odd-numbered location in memory†, so you doubly can't pretend there is one there. There's just no way to get around how undefined the original code's behaviour is; in fact, if you pass GCC the -fsanitize=undefined switch it will tell you as much.
† I'm simplifying a little.
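To see the alignment problem concretely, here is a rough sketch that checks whether an address is a multiple of alignof(short). The helper name is_aligned_for_short is hypothetical, and the check assumes that converting a pointer to std::uintptr_t yields a plain numeric address, which holds on mainstream platforms but is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (illustration only): true if `p` is an address at
// which a short could be suitably aligned.
// Assumes the pointer-to-integer conversion gives a flat numeric address.
bool is_aligned_for_short(const void* p)
{
    assert(p != nullptr);
    return reinterpret_cast<std::uintptr_t>(p) % alignof(short) == 0;
}
```

On a platform where alignof(short) is 2, this should return true for A and for (char*)A + 6, and false for (char*)A + 7.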