I am confused by some behavior of size_t that I noticed:
size_t zero = 0x1 << 32;
size_t big = 0x1 << 31;
size_t not_as_big = 0x1 << 30;
printf("0x1<<32: %zx\n0x1<<31: %zx\n0x1<<30: %zx\n", zero, big, not_as_big);
Results in:
0x1<<32: 0
0x1<<31: ffffffff80000000
0x1<<30: 40000000
Now, I understand that size_t is only guaranteed to be at minimum a 16-bit unsigned integer, but I don't understand why 0x1 << 31 ends up with the value it did - trying to allocate 18 exabytes did a number on my program.
I'm using LLVM on x86_64.
Shifting a signed integer so that a 1 bit reaches the sign bit position, or is shifted out past it, is undefined behavior in C, so the compiler is free to handle the two expressions as follows:
0x1 << 32
Here the compiler sees a 32-bit int (0x1) being shifted by 32 bits. Since it is free to interpret this in a way that is consistent with a mathematically correct shift, it interprets the result as 0x1_0000_0000, truncates it to a 32-bit int, giving 0x0000_0000, and then sees that you assign the result to a size_t, which is usually 64 bits wide: 0x0000_0000_0000_0000.
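To see this "shift in the narrow type first, widen afterwards" order without invoking undefined behavior, here is a small sketch of my own using an unsigned literal (0x1u); the variable names and the assumption of a 32-bit unsigned int with a 64-bit size_t are mine, but they match the usual x86_64 model:

#include <stdio.h>
#include <stddef.h>

int main(void) {
    /* Shift done in 32-bit unsigned int, result widened afterwards. */
    size_t shifted_narrow = 0x1u << 31;                /* 0x80000000 */
    /* Shift done in the 64-bit width of size_t from the start. */
    size_t shifted_wide   = (size_t)0x1u << 31;        /* 0x80000000 */
    /* Doubling exposes the difference: the narrow version wraps mod 2^32. */
    size_t wrapped_narrow = (0x1u << 31) * 2u;         /* 0 */
    size_t wrapped_wide   = ((size_t)0x1u << 31) * 2u; /* 0x1_0000_0000 */
    printf("%zx %zx %zx %zx\n",
           shifted_narrow, shifted_wide, wrapped_narrow, wrapped_wide);
    return 0;
}

On such a platform this prints 80000000 80000000 0 100000000: the arithmetic is carried out entirely in the type of the operands, and only the finished (possibly wrapped) value is converted to size_t.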
0x1 << 31
As before, the compiler is free to do whatever seems right to it, because the 1 bit invades the sign bit position. So the result is 0x8000_0000, which is a negative number, INT_MIN to be precise. Then it sees you convert that negative number to 64 bits, so it extends it with ones, as with all negative numbers. The result is 0xffff_ffff_8000_0000, the smallest 32-bit signed integer sign-extended to 64 bits.
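If you want to see just that widening step in isolation, here is a minimal sketch (my own example, assuming a 32-bit int and a 64-bit size_t as on x86_64); it starts from INT_MIN directly, so the example itself is well defined:

#include <stdio.h>
#include <stddef.h>
#include <limits.h>

int main(void) {
    int negative = INT_MIN;     /* bit pattern 0x8000_0000, the value the UB shift happened to produce */
    size_t widened = negative;  /* conversion to unsigned is defined modulo 2^64, which on two's
                                   complement machines looks exactly like sign extension */
    printf("%zx\n", widened);   /* prints ffffffff80000000 on this model */
    return 0;
}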
The correct way to do this, portable across all 64-bit platforms, is:
((size_t)1) << 32
((size_t)1) << 31
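For completeness, here is a sketch of the corrected program (again assuming a 64-bit size_t, as on x86_64), together with the output it should produce:

#include <stdio.h>
#include <stddef.h>

int main(void) {
    /* Casting the 1 to size_t first makes the shift happen at the full
       width of size_t, so no bits are lost and nothing is sign extended. */
    size_t no_longer_zero = (size_t)1 << 32;
    size_t big            = (size_t)1 << 31;
    size_t not_as_big     = (size_t)1 << 30;
    printf("(size_t)1<<32: %zx\n(size_t)1<<31: %zx\n(size_t)1<<30: %zx\n",
           no_longer_zero, big, not_as_big);
    return 0;
}

This prints 100000000, 80000000, and 40000000 respectively - sensible sizes rather than 18 exabytes.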