I have a very simple C program where I am printing variables of different sizes.
#include <stdio.h>

unsigned int long long a;
unsigned int c;

int main() {
    a = 0x1111111122222222;
    c = 0x33333333;
    printf("Sizes: %zu %zu\n", sizeof(a), sizeof(c));
    printf("Seg: %llx %x\n", a, c);
    printf("Seg: %lx %x\n", a, c);
    printf("Seg: %x\n", c);
    return 0;
}
On a 64-bit machine, all works fine.
On a 32-bit machine, though, if I use an incorrect formatter for the first argument (the printf with %lx), I get incorrect output for the second argument too. Is that because of how varargs are processed? What am I missing?
Output
Sizes: 8 4
Seg: 1111111122222222 33333333
Seg: 22222222 11111111
Seg: 33333333
a.out: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 3.2.0, not stripped
Command used
rm ./a.out ; g++ -m32 test.cpp ; ./a.out ; file a.out
The printf call whose conversion specifier does not match the type actually passed has undefined behavior, so the C Standard imposes no requirements: anything you observe is possible, including the expected output, as on your 64-bit target, and the different behavior you are trying to analyse on your 32-bit target.
Here is a tentative explanation (assuming a little-endian, stack-based ABI):

- unsigned long long uses 2 stack slots, low word first in little-endian order
- unsigned int uses a single slot
- unsigned long, which printf expects for %lx, has 32 bits (as is the case on most 32-bit targets), so it also uses a single slot

So %lx consumes only the first slot, the low half of a (22222222), and the following %x reads the next slot, which is the high half of a (11111111), never reaching c. Hence the output:

Seg: 22222222 11111111
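To see the two slots concretely, here is a minimal sketch (assuming a little-endian target where unsigned int is 32 bits) that copies the 8-byte object into two 32-bit halves:

#include <stdio.h>
#include <string.h>

int main(void) {
    unsigned long long a = 0x1111111122222222ULL;
    unsigned int halves[2];   /* assumes 32-bit unsigned int */

    /* memcpy avoids strict-aliasing issues; on a little-endian
       target, halves[0] receives the low word of a. */
    memcpy(halves, &a, sizeof a);
    printf("low slot:  %x\n", halves[0]);  /* prints 22222222 */
    printf("high slot: %x\n", halves[1]);  /* prints 11111111 */
    return 0;
}

This is the same pair of values the mismatched %lx/%x call prints, just read deliberately instead of accidentally.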
Remember that undefined behavior is by definition not defined, so other side effects may happen and something completely different may be output, even on the same host with the same binary.

As @GlennWillen commented, you cannot count on the damage being limited to a particular line of code, function, or file; you cannot count on it being limited to consequences that would be possible under any straightforward interpretation of the code; and in many cases you cannot count on it being limited to things that happen chronologically after the UB. If the code could execute undefined behavior when some condition is true, the compiler may act as though that condition is false everywhere and for all purposes, even if this is absurd or contradictory.
Here are some take-aways:
- Always compile with warnings enabled (-Wall -Wextra or similar).
- Do not ignore the warnings (-Werror will force this) and think twice before using casts to silence a warning.
- Use the conversion macros from <inttypes.h> for types defined in <stdint.h>, as sketched below.
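For example, here is a minimal sketch of the question's program rewritten with fixed-width types and the matching format macros (this is one illustration, not the only fix; %llx with unsigned long long is equally correct):

#include <inttypes.h>
#include <stdio.h>

int main(void) {
    uint64_t a = UINT64_C(0x1111111122222222);
    uint32_t c = UINT32_C(0x33333333);

    /* PRIx64 and PRIx32 expand to the correct conversion
       specifier for each type on every platform, so the same
       source line is right on 32-bit and 64-bit targets. */
    printf("Seg: %" PRIx64 " %" PRIx32 "\n", a, c);
    return 0;
}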
By passing the incorrect formatter, you lied to printf.

printf does not know the types or sizes of the arguments you actually pass to it; it just sees an undifferentiated sludge of bytes on the call stack (or in whatever registers). Once past the format string, it does not know where one argument stops and the next one starts. It only knows what to expect based on the format string.
In the incorrect printf call, you told it that the first argument was sizeof (unsigned long int) bytes, so it read that many bytes and converted them for the first output. Then it read the next sizeof (unsigned long int) bytes and converted those for the second output. It had no way of knowing it was reading the two halves of a single object.
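To make that concrete, here is a minimal sketch of a printf-like variadic function (dump_uints is a hypothetical name, not a standard API); like printf, it learns the number and types of its arguments only from what the caller claims:

#include <stdarg.h>
#include <stdio.h>

/* Prints n unsigned ints taken from the variadic arguments.
   The callee has no way to verify what was really passed; if
   the caller supplied an unsigned long long instead, each
   va_arg below would consume only half of it. */
static void dump_uints(int n, ...) {
    va_list ap;
    va_start(ap, n);
    for (int i = 0; i < n; i++)
        printf("%x\n", va_arg(ap, unsigned int));
    va_end(ap);
}

int main(void) {
    dump_uints(2, 0x22222222u, 0x11111111u);
    return 0;
}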