Let's take, for example, the following two 1-byte variables:
uint8_t x1 = 0x00;
uint8_t x2 = 0xFF;
When printing the bitwise complement, the result is a 4-byte value:
printf("%02X -> %02X; %02X -> %02X\n", x1, ~x1, x2, ~x2);
00 -> FFFFFFFF; FF -> FFFFFF00
I know this can be "solved" using casting or masking:
printf("%02X -> %02X; %02X -> %02X\n", x1, (uint8_t) ~x1, x2, (uint8_t) ~x2);
00 -> FF; FF -> 00
printf("%02X -> %02X; %02X -> %02X\n", x1, ~x1&0xFF, x2, ~x2&0xFF);
00 -> FF; FF -> 00
But why does this non-intuitive behavior happen in the first place?
Many computer processors have a “word” size for most of their operations. E.g., on a 32-bit machine, there may be an instruction that loads 32 bits, an instruction that stores 32 bits, an instruction that adds one 32-bit number to another, and so on.
On these processors, it may be a nuisance to work with other sizes; there may be no instruction for multiplying one 16-bit number by another, for example. C grew up on these machines. It was designed so that int (or unsigned int) was "whatever size is good for the machine you are running on," while char and short were fine for storing things in memory; but once loaded from memory into processor registers, C worked with them as if they were int. This rule survives in the C standard as the integer promotions: operands narrower than int are converted to int (or unsigned int) before most arithmetic and bitwise operations, which is why ~x1 has type int rather than uint8_t.
This simplified the development of early C compilers. The compiler did not have to implement your complement by doing a 32-bit complement instruction followed by an AND instruction to remove the unwanted high bits. It only did a plain 32-bit complement.
We could develop languages differently today, but C is burdened with this legacy.