C: What happens (in detail) in x=~x if x is of type char?

If we have the following code:

char x = -1;
x =~x;

On an x86 platform with MS VS compiler (which partly supports C99) - what happens in detail when it is running?

To my knowledge, the following happens (please correct me if I am wrong):

  • x is assigned the value -1, which is represented by the bit pattern 0xff since a char is represented by one byte.
  • The ~ operator promotes x to an int, that is, it internally works with the bit pattern 0xffffffff.
  • The ~ operator's result is 0x00000000 (of type int).
  • To perform the assignment, the integer promotions apply (principally). Since in our case the operand on the right hand side is an int, no conversion occurs. The operand on the left hand side is converted to int. The assignment's result is 0x00000000.
  • As a side effect, the left hand side of the assignment is assigned the value 0x00000000. Since x is of type char, there is another implicit conversion, which converts 0x00000000 to 0x00.

So many things actually happen here that I find it somewhat confusing. In particular: Is my understanding of the last implicit conversion (of int to char) correct? What would happen if the assignment's result could not be stored in a char?

maya asked Dec 13 '22 15:12


2 Answers

Indeed ~x is an int type.

The conversion back to char is well-defined if char is unsigned. It's also well-defined, of course, if the value is in the range supported by char.

If char is signed and the value of ~x is out of range for char, then the result of the conversion is implementation-defined, with the possibility that an implementation-defined signal is raised.

In your case, you have a platform with a 2's complement int and a signed 2's complement char (the MSVC default), and so ~x is observed as 0.
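A minimal sketch of how one might check this on a given compiler. The function names are made up for illustration, and the expected values in the comments assume a two's complement int and an 8-bit char.

```c
#include <stddef.h>

/* Size of the type that ~ yields for a char operand.
   sizeof does not evaluate ~x; it only looks at the type of the expression,
   which is int because of integer promotion. */
size_t size_of_complement_of_char(void)
{
    char x = -1;
    return sizeof(~x);
}

/* The two statements from the question; returns the final value of x. */
int complement_assign(void)
{
    char x = -1;   /* -1 if char is signed, 255 if char is unsigned */
    x = ~x;        /* ~ works on an int; the result is converted back to char */
    return x;      /* 0 under the assumptions stated above */
}
```

Calling size_of_complement_of_char() should give the same value as sizeof(int), not 1, which confirms that ~x is not a char.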

Note that MSVC doesn't fully support any C standard, and neither does it claim to.

Bathsheba answered Dec 22 '22 01:12


You are almost correct, but you are missing that char has implementation-defined signedness. It can be either signed or unsigned, depending on the compiler.

In either case, the bit pattern for an 8-bit 2's complement char is indeed 0xFF, regardless of its signedness. But if char is signed, integer promotion preserves the sign and you still have the value -1, bit pattern 0xFFFFFFFF with a 32-bit int. If char were unsigned, -1 would have been converted to 255 upon assignment, and integer promotion would have given 255 (0x000000FF). In that case ~x would produce a different int value: 0xFFFFFF00 (-256) instead of 0.
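To make the two cases concrete, here is a small sketch that spells out the signedness explicitly instead of relying on plain char. The function names are hypothetical, and the values in the comments assume an 8-bit char and a two's complement int.

```c
/* Signed case: -1 is promoted to the int -1 (all bits set),
   so its complement is 0. */
int complement_of_signed(void)
{
    signed char s = -1;
    return ~s;
}

/* Unsigned case: 255 is promoted to the int 255 (0x000000FF with a 32-bit int),
   so its complement is 0xFFFFFF00, which is -256 as an int. */
int complement_of_unsigned(void)
{
    unsigned char u = (unsigned char)-1;  /* well-defined wrap-around to 255 */
    return ~u;
}
```

Note that both values end up as 0 once they are stored back into the respective 8-bit type; the difference is only visible while the result is still an int.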

Regarding integer promotion with ~: it has only one operand, to its right, and that operand is promoted.

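Finally, you assign the result back to char, and the outcome again depends on signedness. Upon assignment, the int value is implicitly converted to char, the type of the left operand. If char is unsigned, the conversion is well-defined: the value is reduced modulo 256 for an 8-bit char. If char is signed and the value does not fit, the result is implementation-defined - most likely you get the least significant byte of the int.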
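As a rough illustration of that last conversion, the sketch below shows the well-defined unsigned case and a range check one could do before storing an int into a plain char. Both helper names are invented for this example, and the values in the comments assume an 8-bit char.

```c
#include <limits.h>

/* int -> unsigned char is always well-defined: the value is reduced
   modulo UCHAR_MAX + 1 (256 for an 8-bit char). */
unsigned char to_uchar(int v)
{
    return (unsigned char)v;
}

/* Returns 1 if v can be stored in a plain char without relying on
   implementation-defined behaviour, whatever the signedness of char. */
int fits_in_char(int v)
{
    return v >= CHAR_MIN && v <= CHAR_MAX;
}
```

For example, to_uchar(-256) gives 0 and to_uchar(-1) gives 255, while fits_in_char(1000) is 0 on any platform with an 8-bit char.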


From this we can learn:

  • Never use char for storing integer values or for arithmetic; use uint8_t instead. Use char for storing characters only.
  • Never perform bitwise arithmetic on operands that are potentially signed, or that were made signed silently through implicit promotion.
  • The ~ operator is particularly dangerous unless the operand is unsigned int or a larger unsigned type.
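One common way to follow these rules with an 8-bit value is to cast the result of ~ back to the small unsigned type straight away. This is only a sketch with made-up function names; it assumes uint8_t is available from <stdint.h> and that int is wider than 8 bits.

```c
#include <stdint.h>

/* Inverts the 8 bits of value. The operand is promoted to int, so ~value
   is a negative int; the cast converts it back modulo 256, which is
   well-defined for an unsigned type. */
uint8_t invert_byte(uint8_t value)
{
    return (uint8_t)~value;
}

/* Pitfall: without the cast, the comparison is done on the promoted int.
   For v == 0x0F, ~v is -16 (0xFFFFFFF0 with a 32-bit int), not 0xF0. */
int compare_without_cast(uint8_t v)
{
    return ~v == 0xF0;
}

/* Same comparison with the result narrowed back to 8 bits first. */
int compare_with_cast(uint8_t v)
{
    return (uint8_t)~v == 0xF0;
}
```

With v equal to 0x0F, compare_without_cast returns 0 and compare_with_cast returns 1, which is the kind of silent surprise the last two points warn about.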
Lundin answered Dec 22 '22 00:12