
Why does ((unsigned char)0x80) << 24 get sign extended to 0xFFFFFFFF80000000 (64-bit)?

The following program

    #include <inttypes.h> /* printf("%" PRIu32 "\n", my_uint32_t) */
    #include <stdio.h>    /* printf(), perror() */

    int main(int argc, char *argv[])
    {
      uint64_t u64 = ((unsigned char)0x80) << 24;
      printf("%" PRIX64 "\n", u64);

      /* uint64_t */ u64 = ((unsigned int)0x80) << 24;
      printf("%016" PRIX64 "\n", u64);
    }

produces

    FFFFFFFF80000000
    0000000080000000

What is the difference between ((unsigned char)0x80) and ((unsigned int)0x80) in this context?

I guess that (unsigned char)0x80 gets promoted to (unsigned char)0xFFFFFFFFFFFFFF80 and then is bit shifted, but why does this conversion think that unsigned char is signed?

It's also interesting to note that 0x80 << 16 produces the expected result, 0x0000000000800000.

RubenLaguna asked Apr 09 '15 12:04


2 Answers

The C compiler performs the integer promotions on the operands before executing the shift.

Section 6.3.1.1 of the C standard says:

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.

Since all values of unsigned char can be represented by int, 0x80 gets converted to a signed int. The same is not true about unsigned int: some of its values cannot be represented as an int, so it remains unsigned int after applying integer promotions.
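As a rough way to observe this, the sketch below uses C11's _Generic to report the type of a shift expression. The macro TYPE_NAME and the two helper functions are names invented for this illustration, and it assumes a platform where int can represent every unsigned char value (true on typical 32- and 64-bit systems).

```c
#include <stdio.h>

/* Hypothetical helper: maps the type of an expression to a readable name.
   _Generic does not evaluate its controlling expression. */
#define TYPE_NAME(expr) _Generic((expr), \
    unsigned char: "unsigned char",      \
    int:           "int",                \
    unsigned int:  "unsigned int",       \
    default:       "other")

/* Type of the shift result when the left operand is an unsigned char:
   the operand is promoted to int before the shift. */
const char *shifted_uchar_type(void)
{
    unsigned char c = 0x80;
    return TYPE_NAME(c << 1);
}

/* Type of the shift result when the left operand is an unsigned int:
   int cannot represent all of its values, so it stays unsigned int. */
const char *shifted_uint_type(void)
{
    unsigned int u = 0x80;
    return TYPE_NAME(u << 1);
}
```

Under the stated assumption, shifted_uchar_type() should return "int" and shifted_uint_type() should return "unsigned int", which matches the difference between the two lines of the program in the question.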

Sergey Kalinichenko answered Oct 05 '22 19:10


The left operand of the << operator undergoes integer promotion.

(C99, 6.5.7p3) "The integer promotions are performed on each of the operands."

It means this expression:

 ((unsigned char)0x80) << 24 

is equivalent to:

 ((int) (unsigned char)0x80) << 24 

equivalent to:

  0x80 << 24 

which sets the sign bit of an int on a system with 32-bit int. When the result of 0x80 << 24 is then converted to uint64_t in the initialization of u64, sign extension occurs and yields the value 0xFFFFFFFF80000000.
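The widening step on its own can be sketched without relying on the problematic shift. The example below uses int32_t and uint32_t as stand-ins for a 32-bit int and unsigned int; the function names widen_signed and widen_unsigned are made up for this illustration.

```c
#include <stdint.h>

/* Converting a negative signed value to uint64_t is well defined:
   the result is the value modulo 2^64, which looks like sign extension. */
uint64_t widen_signed(int32_t v)
{
    return (uint64_t)v;
}

/* Converting an unsigned value to uint64_t keeps the value,
   so the upper 32 bits are zero. */
uint64_t widen_unsigned(uint32_t v)
{
    return (uint64_t)v;
}
```

Here widen_signed(INT32_MIN) gives 0xFFFFFFFF80000000, because the bit pattern 0x80000000 read as a signed 32-bit value is negative, while widen_unsigned(0x80000000u) gives 0x0000000080000000.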

EDIT:

Note that, as Matt McNabb correctly added in the comments, 0x80 << 24 technically invokes undefined behavior in C, because the result is not representable in the type of the promoted left operand of <<. If you are using gcc, the current compiler version documents that it does not treat this operation as undefined.
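One common way to sidestep both the undefined behavior and the sign extension is to convert the operand to the wide unsigned type before shifting. This is a minimal sketch, not part of the original program; place_byte is a hypothetical helper name.

```c
#include <stdint.h>

/* Hypothetical helper: moves byte b to bit position `shift` of a 64-bit value.
   The cast makes the shift happen in uint64_t, so no promotion to int occurs.
   The caller must keep shift below 64. */
uint64_t place_byte(unsigned char b, unsigned int shift)
{
    return (uint64_t)b << shift;
}
```

With this, place_byte(0x80, 24) yields 0x0000000080000000 regardless of the width of int, since the arithmetic is done entirely in an unsigned 64-bit type.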

ouah answered Oct 05 '22 19:10