Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does it make a difference if left and right shift are used together in one expression or not?

I have the following code:

unsigned char x = 255;
printf("%x\n", x); // ff

unsigned char tmp = x << 7;
unsigned char y = tmp >> 7;
printf("%x\n", y); // 1

unsigned char z = (x << 7) >> 7;
printf("%x\n", z); // ff

I would have expected y and z to be the same. But they differ depending on whether a intermediary variable is used. It would be interesting to know why this is the case.

like image 763
odzhychko Avatar asked May 22 '20 15:05

odzhychko


People also ask

What is the difference between right and left shift?

The left and right Shift key on a computer keyboard perform the same function. When pressed and held down, it changes the case of the letter to uppercase, or use the alternate character on any other key. For example, pressing and holding down the Shift key while pressing the letter "a" makes a capital "A".

What is the use of Left Shift and Right Shift in C?

Left Shift and Right Shift Operators in C/C++ Takes two numbers, left shifts the bits of the first operand, the second operand decides the number of places to shift. Or in other words left shifting an integer “x” with an integer “y” denoted as '(x<<y)' is equivalent to multiplying x with 2^y (2 raised to power y).

Which operation is performed while shifting left and right?

The bitwise shift operators are the right-shift operator ( >> ), which moves the bits of an integer or enumeration type expression to the right, and the left-shift operator ( << ), which moves the bits to the left.

What is left shift used for?

The left shift operator ( << ) shifts the first operand the specified number of bits, modulo 32, to the left. Excess bits shifted off to the left are discarded. Zero bits are shifted in from the right.


2 Answers

This little test is actually more subtle than it looks as the behavior is implementation defined:

  • unsigned char x = 255; no ambiguity here, x is an unsigned char with value 255, type unsigned char is guaranteed to have enough range to store 255.

  • printf("%x\n", x); This produces ff on standard output but it would be cleaner to write printf("%hhx\n", x); as printf expects an unsigned int for conversion %x, which x is not. Passing x might actually pass an int or an unsigned int argument.

  • unsigned char tmp = x << 7; To evaluate the expression x << 7, x being an unsigned char first undergoes the integer promotions defined in the C Standard 6.3.3.1: If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.

    So if the number of value bits in unsigned char is smaller or equal to that of int (the most common case currently being 8 vs 31), x is first promoted to an int with the same value, which is then shifted left by 7 positions. The result, 0x7f80, is guaranteed to fit in the int type, so the behavior is well defined and converting this value to type unsigned char will effectively truncate the high order bits of the value. If type unsigned char has 8 bits, the value will be 128 (0x80), but if type unsigned char has more bits, the value in tmp can be 0x180, 0x380, 0x780, 0xf80, 0x1f80, 0x3f80 or even 0x7f80.

    If type unsigned char is larger than int, which can occur on rare systems where sizeof(int) == 1, x is promoted to unsigned int and the left shift is performed on this type. The value is 0x7f80U, which is guaranteed to fit in type unsigned int and storing that to tmp does not actually lose any information since type unsigned char has the same size as unsigned int. So tmp would have the value 0x7f80 in this case.

  • unsigned char y = tmp >> 7; The evaluation proceeds the same as above, tmp is promoted to int or unsigned int depending on the system, which preserves its value, and this value is shifted right by 7 positions, which is fully defined because 7 is less than the width of the type (int or unsigned int) and the value is positive. Depending on the number of bits of type unsigned char, the value stored in y can be 1, 3, 7, 15, 31, 63, 127 or 255, the most common architecture will have y == 1.

  • printf("%x\n", y); again, it would be better t write printf("%hhx\n", y); and the output may be 1 (most common case) or 3, 7, f, 1f, 3f, 7f or ff depending on the number of value bits in type unsigned char.

  • unsigned char z = (x << 7) >> 7; The integer promotion is performed on x as described above, the value (255) is then shifted left 7 bits as an int or an unsigned int, always producing 0x7f80 and then right shifted by 7 positions, with a final value of 0xff. This behavior is fully defined.

  • printf("%x\n", z); Once more, the format string should be printf("%hhx\n", z); and the output would always be ff.

Systems where bytes have more than 8 bits are becoming rare these days, but some embedded processors, such as specialized DSPs still do that. It would take a perverse system to fail when passed an unsigned char for a %x conversion specifier, but it is cleaner to either use %hhx or more portably write printf("%x\n", (unsigned)z);

Shifting by 8 instead of 7 in this example would be even more contrived. It would have undefined behavior on systems with 16-bit int and 8-bit char.

like image 173
chqrlie Avatar answered Oct 23 '22 10:10

chqrlie


The 'intermediate' values in your last case are (full) integers, so the bits that are shifted 'out of range' of the original unsigned char type are retained, and thus they are still set when the result is converted back to a single byte.

From this C11 Draft Standard:

6.5.7 Bitwise shift operators
...
3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand ...

However, in your first case, unsigned char tmp = x << 7;, the tmp loses the six 'high' bits when the resultant 'full' integer is converted (i.e. truncated) back to a single byte, giving a value of 0x80; when this is then right-shifted in unsigned char y = tmp >> 7;, the result is (as expected) 0x01.

like image 29
Adrian Mole Avatar answered Oct 23 '22 11:10

Adrian Mole