
binary numbers: strange differences between + and |

Tags:

c

I'm trying to manipulate binary numbers in C. I found something strange with the minimal code below. Can anyone tell me what the difference between "+" and "|" is here? Thank you!

char next_byte1 = 0b11111111;
char next_byte2 = 0b11110101;
short a = (next_byte1 << 8) | next_byte2;
short b = (next_byte1 << 8) + next_byte2;
printf("a vs b is %d ~ %d.\n", a, b);

It printed: a vs b is -11 ~ -267, which are 0b11111111 11110101 and 0b11111110 11110101. I'm very confused by this result.

WSnow asked Dec 10 '22 00:12


2 Answers

The problem you are seeing occurs because next_byte2 is sign-extended to a full int before the operation is performed, so its extra high bits "corrupt" the high byte.

When doing bit manipulation it's better to use unsigned types (that is exactly what unsigned types are for). Plain char can be (and usually is) a signed type, so it is best avoided for this purpose.

6502 answered Dec 28 '22 18:12


  • You should never use char for binary/bitwise arithmetic, because it has implementation-defined signedness and might be negative. In general, use stdint.h over the default C types.
  • If char is signed, the initializers do not fit in it: 0xFF typically ends up as -1 and 0xF5 as -11 in two's complement (the result of the conversion is implementation-defined). This affects both next_byte1 and next_byte2.
  • Whenever you use a small integer type inside an expression, it is usually promoted to signed int. So your -1 (0xFF) gets changed to -1 (0xFFFFFFFF, assuming 32-bit int) before you left shift.
  • Left shifting a negative operand is undefined behavior, meaning that any kind of bugs may rain all over your program. This happens in this case, so no results are guaranteed.
  • Apparently in your case, the undefined behavior manifested itself as you ending up with a large negative number with the binary representation 0xFFFFFF00.
  • The difference between | and + is that + does real arithmetic with carries, so you end up adding two negative numbers: 0xFFFFFF00 + 0xFFFFFFF5 is -256 + -11 = -267. With |, the binary representations are simply OR:ed together, giving 0xFFFFFFF5, which is -11.

You can fix the program in the following way:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
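  uint8_t next_byte1 = 0xFF;  /* unsigned, so the value stays 255 */
  uint8_t next_byte2 = 0xF5;  /* unsigned, so the value stays 245 */
  /* Both operands promote to non-negative ints, so no sign extension occurs. */
  uint16_t a = (next_byte1 << 8) | next_byte2;
  uint16_t b = (next_byte1 << 8) + next_byte2;
  printf("a vs b is %d ~ %d.\n", a, b);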
}

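And now | and + work identically, as intended: on a typical platform with 32-bit int both give 65525 (0xFFF5), because the two operands no longer share any set bits.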

Lundin answered Dec 28 '22 19:12