
Bitwise '&' with signed vs unsigned operand

I faced an interesting scenario in which I got different results depending on the right operand type, and I can't really understand the reason for it.

Here is the minimal code:

#include <iostream>
#include <cstdint>

int main() {
    uint16_t check = 0x8123U;

    uint64_t new_check = (check & 0xFFFF) << 16;
    std::cout << std::hex << new_check << std::endl;

    new_check = (check & 0xFFFFU) << 16;
    std::cout << std::hex << new_check << std::endl;

    return 0;
}

I compiled this code with g++ (gcc version 4.5.2) on 64-bit Linux: g++ -std=c++0x -Wall example.cpp -o example

The output was:

ffffffff81230000

81230000

I can't really understand the reason for the output in the first case.

Why would any of the intermediate results be promoted to a signed 64-bit value (int64_t) at some point, resulting in the sign extension?

I would accept a result of '0' in both cases if a 16-bit value were shifted 16 bits left first and then promoted to a 64-bit value. I would also accept the second output if the compiler first promoted check to uint64_t and then performed the other operations.

But how come & with 0xFFFF (int32_t) vs. 0xFFFFU (uint32_t) would result in those two different outputs?

asked Aug 03 '16 by Alex Lop.


1 Answer

That's indeed an interesting corner case. It only occurs here because you use uint16_t for the unsigned type on an architecture that uses 32 bits for int.

Here is an extract from Clause 5, Expressions, of draft n4296 for C++14 (emphasis mine):

10 Many binary operators that expect operands of arithmetic or enumeration type cause conversions ... This pattern is called the usual arithmetic conversions, which are defined as follows:
...
(10.5.3) — Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.
(10.5.4) — Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.

You are in the 10.5.4 case:

  • uint16_t is only 16 bits while int is 32
  • int can represent all the values of uint16_t

So the uint16_t check = 0x8123U operand is converted to the signed int 0x8123, and the result of the bitwise & is still 0x8123.
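
A minimal sketch of those conversions, checkable at compile time (assuming a platform where int is 32 bits, as in the question):

#include <cstdint>
#include <type_traits>

int main() {
    uint16_t check = 0x8123U;
    // uint16_t has lower rank than int, so check is promoted to (signed) int;
    // with the signed literal 0xFFFF the whole expression stays int (case 10.5.4).
    static_assert(std::is_same<decltype(check & 0xFFFF), int>::value, "");
    // With the unsigned literal 0xFFFFU, the promoted int operand is instead
    // converted to unsigned int (case 10.5.3).
    static_assert(std::is_same<decltype(check & 0xFFFFU), unsigned int>::value, "");
    return 0;
}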

But the shift (a bitwise operation, so it happens at the representation level) produces the intermediate unsigned value 0x81230000, which, converted back to an int, gives a negative value (technically that conversion is implementation-defined, but this behavior is the common one):

5.8 Shift operators [expr.shift]
...
Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the corresponding unsigned type of the result type, then that value, converted to the result type, is the resulting value;...

and

4.7 Integral conversions [conv.integral]
...
3 If the destination type is signed, the value is unchanged if it can be represented in the destination type; otherwise, the value is implementation-defined.

(beware this was true undefined behaviour in C++11...)
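
That step can be sketched in isolation (hedged: the negative result below is what common two's-complement implementations produce; the standard only says the conversion is implementation-defined in C++14):

#include <cstdint>
#include <iostream>

int main() {
    uint16_t check = 0x8123U;
    int shifted = (check & 0xFFFF) << 16;  // int arithmetic; bit pattern 0x81230000
    std::cout << std::boolalpha << (shifted < 0) << std::endl;  // true on common platforms
    return 0;
}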

So you end up with a conversion of the signed int 0x81230000 (a negative value) to a uint64_t, which as expected gives 0xFFFFFFFF81230000, because

4.7 Integral conversions [conv.integral]
...
2 If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type).
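
The modular conversion can be seen directly by reusing the values from the question (the shift itself is implementation-defined, as discussed above):

#include <cstdint>
#include <iostream>

int main() {
    uint16_t check = 0x8123U;
    int shifted = (check & 0xFFFF) << 16;  // negative on common platforms
    uint64_t widened = shifted;            // least uint64_t congruent modulo 2^64
    std::cout << std::hex << widened << std::endl;  // prints ffffffff81230000
    return 0;
}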

TL/DR: There is no undefined behaviour here; what causes the result is the conversion of a signed 32-bit int to an unsigned 64-bit int. The only questionable part is the shift that overflows into the sign bit, but all common implementations agree on it, and it is implementation-defined in the C++14 standard.

Of course, if you force the second operand to be unsigned, everything stays unsigned and you evidently get the expected 0x81230000 result.
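
Two equivalent fixes, sketched below; the cast-first variant is my own addition, not from the question, but it follows the same conversion rules:

#include <cstdint>
#include <iostream>

int main() {
    uint16_t check = 0x8123U;
    // Force the mask to be unsigned, so & and << happen in unsigned int:
    uint64_t a = (check & 0xFFFFU) << 16;
    // Or widen first, so everything happens in uint64_t arithmetic:
    uint64_t b = (static_cast<uint64_t>(check) & 0xFFFF) << 16;
    std::cout << std::hex << a << ' ' << b << std::endl;  // 81230000 81230000
    return 0;
}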

[EDIT] As explained by MSalters, the result of the shift has only been implementation-defined since C++14; it was indeed undefined behaviour in C++11. The shift operator paragraph read:

...
Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

answered Sep 29 '22 by Serge Ballesta