Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does C++20 well-define left shift for signed integers that "overflow"?

In the current C++ Standard Draft, the left shift operator is defined as follows [expr.shift]:

The value of E1 << E2 is the unique value congruent to E1×2^E2 modulo 2^N, where N is the width of the type of the result.

Consider int E1 = 2^31-1 = 2'147'483'647, E2 = 1, and int having 32 bits. Then there is an infinite number of numbers congruent to E1×2^E2 = 4'294'967'294 modulo 2^N = 2^32, namely, all the numbers 4'294'967'294 + k×2^32 where k is an arbitrary integer. Examples are 4'294'967'294 (k=0) or -2 (k=-1).

I don't understand what the Standard means by the unique value out of these numbers. Does it mean the unique value that can be represented by the resulting data type? Then, I suppose the result is defined as -2. Is this interpretation correct?

Until C++20, the definition was different and this case would cause undefined behavior. I suppose the change is related to the mandatory 2's-complement representation of negative signed integers.

In fact, there is now no more requirement for E1 to be non-negative. It therefore seems that -1 << 1 is defined as -2. Is that right as well?

like image 776
Daniel Langr Avatar asked Apr 02 '19 07:04

Daniel Langr


People also ask

What happens when a signed integer overflows?

In contrast, the C standard says that signed integer overflow leads to undefined behavior where a program can do anything, including dumping core or overrunning a buffer. The misbehavior can even precede the overflow. Such an overflow can occur during addition, subtraction, multiplication, division, and left shift.

How do you overcome signed integer overflow?

Use 64-bits integers. One very good way to prevent integer overflows is to use int64_t to implement integers. In most case, 64-bits ints will not commit overflow, unlike their 32-bits counterparts. There is actually very few downsides in using int64_t instead of int32_t .

Which left shift operation works on signed numbers?

For signed numbers, the sign bit is used to fill the vacated bit positions. In other words, if the number is positive, 0 is used, and if the number is negative, 1 is used.

What happens when int overflows C++?

An integer overflow can cause the value to wrap and become negative, which violates the program's assumption and may lead to unexpected behavior (for example, 8-bit integer addition of 127 + 1 results in −128, a two's complement of 128).


1 Answers

Does it mean the unique value that can be represented by the resulting data type

Yes. The set of numbers congruent to E1×2^E2 modulo 2^N is infinite, but there is only one value in any interval of size 2^N, therefore there is only one value representable in an integer type of width N.

If we look in the "p0907R1 Signed Integers are Two’s Complement" proposal we find a similar phrase with "unique representation" which makes this more clear:

Conversion from signed to unsigned is always well-defined: the result is the unique value of the destination type that is congruent to the source integer modulo 2N.

Then, I suppose the result is defined as -2. Is this interpretation correct?

Yes

On x64 the equivalent asm instruction is shlx (logical shift left)

I suppose the change is related to the mandatory 2-complement representation of negative signed integers.

Correct. As was the case with unsigned types, now also signed types they mathematically represent equivalence classes (well, it's not clear to me how much this is true as it looks like they want to still keep some UB cases on overflow).

like image 148
bolov Avatar answered Sep 28 '22 05:09

bolov