In the current C++ Standard Draft, the left shift operator is defined as follows [expr.shift]:
The value of
E1 << E2
is the unique value congruent toE1×2^E2
modulo2^N
, whereN
is the width of the type of the result.
Consider int E1 = 2^31-1 = 2'147'483'647
, E2 = 1
, and int
having 32 bits. Then there is an infinite number of numbers congruent to E1×2^E2 = 4'294'967'294
modulo 2^N = 2^32
, namely, all the numbers 4'294'967'294 + k×2^32
where k
is an arbitrary integer. Examples are 4'294'967'294
(k=0
) or -2
(k=-1
).
I don't understand what the Standard means by the unique value out of these numbers. Does it mean the unique value that can be represented by the resulting data type? Then, I suppose the result is defined as -2
. Is this interpretation correct?
Until C++20, the definition was different and this case would cause undefined behavior. I suppose the change is related to the mandatory 2's-complement representation of negative signed integers.
In fact, there is now no more requirement for E1
to be non-negative. It therefore seems that -1 << 1
is defined as -2
. Is that right as well?
In contrast, the C standard says that signed integer overflow leads to undefined behavior where a program can do anything, including dumping core or overrunning a buffer. The misbehavior can even precede the overflow. Such an overflow can occur during addition, subtraction, multiplication, division, and left shift.
Use 64-bits integers. One very good way to prevent integer overflows is to use int64_t to implement integers. In most case, 64-bits ints will not commit overflow, unlike their 32-bits counterparts. There is actually very few downsides in using int64_t instead of int32_t .
For signed numbers, the sign bit is used to fill the vacated bit positions. In other words, if the number is positive, 0 is used, and if the number is negative, 1 is used.
An integer overflow can cause the value to wrap and become negative, which violates the program's assumption and may lead to unexpected behavior (for example, 8-bit integer addition of 127 + 1 results in −128, a two's complement of 128).
Does it mean the unique value that can be represented by the resulting data type
Yes. The set of numbers congruent to E1×2^E2
modulo 2^N
is infinite, but there is only one value in any interval of size 2^N
, therefore there is only one value representable in an integer type of width N
.
If we look in the "p0907R1 Signed Integers are Two’s Complement" proposal we find a similar phrase with "unique representation" which makes this more clear:
Conversion from signed to unsigned is always well-defined: the result is the unique value of the destination type that is congruent to the source integer modulo 2N.
Then, I suppose the result is defined as
-2
. Is this interpretation correct?
Yes
On x64 the equivalent asm instruction is shlx
(logical shift left)
I suppose the change is related to the mandatory 2-complement representation of negative signed integers.
Correct. As was the case with unsigned types, now also signed types they mathematically represent equivalence classes (well, it's not clear to me how much this is true as it looks like they want to still keep some UB cases on overflow).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With