Why was 1 << 31 changed to be implementation-defined in C++14?

In all versions of C and C++ prior to 2014, writing

1 << (CHAR_BIT * sizeof(int) - 1) 

caused undefined behaviour, because left-shifting is defined as being equivalent to successive multiplication by 2, and this shift causes signed integer overflow:

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. [...] If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

However, in C++14 the text changed for << but not for multiplication:

The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. [...] Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the corresponding unsigned type of the result type, then that value, converted to the result type, is the resulting value; otherwise, the behavior is undefined.

The behaviour is now the same as for out-of-range assignment to signed type, i.e. as covered by [conv.integral]/3:

If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.

This means it's still non-portable to write 1 << 31 (on a system with 32-bit int). So why was this change made in C++14?
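For comparison, here is a minimal sketch of two spellings that do not rely on shifting into the sign bit of a signed operand. The names int_bits, sign_mask and most_negative are made up for illustration, and the final check assumes a two's-complement int with no padding bits.

```cpp
#include <climits>
#include <limits>

// Number of bits in int (assumption: no padding bits).
constexpr unsigned int_bits = CHAR_BIT * sizeof(int);

// Shifting an unsigned 1 into the top bit is well-defined:
// the shift count is less than the width and unsigned arithmetic cannot overflow.
constexpr unsigned sign_mask = 1u << (int_bits - 1);

// The most negative int, obtained without any shift.
constexpr int most_negative = std::numeric_limits<int>::min();

// On a two's-complement int, the sign-bit mask and INT_MIN share a bit pattern.
static_assert(sign_mask == static_cast<unsigned>(most_negative),
              "assumes two's-complement int without padding bits");
```

Whether either of these is a suitable replacement depends on whether the surrounding code wants a bit mask (unsigned) or an arithmetic value (signed).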

M.M asked Oct 12 '14 23:10


1 Answer

The relevant issue is CWG 1457, where the justification is that the change allows 1 << 31 to be used in constant expressions:

The current wording of 5.8 [expr.shift] paragraph 2 makes it undefined behavior to create the most-negative integer of a given type by left-shifting a (signed) 1 into the sign bit, even though this is not uncommonly done and works correctly on the majority of (twos-complement) architectures:

...if E1 has a signed type and non-negative value, and E1 * 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

As a result, this technique cannot be used in a constant expression, which will break a significant amount of code.

Constant expressions can't contain undefined behavior, which means that using an expression containing UB in a context requiring a constant expression makes the program ill-formed. libstdc++'s numeric_limits::min, for example, once failed to compile in clang due to this.
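As a rough illustration of the pattern at stake, the sketch below uses the shift where a constant expression is required. The function name min_via_shift is hypothetical, not taken from libstdc++. It assumes a compiler that applies the CWG 1457 wording (C++14 or later) and a two's-complement int without padding bits. Under the older wording the shift is undefined, so a conforming compiler could reject the static_assert.

```cpp
#include <climits>

// Hypothetical helper: builds the most negative int by shifting 1 into the sign bit.
// With the CWG 1457 wording, 1 * 2^31 is representable in unsigned int, so the
// shift is no longer undefined; the value is converted to int afterwards.
constexpr int min_via_shift() {
    return 1 << (CHAR_BIT * sizeof(int) - 1);
}

// static_assert requires a constant expression, so this line only compiles
// if evaluating the shift involves no undefined behavior.
static_assert(min_via_shift() == INT_MIN,
              "assumes two's-complement int without padding bits");
```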

T.C. answered Sep 22 '22 16:09