Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is left-shifting (<<) a negative integer undefined behavior in C++11?

Is left-shifting a negative int Undefined Behavior in C++11?

The relevant Standard passages here are from 5.8:

2/The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 × 2E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1×2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

The part that confuses me is:

Otherwise, if E1 has a signed type and non-negative value, and E1×2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

Should this be interpreted to mean that left-shifting any negative number is UB? Or does it only mean if you LS a negative and the result doesn't fit in the result type, then it's UB?

Moreover, the preceding clause says:

1/The shift operators << and >> group left-to-right. shift-expression: additive-expression shift-expression << additive-expression shift-expression >> additive-expression

The operands shall be of integral or unscoped enumeration type and integral promotions are performed.

The type of the result is that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.

This makes it explicit that using a negative number for one of the operands is UB. If it were UB to use a negative for the other operand, I would expect that to be made clear here as well.

So, bottom line, is:

-1 << 1

Undefined Behavior?


@Angew provided a psudocode interpretation of the Standardese which succinctly expresses one possible (likely) valid interpretation. Others have questioned whether this question is really about the applicability of the language "behavior is undefined" versus our (StackOverflow's) use of the phrase "Undefined Behavior." This edit is to provide some more clarification on what I'm trying to ask.

@Angew's interpretation of the Standardese is:

if (typeof(E1) == unsigned integral)
  value = E1 * 2^E2 % blah blah;
else if (typeof(E1) == signed integral && E1 >= 0 && representable(E1 * 2^E2))
  value = E1 * 2^E2;
else
  value = undefined;

What this question really boils down to is this -- is the correct interpretation actually:

value = E1 left-shift-by (E2)

switch (typeof(E1))
{
case unsigned integral :
  value = E1 * 2^E2 % blah blah;
  break;

case signed integral :
  if (E1 >= 0)
  { 
    if (representable(E1 * 2^E2))
    {
      value = E1 * 2^E2;
    }
    else
    {
      value = undefined;
    }
  }
  break;
}

?

Sidenote, in looking at this in terms of psudocode makes it fairly clear in my mind that @Agnew's interpretation is the correct one.

like image 609
John Dibling Avatar asked Oct 25 '13 15:10

John Dibling


People also ask

Can you left shift by a negative number in C?

The left shift and right shift operators should not be used for negative numbers. The result of is undefined behaviour if any of the operands is a negative number.

What happens when you bit shift a negative number?

Right Shifts For signed numbers, the sign bit is used to fill the vacated bit positions. In other words, if the number is positive, 0 is used, and if the number is negative, 1 is used. The result of a right-shift of a signed negative number is implementation-dependent.


3 Answers

Yes, I would say it's undefined. If we translate the standardese to pseudo-code:

if (typeof(E1) == unsigned integral)
  value = E1 * 2^E2 % blah blah;
else if (typeof(E1) == signed integral && E1 >= 0 && representable(E1 * 2^E2))
  value = E1 * 2^E2;
else
  value = undefined;

I'd say the reason why they're explicit about the right-hand operand and not about the left-hand one is that the paragrpah you quote (the one with the right-hand operand case) applies to both left and right shifts.

For the left-hand operand, the ruling differs. Left-shifting a negative is undefined, right-shifting it is implementation-defined.

like image 157
Angew is no longer proud of SO Avatar answered Oct 17 '22 15:10

Angew is no longer proud of SO


Should this be interpreted to mean that left-shifting any negative number is UB?

Yes, the behavior is undefined when given any negative number. The behavior is only defined when both of the following are true:

  • the number is non-negative
  • E1 × 2E2 is representable in the result type

That's literally what "if E1 has a signed type and non-negative value, and E1×2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined," is saying:

if X and Y
  then Z
else U
like image 44
bames53 Avatar answered Oct 17 '22 14:10

bames53


Answer as per the Question:

The question really is:

Can we equate the term "behavior is undefined" equates exactly to the term "Undefined Behavior".

As it is currently worded it means "Undefined Behavior."

Personal comment about the situation

But I am not convinced that is the authors intention.
If it is the authors intention, then we should probably have a note explaining why. But I am more inclined to believe the author meant that the result of that operation is undefined because the representation of negative numbers is not explicitly defined by the standard. If the representation of negative numbers is not explicitly defined for negatives, then moving the bits around would lead to an undefined value.

Either way, the wording (or explanation) needs to be tightened/expanded to make it less ambiguous.

like image 36
Martin York Avatar answered Oct 17 '22 13:10

Martin York