How do promotion rules work when the signedness on either side of a binary operator differs? [duplicate]

Consider the following programs:

    // http://ideone.com/4I0dT
    #include <limits>
    #include <iostream>

    int main() {
        int max = std::numeric_limits<int>::max();
        unsigned int one = 1;
        unsigned int result = max + one;
        std::cout << result;
    }

and

    // http://ideone.com/UBuFZ
    #include <limits>
    #include <iostream>

    int main() {
        unsigned int us = 42;
        int neg = -43;
        int result = us + neg;
        std::cout << result;
    }

How does the + operator "know" which is the correct type to return? The general rule is to convert all of the arguments to the widest type, but here there's no clear "winner" between int and unsigned int. In the first case, unsigned int must be the type chosen for the result of operator+, because I get a result of 2147483648. In the second case, it must be choosing int, because I get a result of -1. Yet I don't see how this is decidable in the general case. Is this undefined behavior I'm seeing, or something else?

Billy ONeal asked Jul 21 '11 01:07


2 Answers

This is outlined explicitly in §5/9 of the C++ standard (C++03 wording):

Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:

  • If either operand is of type long double, the other shall be converted to long double.
  • Otherwise, if either operand is double, the other shall be converted to double.
  • Otherwise, if either operand is float, the other shall be converted to float.
  • Otherwise, the integral promotions shall be performed on both operands.
  • Then, if either operand is unsigned long the other shall be converted to unsigned long.
  • Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int shall be converted to a long int; otherwise both operands shall be converted to unsigned long int.
  • Otherwise, if either operand is long, the other shall be converted to long.
  • Otherwise, if either operand is unsigned, the other shall be converted to unsigned.

[Note: otherwise, the only remaining case is that both operands are int]

In both of your scenarios, the result of operator+ is unsigned. Consequently, the second scenario is effectively:

    int result = static_cast<int>(us + static_cast<unsigned>(neg));

Because in this case the value of us + neg is not representable by int, the value of result is implementation-defined – §4.7/3:

If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.

ildjarn answered Oct 19 '22 22:10


Before C was standardized, there were differences between compilers -- some followed "value preserving" rules, and others "sign preserving" rules. Sign preserving meant that if either operand was unsigned, the result was unsigned. This was simple, but at times gave rather surprising results (especially when a negative number was converted to an unsigned).

C standardized on the rather more complex "value preserving" rules. Under the value preserving rules, promotion can (and does) depend on the actual ranges of the types, so you can get different results on different compilers. For example, on most MS-DOS compilers, int is the same size as short, and long is different from either. On many current systems, int is the same size as long, and short is different from either. With value preserving rules, these can lead to the promoted type being different between the two.

The basic idea of value preserving rules is that it'll promote to a larger signed type if that can represent all the values of the smaller type. For example, a 16-bit unsigned short can be promoted to a 32-bit signed int, because every possible value of unsigned short can be represented as a signed int. The types will be promoted to an unsigned type if and only if that's necessary to preserve the values of the smaller type (e.g., if unsigned short and signed int are both 16 bits, then a signed int can't represent all possible values of unsigned short, so an unsigned short will be promoted to unsigned int).

When you assign the result as you have, the result will get converted to the destination type anyway, so most of this makes relatively little difference -- at least in most typical cases, where it'll just copy the bits into the result, and it's up to you to decide whether to interpret that as signed or unsigned.

When you don't assign the result, such as in a comparison, things can get pretty ugly though. For example:

    unsigned int a = 5;
    signed int b = -5;

    if (a > b)
        printf("Of course");
    else
        printf("What!");

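Under sign preserving rules, b would be converted to unsigned, and in the process become equal to UINT_MAX - 4, so the "What!" leg of the if would be taken. (For operands of exactly these two types the value preserving rules convert b to unsigned as well, because int cannot represent every unsigned int value; the two sets of rules differ for operands narrower than int, such as unsigned short.) With value preserving rules, you can manage to produce some strange results a bit like this as well, but 1) primarily on the DOS-like systems where int is the same size as short, and 2) it's generally harder to do it anyway.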

Jerry Coffin answered Oct 19 '22 23:10