Why is the sign different after subtracting unsigned and signed?

unsigned int t = 10;
int d = 16;
float c = t - d;
int e = t - d;

Why is the value of c positive but e negative?

Eugene Kolombet asked Oct 11 '18 07:10


People also ask

What happens when you subtract an unsigned int from a signed int?

An int is signed by default, meaning it can represent both positive and negative values. An unsigned is an integer that can never be negative. If you take an unsigned 0 and subtract 1 from it, the result wraps around, leaving a very large number (2^32-1 with the typical 32-bit integer size).

Can you do unsigned subtraction?

The subtraction of two n-digit unsigned numbers M - N (N ≠ 0) in base r can be done as follows: 1. Add the minuend M to the r's complement of the subtrahend N. This performs M + (r^n - N) = M - N + r^n.

How does unsigned subtraction work?

By casting the two numbers to unsigned, we ensure that if b > a, the difference between the two is going to be a large unsigned number with its highest bit set. When translating this large unsigned number into its signed counterpart we will always get a negative number due to the set MSB.

Can unsigned subtraction overflow?

A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.


2 Answers

Let's start by analysing the result of t - d.

t is an unsigned int while d is an int, so to do arithmetic on them, the value of d is converted to an unsigned int (C++ rules say unsigned gets preference here). So we get 10u - 16u, which (assuming 32-bit int) wraps around to 4294967290u.
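The type and value of t - d can be checked at compile time and at run time. The following is only a sketch: mixed_difference and check_mixed_difference are hypothetical helper names, not part of the question, and the code assumes C++14 or later for the deduced return type.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical helper: same operand types as in the question.
constexpr auto mixed_difference(unsigned int t, int d) {
    return t - d;  // d is converted to unsigned int before the subtraction
}

// The expression has type unsigned int, whatever it is assigned to later.
static_assert(
    std::is_same<decltype(mixed_difference(10u, 16)), unsigned int>::value,
    "unsigned int - int yields unsigned int");

// 10u - 16u wraps to 2^N - 6, which is UINT_MAX - 5
// (4294967290 when unsigned int is 32 bits wide).
inline void check_mixed_difference() {
    assert(mixed_difference(10u, 16) ==
           std::numeric_limits<unsigned int>::max() - 5u);
}
```

Calling check_mixed_difference() from main should pass on any conforming implementation, because the check is written in terms of UINT_MAX rather than a fixed width.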

This value is then converted to float in the first declaration, and to int in the second one.

Assuming the typical implementation of float (32-bit single-precision IEEE), its highest representable value is roughly 3.4e38, so 4294967290u is well within that range. The value is not exactly representable in a float, so it is rounded to a nearby representable value, but the conversion to float won't overflow.
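A small sketch of that conversion, assuming a 32-bit unsigned int and IEEE 754 single-precision float (the precise check is skipped if either assumption does not hold). The helper names wrapped_as_float and check_wrapped_as_float are made up for illustration.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: mirrors `float c = t - d;` from the question.
inline float wrapped_as_float() {
    unsigned int wrapped = 10u - 16u;    // 4294967290 with 32-bit unsigned int
    return static_cast<float>(wrapped);  // in range, but rounded
}

inline void check_wrapped_as_float() {
    const float c = wrapped_as_float();
    assert(c > 0.0f);                               // positive, no overflow
    assert(c < std::numeric_limits<float>::max());

    // Assumption: IEEE 754 binary32 and 32-bit unsigned int. Floats just
    // below 2^32 are 256 apart, so the result is one of the two neighbours.
    if (std::numeric_limits<float>::is_iec559 &&
        std::numeric_limits<unsigned int>::digits == 32) {
        assert(c == 4294967040.0f || c == 4294967296.0f);
    }
}
```

Which of the two neighbours is chosen depends on the rounding behaviour of the implementation; round-to-nearest picks 4294967296.0f.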

For int, the situation's different. 4294967290u is too big to fit into an int, so wrap-around happens and we arrive back at the value -6. Note that before C++20 such wrap-around is not guaranteed by the standard: the resulting value in this case is implementation-defined(1), which means it's up to the compiler what the result value is, but it must be documented. Since C++20, the conversion is defined to wrap modulo 2^N, where N is the width of the destination type, so with a 32-bit int the result -6 is guaranteed.


(1) C++17 (N4659), [conv.integral] 7.8/3:

If the destination type is signed, the value is unchanged if it can be represented in the destination type; otherwise, the value is implementation-defined.
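If the wrapped result is needed as a signed value without relying on the out-of-range conversion, it can be computed with in-range arithmetic only. This is a sketch, not a standard facility: to_int_modular is a hypothetical helper, and it assumes the usual layout where UINT_MAX equals 2 * INT_MAX + 1.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: returns the int congruent to u modulo 2^N.
// Assumption: UINT_MAX == 2 * INT_MAX + 1 (no padding bits).
inline int to_int_modular(unsigned int u) {
    const unsigned int int_max =
        static_cast<unsigned int>(std::numeric_limits<int>::max());
    if (u <= int_max) {
        return static_cast<int>(u);  // fits, so the value is unchanged
    }
    // Upper half: ~u equals 2^N - 1 - u, which fits in int.
    // u - 2^N == -(~u) - 1, computed without any out-of-range step.
    return -static_cast<int>(~u) - 1;
}

inline void check_to_int_modular() {
    assert(to_int_modular(10u - 16u) == -6);
    assert(to_int_modular(5u) == 5);
}
```

On mainstream compilers a plain static_cast<int>(10u - 16u) gives the same -6; the helper only makes the arithmetic explicit.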

Angew is no longer proud of SO answered Nov 05 '22 14:11


First, you have to understand "usual arithmetic conversions" (that link is for C, but the rules are the same in C++). In C++, if you do arithmetic with mixed types (you should avoid that when possible, by the way), there's a set of rules that decides which type the calculation is done in.

In your case, you are subtracting a signed int from an unsigned int. The usual arithmetic conversions say that the int operand is converted to unsigned int and the actual calculation is done using unsigned int.

So your calculation is 10 - 16 in unsigned int arithmetic. Unsigned arithmetic is modulo arithmetic, meaning that it wraps around. So, assuming your typical 32-bit int, the result of this calculation is 2^32 - 6.
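The modulo behaviour can be reproduced by doing the subtraction in a wider signed type and reducing the result by hand. The sketch below uses a hypothetical helper, sub_mod_2_32, and fixes the width at 32 bits via std::uint32_t so that the assumption is explicit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes (a - b) mod 2^32 with 64-bit arithmetic,
// so the intermediate difference is the true mathematical value.
inline std::uint32_t sub_mod_2_32(std::uint32_t a, std::uint32_t b) {
    const std::int64_t modulus = static_cast<std::int64_t>(1) << 32;  // 2^32
    const std::int64_t diff =
        static_cast<std::int64_t>(a) - static_cast<std::int64_t>(b);  // may be negative
    // % can return a negative remainder in C++, so shift into [0, 2^32).
    const std::int64_t wrapped = ((diff % modulus) + modulus) % modulus;
    return static_cast<std::uint32_t>(wrapped);
}

inline void check_sub_mod_2_32() {
    const std::uint32_t a = 10, b = 16;
    assert(sub_mod_2_32(a, b) == 4294967290u);                     // 2^32 - 6
    assert(sub_mod_2_32(a, b) == static_cast<std::uint32_t>(a - b));
}
```

The second assertion compares the hand-reduced value with what the built-in unsigned subtraction produces.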

This is the same for both lines. Note that the subtraction is completely independent from the assignment; the type on the left side has absolutely no influence on how the calculation happens. It is a common beginner mistake to think that the type on the left side somehow influences the calculation; but float f = 5 / 6 is zero, because the division still uses integer arithmetic.
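The float f = 5 / 6 case can be tried directly. In this sketch the two function names are invented for illustration; the only difference between them is the type of one operand.

```cpp
#include <cassert>

// Both operands are int: integer division yields 0, which is then
// converted to 0.0f. The float return type plays no part in the division.
inline float int_division_then_convert() {
    return 5 / 6;
}

// One operand is float: the other is converted to float first, and the
// division is done in floating point (about 0.8333).
inline float float_division() {
    return 5.0f / 6;
}

inline void check_divisions() {
    assert(int_division_then_convert() == 0.0f);
    assert(float_division() > 0.83f && float_division() < 0.84f);
}
```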

The difference, then, is what happens during the assignment. The result of the subtraction is implicitly converted to float in one case, and int in the other.

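The conversion to float tries to find the closest value to the actual one that the type can represent. This will be some very large value; not quite the one the original subtraction yielded though. With 32-bit IEEE single precision and round-to-nearest, 2^32 - 6 becomes exactly 2^32 (4294967296), because the representable floats just below 2^32 are 256 apart.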

The conversion to int says that if the value fits into the range of int, the value will be unchanged. But 2^32 - 6 is far larger than the 2^31 - 1 that a 32-bit int can hold, so you get the other part of the conversion rule, which says that the resulting value is implementation-defined. This is a term in the standard that means "different compilers can do different things, but they have to document what they do".

For all practical purposes, all compilers that you'll likely encounter say that the bit pattern stays the same and is just interpreted as signed. Because of the way 2's complement arithmetic works (the way that almost all computers represent negative numbers), the result is the -6 you would expect from the calculation.
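The "same bit pattern, read as signed" interpretation can be made explicit by copying the bytes instead of converting the value. This is a sketch with a hypothetical helper name, reinterpret_bits_as_int, and the expected -6 assumes a two's complement int, which is what practically all current platforms use.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: copies the object representation of an unsigned int
// into an int, so the bits stay the same and only the interpretation changes.
inline int reinterpret_bits_as_int(unsigned int u) {
    static_assert(sizeof(int) == sizeof(unsigned int),
                  "int and unsigned int have the same size");
    int result;
    std::memcpy(&result, &u, sizeof result);
    return result;
}

inline void check_reinterpret_bits() {
    // Assumption: two's complement representation of int.
    assert(reinterpret_bits_as_int(10u - 16u) == -6);
    assert(reinterpret_bits_as_int(7u) == 7);  // small values are unaffected
}
```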

But all this is a very long way of repeating the first point, which is "don't do mixed type arithmetic". Cast the types first, explicitly, to types that you know will do the right thing.
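One way to follow that advice for the code in the question is to convert t to a signed type before subtracting. A minimal sketch, assuming the value of t fits in an int (true for 10); signed_difference is a hypothetical helper name.

```cpp
#include <cassert>

// Hypothetical helper: does the subtraction in int.
// Assumption: t <= INT_MAX, so the cast does not change its value.
inline int signed_difference(unsigned int t, int d) {
    return static_cast<int>(t) - d;
}

inline void check_signed_difference() {
    const float c = static_cast<float>(signed_difference(10u, 16));
    const int e = signed_difference(10u, 16);
    assert(c == -6.0f);  // now negative as well
    assert(e == -6);
}
```

If t may exceed INT_MAX, a wider signed type such as long long would be the safer choice for the calculation.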

Sebastian Redl answered Nov 05 '22 14:11