Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does adding MIN_VALUE compare integers as unsigned?

In Java the int type is signed, but it has a method that compares two ints as if they were unsigned:

public static int compareUnsigned(int x, int y) {
    return compare(x + MIN_VALUE, y + MIN_VALUE);
}

It adds Integer.MIN_VALUE to each argument, then calls the normal signed comparison method, which is:

public static int compare(int x, int y) {
    return (x < y) ? -1 : ((x == y) ? 0 : 1);
}

How does adding MIN_VALUE to each argument magically make the comparison unsigned?

like image 316
Boann Avatar asked Dec 17 '14 14:12

Boann


People also ask

What does integer Min_value do?

The Integer. MIN_VALUE is a constant in the Integer class that represents the minimum or least integer value that can be represented in 32 bits, which is -2147483648, -231. This is the lowest value that any integer variable in Java can hold.

Can int and unsigned compare?

In your case the "common" type is unsigned int . This means that int operand (your b ) will get converted to unsigned int before the comparison, as well as for the purpose of performing subtraction. When -1 is converted to unsigned int the result is the maximal possible unsigned int value (same as UINT_MAX ).

What is the result of addition 1 to max value of integer?

Whenever you add 1 to the largest java Integer, which has a bit sign of 0(Zero), then its bit sign becomes 1 and the number becomes negative. If an integer addition overflows, then the result is the low-order bits of the mathematical sum as represented in some sufficiently large two's-complement format.

What does integer Max_value return?

MAX_VALUE is a number in the Java Integer сlass of java. lang package. It is the maximum possible Integer number that can be represented in 32 bits. Its exact value is 2147483647 i.e. 231-1.


1 Answers

This technique works for any size of integer, but I'll use an 8-bit byte-sized integer to explain, because the numbers are smaller and easier to work with.

An 8-bit type has 28 = 256 possible values. At a low level these are just bits, and signed vs unsigned is a matter of how we interpret those bits. When interpreted as an unsigned integer, they have a range of 0 to 255. When interpreted as a signed two's complement integer, they have a range of −128 to +127.

The number line for the types looks like this:

Notice that the positive numbers from 0 to 127 can be represented by both signed and unsigned types, and they are represented by exactly the same bit patterns (00000000 to 01111111).

The bit patterns which represent the large positive numbers from 128 to 255 in the unsigned interpretation are reused for the numbers −128 to −1 in the signed interpretation. It is as if someone took the unsigned number line, chopped off the upper half of the range, and glued it on at the lower end of the line.

Now, let's look at what happens when we compare a pair of integers.

Case 1: Both values are in the signed positive range

With both values in the range 0 to 127, they have the same numeric value whether the bits are interpreted as signed or unsigned.

We unconditionally add MIN_VALUE to each value. MIN_VALUE for our signed byte type is −128, so adding that means we are actually subtracting 128.

An example: our comparison function, using signed types, is given x = 20 and y = 60. Adding MIN_VALUE, we get x' = 20 − 128 = −108 and y' = 60 − 128 = −68:

Adding MIN_VALUE to a positive value will always map it to a negative value. At the extreme ends of the range, 0 would become −128, and 127 would become −1. The operation will not change the order of x and y relative to each other, so the result of any comparison between x' and y' will be the same as if we had not added MIN_VALUE, which is correct.

Case 2: Both values are in the signed negative range

In this case, both values are in the range −128 to −1 if interpreted as signed. If interpreted as unsigned they are in the range 128 to 255 (which is 256 greater).

When we unconditionally add MIN_VALUE to each of our signed negative values, it always causes overflow and wrap-around, to signed positive values. Numerically, this wrap-around is the same as adding 256. If we are given x = −35 and y = −80 to compare, we get x' = −35 − 128 + 256 = 93 and y' = −80 − 128 + 256 = 48:

We can also visualize this with the unsigned interpretations of −35 and −80, which are 221 and 176. When subtracting 128, we get exactly the same results for x' and y'. One of the advantages of two's complement is that addition and subtraction give the same results regardless of whether you treat the data as signed or unsigned, so CPUs can use the same circuitry.

As in case 1, the operation does not change the results of any comparisons between the two numbers. Our x was greater than y (being of lesser negative magnitude), and x' is also greater than y'. So comparisons between these inputs will be correct.

Case 3: One value is in the signed positive range, the other negative

This is the interesting case. Notice that when we add MIN_VALUE, it always changes a number's sign. Positive values are mapped to negative values and negative values are mapped to positive values.

Let's compare x = −35 and y = 60. Since we want these to be compared as unsigned, we really intend x to be interpreted as −35 + 256 = 221. So x needs to be interpreted as greater than y, even though our signed data type will not normally do this.

Because the numbers have opposite signs, the MIN_VALUE operation which changes the signs will reverse the numbers' order on the number line. x' = −35 − 128 + 256 = 93, and y' = 60 − 128 = −68. So we get x' is greater than y', which is what we wanted:

Generalization

Since we've handled all combinations of positive and negative, we know the technique works for all possible values.

In the case of 32-bit ints, the ranges are bigger (signed range is −2,147,483,648 (MIN_VALUE) to +2,147,483,647, and unsigned range is 0 to 4,294,967,295) but it works just the same. In fact it works for every size of integer, and in every programming language, provided that:

  1. The signed integers use two's complement representation (which is nearly universal).

  2. Addition wraps around on overflow (rather than raising an error or promoting to a bigger number type or being undefined).

You can also do the reverse: if you have only an unsigned integer type, and you want to do a two's complement signed comparison, add (the unsigned interpretation of) the signed minimum value to each number.

Because the technique is just two unconditional addition operations, it is extremely efficient even if not treated specially by a compiler or VM.

like image 74
Boann Avatar answered Oct 11 '22 12:10

Boann