
How does adding MIN_VALUE compare integers as unsigned?

Tags: language-agnostic, comparison, integer, unsigned, signed

In Java the int type is signed, but it has a method that compares two ints as if they were unsigned:

public static int compareUnsigned(int x, int y) {
    return compare(x + MIN_VALUE, y + MIN_VALUE);
}

It adds Integer.MIN_VALUE to each argument, then calls the normal signed comparison method, which is:

public static int compare(int x, int y) {
    return (x < y) ? -1 : ((x == y) ? 0 : 1);
}

How does adding MIN_VALUE to each argument magically make the comparison unsigned?


asked Dec 17 '14 14:12

Boann

1 Answer

This technique works for any size of integer, but I'll use an 8-bit byte-sized integer to explain, because the numbers are smaller and easier to work with.

An 8-bit type has 2⁸ = 256 possible values. At a low level these are just bits, and signed vs unsigned is a matter of how we interpret those bits. When interpreted as an unsigned integer, they have a range of 0 to 255. When interpreted as a signed two's complement integer, they have a range of −128 to +127.

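On a number line, the unsigned interpretation covers 0 to 255, while the signed interpretation covers −128 to +127, so the two ranges overlap only from 0 to 127.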

Notice that the numbers from 0 to 127 (treated here as the signed positive range, zero included) can be represented by both signed and unsigned types, and they are represented by exactly the same bit patterns (00000000 to 01111111).

The bit patterns which represent the large positive numbers from 128 to 255 in the unsigned interpretation are reused for the numbers −128 to −1 in the signed interpretation. It is as if someone took the unsigned number line, chopped off the upper half of the range, and glued it on at the lower end of the line.
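The two readings of one bit pattern can be sketched in Java. This is only an illustration: the class and method names (`BytePatterns`, `asSigned`, `asUnsigned`) are made up for this example, and the 8-bit pattern is passed around as an `int` holding a value from 0 to 255.

```java
final class BytePatterns {
    // Signed (two's complement) reading of the low 8 bits.
    static int asSigned(int bits) {
        return (byte) bits;
    }

    // Unsigned reading of the same low 8 bits.
    static int asUnsigned(int bits) {
        return bits & 0xFF;
    }

    public static void main(String[] args) {
        int[] patterns = {0b00000000, 0b01111111, 0b10000000, 0b11011101, 0b11111111};
        for (int bits : patterns) {
            // Pad the binary text to 8 digits so the patterns line up.
            String text = String.format("%8s", Integer.toBinaryString(bits)).replace(' ', '0');
            System.out.println(text + "  unsigned=" + asUnsigned(bits) + "  signed=" + asSigned(bits));
        }
    }
}
```

Patterns with the top bit clear read the same both ways; patterns with the top bit set differ by exactly 256 between the two readings.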

Now, let's look at what happens when we compare a pair of integers.

Case 1: Both values are in the signed positive range

With both values in the range 0 to 127, they have the same numeric value whether the bits are interpreted as signed or unsigned.

We unconditionally add MIN_VALUE to each value. MIN_VALUE for our signed byte type is −128, so adding that means we are actually subtracting 128.

An example: our comparison function, using signed types, is given x = 20 and y = 60. Adding MIN_VALUE, we get x' = 20 − 128 = −108 and y' = 60 − 128 = −68.

Adding MIN_VALUE to a positive value will always map it to a negative value. At the extreme ends of the range, 0 would become −128, and 127 would become −1. The operation will not change the order of x and y relative to each other, so the result of any comparison between x' and y' will be the same as if we had not added MIN_VALUE, which is correct.
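Case 1 can be reproduced with Java's `byte` type, which is a signed 8-bit integer. This is a minimal sketch; `CaseBothPositive` and `shift` are names invented for the example, and the cast back to `byte` stands in for 8-bit wrap-around.

```java
final class CaseBothPositive {
    // Add the 8-bit MIN_VALUE (-128) and wrap the result back into a byte.
    static byte shift(byte v) {
        return (byte) (v + Byte.MIN_VALUE);
    }

    public static void main(String[] args) {
        byte x = 20, y = 60;
        System.out.println(shift(x)); // -108
        System.out.println(shift(y)); // -68
        // The order is the same before and after the shift.
        System.out.println(Byte.compare(x, y) < 0);
        System.out.println(Byte.compare(shift(x), shift(y)) < 0);
    }
}
```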

Case 2: Both values are in the signed negative range

In this case, both values are in the range −128 to −1 if interpreted as signed. If interpreted as unsigned they are in the range 128 to 255 (which is 256 greater).

When we unconditionally add MIN_VALUE to each of our signed negative values, it always causes overflow and wraps around to signed positive values. Numerically, this wrap-around is the same as adding 256. If we are given x = −35 and y = −80 to compare, we get x' = −35 − 128 + 256 = 93 and y' = −80 − 128 + 256 = 48.

We can also visualize this with the unsigned interpretations of −35 and −80, which are 221 and 176. When subtracting 128, we get exactly the same results for x' and y'. One of the advantages of two's complement is that addition and subtraction give the same results regardless of whether you treat the data as signed or unsigned, so CPUs can use the same circuitry.

As in case 1, the operation does not change the results of any comparisons between the two numbers. Our x was greater than y (being of lesser negative magnitude), and x' is also greater than y'. So comparisons between these inputs will be correct.
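The same arithmetic for case 2 can be checked with a short Java sketch. The names `CaseBothNegative` and `shift` are hypothetical; the second helper subtracts 128 from the unsigned reading to show that both routes give the same result.

```java
final class CaseBothNegative {
    // Signed route: add -128 and let the cast to byte wrap around.
    static byte shift(byte v) {
        return (byte) (v + Byte.MIN_VALUE);
    }

    // Unsigned route: read the bits as 0..255, then subtract 128.
    static int shiftViaUnsigned(byte v) {
        return Byte.toUnsignedInt(v) - 128;
    }

    public static void main(String[] args) {
        byte x = -35, y = -80;
        System.out.println(shift(x) + " " + shiftViaUnsigned(x)); // 93 93
        System.out.println(shift(y) + " " + shiftViaUnsigned(y)); // 48 48
        // x > y before the shift, and x' > y' after it.
        System.out.println(Byte.compare(x, y) > 0);
        System.out.println(Byte.compare(shift(x), shift(y)) > 0);
    }
}
```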

Case 3: One value is in the signed positive range, the other negative

This is the interesting case. Notice that when we add MIN_VALUE, it always changes a number's sign. Positive values are mapped to negative values and negative values are mapped to positive values.
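The sign flip can be checked for every 8-bit value. The sketch below also compares the addition against flipping the top bit with XOR, which gives the same result because any carry out of the top bit is discarded; that equivalence is an observation added here, not something the method's source relies on. `SignFlip` and its methods are made-up names.

```java
final class SignFlip {
    // Add the 8-bit MIN_VALUE with wrap-around.
    static byte addMin(byte v) {
        return (byte) (v + Byte.MIN_VALUE);
    }

    // Flip only the top (sign) bit.
    static byte flipTopBit(byte v) {
        return (byte) (v ^ 0x80);
    }

    // True if, for all 256 byte values, adding MIN_VALUE changes the sign
    // and matches flipping the top bit.
    static boolean holdsForAllBytes() {
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            byte b = (byte) i;
            boolean signChanged = (b < 0) != (addMin(b) < 0);
            if (!signChanged || addMin(b) != flipTopBit(b)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holdsForAllBytes());
    }
}
```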

Let's compare x = −35 and y = 60. Since we want these to be compared as unsigned, we really intend x to be interpreted as −35 + 256 = 221. So x needs to be interpreted as greater than y, even though our signed data type will not normally do this.

Because the numbers have opposite signs, adding MIN_VALUE, which flips both signs, reverses the numbers' order on the number line. x' = −35 − 128 + 256 = 93, and y' = 60 − 128 = −68. So x' is greater than y', which is what we wanted.
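A small Java sketch of case 3, using `byte` as the 8-bit type. `CaseMixedSigns` and `compareUnsignedBytes` are hypothetical names; the helper mirrors the shape of the 32-bit method from the question rather than copying any library code.

```java
final class CaseMixedSigns {
    // Add the 8-bit MIN_VALUE with wrap-around.
    static byte shift(byte v) {
        return (byte) (v + Byte.MIN_VALUE);
    }

    // Unsigned comparison of two bytes via the add-MIN_VALUE trick.
    static int compareUnsignedBytes(byte x, byte y) {
        return Byte.compare(shift(x), shift(y));
    }

    public static void main(String[] args) {
        byte x = -35, y = 60; // unsigned readings: 221 and 60
        System.out.println(shift(x)); // 93
        System.out.println(shift(y)); // -68
        System.out.println(Byte.compare(x, y) < 0);         // signed: x is smaller
        System.out.println(compareUnsignedBytes(x, y) > 0); // unsigned: x is larger
    }
}
```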

Generalization

Since we've handled all combinations of positive and negative, we know the technique works for all possible values.
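For the 8-bit type the claim is small enough to check by brute force. The sketch below tries all 65,536 pairs and counts disagreements between the trick and a comparison of the unsigned readings; `ExhaustiveByteCheck` and its methods are names invented for this example.

```java
final class ExhaustiveByteCheck {
    // Add the 8-bit MIN_VALUE with wrap-around.
    static byte shift(byte v) {
        return (byte) (v + Byte.MIN_VALUE);
    }

    // Number of (x, y) pairs where the trick disagrees with a direct
    // comparison of the unsigned readings.
    static int countMismatches() {
        int mismatches = 0;
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            for (int j = Byte.MIN_VALUE; j <= Byte.MAX_VALUE; j++) {
                byte x = (byte) i, y = (byte) j;
                int viaTrick = Integer.signum(Byte.compare(shift(x), shift(y)));
                int direct = Integer.signum(
                        Integer.compare(Byte.toUnsignedInt(x), Byte.toUnsignedInt(y)));
                if (viaTrick != direct) {
                    mismatches++;
                }
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println(countMismatches()); // 0
    }
}
```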

In the case of 32-bit ints, the ranges are bigger: the signed range is −2,147,483,648 (MIN_VALUE) to +2,147,483,647, and the unsigned range is 0 to 4,294,967,295. The technique works just the same. In fact it works for every size of integer, and in every programming language, provided that:

1. The signed integers use two's complement representation (which is nearly universal).
2. Addition wraps around on overflow (rather than raising an error or promoting to a bigger number type or being undefined).
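Java's `int` meets both conditions, so the 32-bit version can be cross-checked against the library method. In this sketch `UnsignedInt32` and `compareUnsignedSketch` are hypothetical names; `Integer.compareUnsigned` and `Integer.compare` are the standard library methods.

```java
final class UnsignedInt32 {
    // Same shape as the method in the question, written out for 32-bit ints.
    static int compareUnsignedSketch(int x, int y) {
        return Integer.compare(x + Integer.MIN_VALUE, y + Integer.MIN_VALUE);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 60, -35, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            for (int y : samples) {
                int mine = Integer.signum(compareUnsignedSketch(x, y));
                int library = Integer.signum(Integer.compareUnsigned(x, y));
                if (mine != library) {
                    throw new IllegalStateException("mismatch for " + x + ", " + y);
                }
            }
        }
        // -1 is 4,294,967,295 when read as unsigned, so it compares above 1.
        System.out.println(compareUnsignedSketch(-1, 1) > 0);
    }
}
```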

You can also do the reverse: if you have only an unsigned integer type, and you want to do a two's complement signed comparison, add (the unsigned interpretation of) the signed minimum value to each number.
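Java's `char` happens to be an unsigned 16-bit type, so it can stand in for a language with only unsigned integers. The sketch below compares two 16-bit patterns as if they were signed by adding 0x8000 (the unsigned reading of the signed 16-bit minimum) to each; `SignedOnUnsigned` and `compareSigned16` are made-up names, and using `char` this way is purely for illustration.

```java
final class SignedOnUnsigned {
    // Compare two 16-bit patterns as two's complement signed values,
    // using only unsigned 16-bit arithmetic and comparison.
    static int compareSigned16(char a, char b) {
        char a2 = (char) (a + 0x8000); // the cast wraps modulo 65536
        char b2 = (char) (b + 0x8000);
        return Character.compare(a2, b2);
    }

    public static void main(String[] args) {
        char minusOne = (char) 0xFFFF; // -1 when read as signed 16-bit
        char one = (char) 0x0001;
        System.out.println(Character.compare(minusOne, one) > 0); // unsigned: 65535 > 1
        System.out.println(compareSigned16(minusOne, one) < 0);   // signed: -1 < 1
    }
}
```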

Because the technique is just two unconditional addition operations, it is extremely efficient even if not treated specially by a compiler or VM.

answered Oct 11 '22 12:10

Boann
