Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Difference between if (a - b < 0) and if (a < b)

a < b and a - b < 0 can mean two different things. Consider the following code:

int a = Integer.MAX_VALUE;
int b = Integer.MIN_VALUE;
if (a < b) {
    System.out.println("a < b");
}
if (a - b < 0) {
    System.out.println("a - b < 0");
}

When run, this will only print a - b < 0. What happens is that a < b is clearly false, but a - b overflows and becomes -1, which is negative.

Now, having said that, consider that the array has a length that is really close to Integer.MAX_VALUE. The code in ArrayList goes like this:

int oldCapacity = elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1);
if (newCapacity - minCapacity < 0)
    newCapacity = minCapacity;
if (newCapacity - MAX_ARRAY_SIZE > 0)
    newCapacity = hugeCapacity(minCapacity);

oldCapacity is really close to Integer.MAX_VALUE so newCapacity (which is oldCapacity + 0.5 * oldCapacity) might overflow and become Integer.MIN_VALUE (i.e. negative). Then, subtracting minCapacity underflows back into a positive number.

This check ensures that the if is not executed. If the code were written as if (newCapacity < minCapacity), it would be true in this case (since newCapacity is negative) so the newCapacity would be forced to minCapacity regardless of the oldCapacity.

This overflow case is handled by the next if. When newCapacity has overflowed, this will be true: MAX_ARRAY_SIZE is defined as Integer.MAX_VALUE - 8 and Integer.MIN_VALUE - (Integer.MAX_VALUE - 8) > 0 is true. The newCapacity is therefore rightly handled: hugeCapacity method returns MAX_ARRAY_SIZE or Integer.MAX_VALUE.

NB: this is what the // overflow-conscious code comment in this method is saying.


I found this explanation:

On Tue, Mar 9, 2010 at 03:02, Kevin L. Stern wrote:

I did a quick search and it appears that Java is indeed two's complement based. Nonetheless, please allow me to point out that, in general, this type of code worries me since I fully expect that at some point someone will come along and do exactly what Dmytro suggested; that is, someone will change:

if (a - b > 0)

to

if (a > b)

and the entire ship will sink. I, personally, like to avoid obscurities such as making integer overflow an essential basis for my algorithm unless there is a good reason to do so. I would, in general, prefer to avoid overflow altogether and to make the overflow scenario more explicit:

if (oldCapacity > RESIZE_OVERFLOW_THRESHOLD) {
   // Do something
} else {
  // Do something else
}

It's a good point.

In ArrayList we cannot do this (or at least not compatibly), because ensureCapacity is a public API and effectively already accepts negative numbers as requests for a positive capacity that cannot be satisfied.

The current API is used like this:

int newcount = count + len;
ensureCapacity(newcount);

If you want to avoid overflow, you would need to change to something less natural like

ensureCapacity(count, len);
int newcount = count + len;

Anyway, I'm keeping the overflow-conscious code, but adding more warning comments, and "out-lining" huge array creation so that ArrayList's code now looks like:

/**
 * Increases the capacity of this <tt>ArrayList</tt> instance, if
 * necessary, to ensure that it can hold at least the number of elements
 * specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
public void ensureCapacity(int minCapacity) {
    modCount++;

    // Overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

/**
 * The maximum size of array to allocate.
 * Some VMs reserve some header words in an array.
 * Attempts to allocate larger arrays may result in
 * OutOfMemoryError: Requested array size exceeds VM limit
 */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
private void grow(int minCapacity) {
    // Overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);

    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

private int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}

Webrev regenerated.

Martin

In Java 6, if you use the API as:

int newcount = count + len;
ensureCapacity(newcount);

And newCount overflows (this becomes negative), if (minCapacity > oldCapacity) will return false and you may mistakenly assume that the ArrayList was increased by len.


Looking at the code:

int newCapacity = oldCapacity + (oldCapacity >> 1);

If oldCapacity is quite large, this will overflow, and newCapacity will be a negative number. A comparison like newCapacity < oldCapacity will incorrectly evaluate true and the ArrayList will fail to grow.

Instead, the code as written (newCapacity - minCapacity < 0 returns false) will allow the negative value of newCapacity to be further evaluated in the next line, resulting in recalculating newCapacity by invoking hugeCapacity (newCapacity = hugeCapacity(minCapacity);) to allow for the ArrayList to grow up to MAX_ARRAY_SIZE.

This is what the // overflow-conscious code comment is trying to communicate, though rather obliquely.

So, bottom line, the new comparison protects against allocating an ArrayList larger than the predefined MAX_ARRAY_SIZE while allowing it to grow right up to that limit if needed.


The two forms behave exactly the same unless the expression a - b overflows, in which case they are opposite. If a is a large negative, and b is a large positive, then (a < b) is clearly true, but a - b will overflow to become positive, so (a - b < 0) is false.

If you're familiar with x86 assembly code, consider that (a < b) is implemented by a jge, which branches around the body of the if statement when SF = OF. On the other hand, (a - b < 0) will act like a jns, which branches when SF = 0. Hence, these behave differently precisely when OF = 1.