Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance of ArrayList

I was trying to see the performance difference between pre-initializing an ArrayList to a given capacity vs going with the default capacity and expanding on requirement. Just out of curiosity. I found that the default capacity array code is ~10% FASTER than the code which initializes the array to the required capacity. Here is the code I used:

public class Test {
    public static void main(String[] args) {

        long t1 = System.currentTimeMillis();
        for(int j=0;j<10;++j) {
            List<Integer> list = new ArrayList<Integer>();
            for(int i=0;i<1000000;++i) {
                list.add(i);
            }
        }
        long t2 = System.currentTimeMillis();
        System.out.println("Time taken: " + (t2-t1)/10.0);
    }
}

I consistently get ~77 ms for this version on my box, whereas I get ~85 ms if I change the List initialization to new ArrayList<Integer>(1000000). Why is it so? Shouldn't it be the other way round? In fact, the List without pre-init is consistently a tiny bit (~0.5-1 ms) faster than using plain Integer[]. Basically, what it says is default capacity arraylist > simple array > pre-capacity-ensured arraylist in insert performance.

This is very strange to me. My initial guess is that it has something to do with memory allocation, something like giving 1000000 int blocks in one go is maybe slower than slowly getting more space? Is this even reproducible in other machines? I am using jdk 1.6.0, Mac OS X, running through eclipse.

I tried it in two other environments: --> Tried running java+javac from command line instead of eclipse - Here I get pre-capacity-ensured arraylist > simple array > default capacity arraylist, consistently.

--> Tried running java+javac on my linux (RHEL) desktop. This box has 24 GB RAM, whereas my laptop had only 8 GB. Here, I get plain array >>> default capacity arraylist > pre-capacity-ensured arraylist. Plain array is super-fast, about 2-3x faster in this case than the other two.

EDIT: Following @JonSkeet's suggestion in the comments, I used nanoTime(), and Integer instead of int. It still doesn't adress the issue that JIT warmup is not being considered though. After these changes, I consistently see that plain array is the fastest across all tests. But the capacity-ensured list is still slower by about 5-10% compared to default-capacity list for me across all 3 environments above. BUT some users seem to get the correct behaviour, so this might well be a very specific case.

EDIT2: If I use String instead of Integer as the element, the behavior is correct (plain array > pre-capacity-ensured arraylist > default capacity array). So I think autoboxing is actually the culprit.

like image 673
Hari Menon Avatar asked Apr 03 '13 14:04

Hari Menon


People also ask

Whose performance is better array or ArrayList?

An array is faster and that is because ArrayList uses a fixed amount of array. However when you add an element to the ArrayList and it overflows. It creates a new Array and copies every element from the old one to the new one.

Is ArrayList faster than LinkedList?

ArrayList is slow as array manipulation is slower. LinkedList is faster being node based as not much bit shifting required. ArrayList implements only List. LinkedList implements List as well as Queue.

How can an ArrayList improve performance?

In order to optimize the performance of ArrayLists, it is advisable to set a large enough initial capacity when initializing an ArrayList to incorporate all your data. This will allocate a large enough chunk of memory so that you will probably not need to perform the allocation process again.

Are ArrayLists slow?

ArrayList is fast because it is non-synchronized.


1 Answers

I've experimented with this a little, and my conclusion is that your benchmark is flawed. When I fix the most obvious issues, I get dramatically different results. My timings are as follows:

default list: 74ms
pre-sized list: 54ms
Integer array: 42ms
int array: 9ms

Note that these are in units of milliseconds. Your code performs measurements in tens of milliseconds: (t2-t1)/10.0.

For reference, the code I've used is as follows:

public class Clazz {

    static final int N = 1000000;

    interface Test {
        void test();
    }
    static final class DfltListTest implements Test {
        public void test() {
            for (int j = 0; j < 10; ++j) {
                List<Integer> list = new ArrayList<Integer>();
                for (int i = 0; i < N; ++i) {
                    list.add(i);
                }
            }
        }
    }
    static final class SizedListTest implements Test {
        public void test() {
            for (int j = 0; j < 10; ++j) {
                List<Integer> list = new ArrayList<Integer>(N);
                for (int i = 0; i < N; ++i) {
                    list.add(i);
                }
            }
        }
    }
    static final class IntegerArrayTest implements Test {
        public void test() {
            for (int j = 0; j < 10; ++j) {
                Integer[] arr = new Integer[N];
                for (int i = 0; i < N; ++i) {
                    arr[i] = i;
                }
            }
        }
    }
    static final class IntArrayTest implements Test {
        public void test() {
            for (int j = 0; j < 10; ++j) {
                int[] arr = new int[N];
                for (int i = 0; i < N; ++i) {
                    arr[i] = i;
                }
            }
        }
    }

    static void test(Test t, String name) {
        final int iter = 11;
        final long timings[] = new long[iter];
        for (int k = 0; k < iter; ++k) {
            long t1 = System.currentTimeMillis();
            t.test();
            long t2 = System.currentTimeMillis();
            timings[k] = t2 - t1;
            System.gc();
        }
        Arrays.sort(timings);
        System.out.printf("%s: %dms\n", name, timings[iter / 2]);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; ++i) {
            test(new DfltListTest(), "default list");
            test(new SizedListTest(), "pre-sized list");
            test(new IntegerArrayTest(), "Integer array");
            test(new IntArrayTest(), "int array");
        }
    }
}

I've tested it using Java 1.7.0_09 with -XX:+AggressiveOpts -XX:CompileThreshold=1.

When I tested the same code using Java 6, the ranking was the same, but the difference between default list and pre-sized list was much more significant. I've not attempted to understand what it is about Java 7 that makes the difference so much smaller.

For some pointers on how to benchmark Java code, see How do I write a correct micro-benchmark in Java?

like image 192
NPE Avatar answered Sep 28 '22 07:09

NPE