Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why start an ArrayList with an initial capacity?

People also ask

What is initial capacity in ArrayList?

An ArrayList has an initial capacity which is simply the size of the array used to store the elements in the list. When you create an ArrayList you can specify the initial capacity. For example: ArrayList<Integer> arrayList = new ArrayList<>(100); In this case, the initial capacity of the ArrayList will be 100.

Does ArrayList size start at 0 or 1?

The ArrayList index starts at 0 just like arrays, but instead of using the square brackets [] to access elements, you use the get(index) to get the value at the index and set(index,value) to set the element at an index to a new value.

Can you initialize an ArrayList with size?

To create an ArrayList of specific size, you can pass the size as argument to ArrayList constructor while creating the new ArrayList. Following the syntax to create an ArrayList with specific size. myList = new ArrayList<T>(N); where N is the capacity with which ArrayList is created.

What is difference between capacity and size of ArrayList?

The size of the list is the number of elements in it. The capacity of the list is the number of elements the backing data structure can hold at this time. The size will change as elements are added to or removed from the list.


If you know in advance what the size of the ArrayList is going to be, it is more efficient to specify the initial capacity. If you don't do this, the internal array will have to be repeatedly reallocated as the list grows.

The larger the final list, the more time you save by avoiding the reallocations.

That said, even without pre-allocation, inserting n elements at the back of an ArrayList is guaranteed to take total O(n) time. In other words, appending an element is an amortized constant-time operation. This is achieved by having each reallocation increase the size of the array exponentially, typically by a factor of 1.5. With this approach, the total number of operations can be shown to be O(n).


Because ArrayList is a dynamically resizing array data structure, which means it is implemented as an array with an initial (default) fixed size. When this gets filled up, the array will be extended to a double sized one. This operation is costly, so you want as few as possible.

So, if you know your upper bound is 20 items, then creating the array with initial length of 20 is better than using a default of, say, 15 and then resize it to 15*2 = 30 and use only 20 while wasting the cycles for the expansion.

P.S. - As AmitG says, the expansion factor is implementation specific (in this case (oldCapacity * 3)/2 + 1)


Default size of Arraylist is 10.

/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this(10);
}   

So if you are going to add 100 or more records, you can see the overhead of memory reallocation.

ArrayList<?> list = new ArrayList<>();    
// same as  new ArrayList<>(10);      

So if you have any idea about the number of elements which will be stored in Arraylist its better to create Arraylist with that size instead of starting with 10 and then going on increasing it.


I actually wrote a blog post on the topic 2 months ago. The article is for C#'s List<T> but Java's ArrayList has a very similar implementation. Since ArrayList is implemented using a dynamic array, it increases in size on demand. So the reason for the capacity constructor is for optimisation purposes.

When one of these resizings operation occurs, the ArrayList copies the contents of the array into a new array that is twice the capacity of the old one. This operation runs in O(n) time.

Example

Here is an example of how the ArrayList would increase in size:

10
16
25
38
58
... 17 resizes ...
198578
297868
446803
670205
1005308

So the list starts with a capacity of 10, when the 11th item is added it is increase by 50% + 1 to 16. On the 17th item the ArrayList is increased again to 25 and so on. Now consider the example where we're creating a list where the desired capacity is already known as 1000000. Creating the ArrayList without the size constructor will call ArrayList.add 1000000 times which takes O(1) normally or O(n) on resize.

1000000 + 16 + 25 + ... + 670205 + 1005308 = 4015851 operations

Compare this using the constructor and then calling ArrayList.add which is guaranteed to run in O(1).

1000000 + 1000000 = 2000000 operations

Java vs C#

Java is as above, starting at 10 and increasing each resize at 50% + 1. C# starts at 4 and increases much more aggressively, doubling at each resize. The 1000000 adds example from above for C# uses 3097084 operations.

References

  • My blog post on C#'s List<T>
  • Java's ArrayList source code