
Array's lookup time complexity vs. how it is stored

It's well known that the time complexity of array access by index is O(1).

The documentation of Java's ArrayList, which is backed by an array, says the same about its get operation:

The size, isEmpty, get, set, iterator, and listIterator operations run in constant time.

The lookup is done by getting the memory address of the element at a given index independently of the array's size (something like start_address + element_size * index). My understanding is that the array's elements have to be stored next to each other in the memory for this lookup mechanism to be possible.
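That address arithmetic can be sketched in Java by packing fixed-size records into a single byte[] and computing the offset directly. The class and constant names (PackedLookup, RECORD_SIZE) are illustrative, not anything from the JDK; the point is that the offset computation touches no other element, so it costs the same no matter how many records there are.

```java
import java.nio.ByteBuffer;

public class PackedLookup {
    static final int RECORD_SIZE = 4; // each element is one 4-byte int

    // Offset of element i is base (0 here) + RECORD_SIZE * i, computed
    // in constant time regardless of how many records are stored.
    static int readRecord(byte[] storage, int index) {
        return ByteBuffer.wrap(storage, RECORD_SIZE * index, RECORD_SIZE).getInt();
    }

    public static void main(String[] args) {
        byte[] storage = new byte[3 * RECORD_SIZE];
        ByteBuffer.wrap(storage).putInt(10).putInt(20).putInt(30);
        System.out.println(readRecord(storage, 2)); // prints 30
    }
}
```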

However, from this question, I understand that arrays in Java are not guaranteed to have their elements stored contiguously in the memory. If that is the case, how could it always be O(1)?


Edit: I'm quite aware of how an ArrayList works. My point is, if contiguous storage for arrays is not guaranteed by the JVM specification, an array's elements could end up in different areas of memory. Although that situation is highly unlikely, it would make the lookup mechanism described above impossible, and the JVM would need some other way to do the lookup, which presumably wouldn't be O(1) anymore. That would contradict both the common knowledge stated at the top of this question and the ArrayList documentation for its get operation.

Thanks everybody for your answers.


Edit: In the end, I think it's a JVM-specific thing, but most, if not all, JVMs stick to contiguous storage of an array's elements even when there's no guarantee, so that the lookup mechanism above can be used. It's simple, efficient and cost-effective.

As far as I can see, it would be silly to store the elements all over the place and then have to take a different approach to doing the lookup.

asked Nov 01 '16 by Luciano Vercetti


People also ask

What is the best and worst case time complexity for searching for a value in a linked list?

To search a linked list, you iterate over its items one by one. In the worst case this takes time proportional to n, the length of the list, so the bound is O(n). Big-O notation denotes an upper bound, i.e. the worst-case scenario.
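The worst case described above can be sketched as a plain traversal; the Node and contains names are illustrative, not from any library. In the worst case the loop visits every node before giving up, hence O(n).

```java
public class ListSearch {
    static class Node {
        int value;
        Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    // Walks the list node by node: up to n steps for a list of length n.
    static boolean contains(Node head, int target) {
        for (Node cur = head; cur != null; cur = cur.next) {
            if (cur.value == target) return true;
        }
        return false; // worst case: traversed the whole list
    }

    public static void main(String[] args) {
        Node list = new Node(1, new Node(2, new Node(3, null)));
        System.out.println(contains(list, 3)); // prints true
    }
}
```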

What is the time complexity of lookup?

The hash table lookup takes Θ(k) time if the hash calculation is done in linear time in the length of its input, which is typical for hash functions, and the lookup of the value takes O(k) time.

How do you find the time complexity of a linked list?

Time Complexity of Linked List. From this article on the time complexity of memory access, we know that accessing a specific element takes O(√N) time, where N is the size of the block of contiguous elements being read. As linked-list elements are not contiguous, each element access incurs a time complexity of O(√N).


3 Answers

As far as I'm aware, the spec gives no guarantee that arrays will be stored contiguously. I'd speculate that most JVM implementations will though. In the basic case it's simple enough to enforce: if you can't expand the array because other memory is occupying the space you need, move the whole thing somewhere else.

Your confusion stems from a misunderstanding of the meaning of O(1). O(1) does not mean that a function is performed in a single operation (e.g. start_address + element_size * index). It means that the operation is performed in a constant amount of time irrespective of the size of the input data - in this case, the array. This is perfectly achievable with data that is not stored contiguously. You could have a lookup table mapping indexes to memory locations, for example.
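The lookup-table idea from this answer can be sketched as follows. The elements sit in arbitrary slots, and a side table maps each logical index to its slot; get() is then two constant-time array reads, so still O(1). The class and field names (IndirectLookup, storage, location) are hypothetical, not how any real JVM lays out arrays.

```java
public class IndirectLookup {
    private final Object[] storage;   // elements in arbitrary physical order
    private final int[] location;     // logical index -> storage slot

    IndirectLookup(Object[] storage, int[] location) {
        this.storage = storage;
        this.location = location;
    }

    // One table read plus one element read: constant time,
    // independent of the number of elements.
    Object get(int index) {
        return storage[location[index]];
    }

    public static void main(String[] args) {
        Object[] storage = {"c", "a", "b"};   // not in logical order
        int[] location = {1, 2, 0};           // logical 0 lives in slot 1, etc.
        IndirectLookup list = new IndirectLookup(storage, location);
        System.out.println(list.get(0)); // prints a
    }
}
```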

answered Nov 01 '22 by Michael


From the linked question you can see that even though it's not mandated by the JVM specification, it's highly likely that 1D arrays are contiguous in memory.

Given a contiguous backing array, the time complexities of ArrayList are as documented. However, it's not impossible that in a special case or on a special JVM the complexities might differ slightly. It's impossible to state time complexities that hold for every kind of VM the spec allows.

answered Nov 01 '22 by Kayaman


Every time an element is added, its capacity is checked: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b27/java/util/ArrayList.java#ArrayList.add%28java.lang.Object%29

public boolean add(E e) {
    ensureCapacity(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

Here, ensureCapacity() does the trick to keep the array sequential. How? http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b27/java/util/ArrayList.java#ArrayList.ensureCapacity%28int%29

public void ensureCapacity(int minCapacity) {
    modCount++;
    int oldCapacity = elementData.length;
    if (minCapacity > oldCapacity) {
        Object oldData[] = elementData;
        int newCapacity = (oldCapacity * 3)/2 + 1;
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
}

Thus, at every stage it ensures that the backing array has enough capacity and remains a single contiguous Java array, i.e. any index within range can be retrieved in O(1).
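The growth step can be condensed into a minimal sketch. This is not the real ArrayList, just an illustration of the same pattern: when full, allocate a new array about 1.5x larger and copy via Arrays.copyOf, so the elements always live in one fresh contiguous block and get() remains a plain index.

```java
import java.util.Arrays;

public class GrowableArray {
    private Object[] elementData = new Object[2];
    private int size;

    void add(Object e) {
        if (size + 1 > elementData.length) {
            int newCapacity = (elementData.length * 3) / 2 + 1;
            // Copying into a single new array keeps storage contiguous.
            elementData = Arrays.copyOf(elementData, newCapacity);
        }
        elementData[size++] = e;
    }

    Object get(int index) {
        return elementData[index]; // O(1) direct index
    }

    public static void main(String[] args) {
        GrowableArray a = new GrowableArray();
        for (int i = 0; i < 10; i++) a.add(i); // forces several regrowths
        System.out.println(a.get(7)); // prints 7
    }
}
```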

answered Nov 01 '22 by thepace