 

Why does cache locality matter for array performance?

In a blog post there is a statement about the advantage of arrays over linked lists:

Arrays have better cache locality that can make a pretty big difference in performance.

What does that mean? I don't understand how cache locality can provide a huge performance benefit.

Vaibhav Mishra asked Aug 22 '12 02:08

People also ask

Why do arrays have better cache locality?

Also, better cache locality in arrays (due to contiguous memory allocation) can significantly improve performance. As a result, some operations (such as modifying a certain element) are faster in arrays, while others (such as inserting/deleting an element in the data) are faster in linked lists.

Is cache locality important?

Without spatial locality, memory requests would be uniformly distributed in the memory of the application, and then given the very large ratios between memory size and cache size, the chance of hitting in the cache would be microscopic (about one in one million for the example above).

What is meant by locality and why is it important to caching?

Temporal locality means that data or an instruction currently being fetched may be needed again soon. So we should store that data or instruction in the cache memory, so that we can avoid searching main memory again for the same data.

Do arrays have better cache locality that can make them better in terms of performance?

Arrays have better cache locality, which can make them better in terms of performance. The size of an array has to be pre-decided, while linked lists can change their size at any time.


1 Answer

See my answer about spatial and temporal locality.

In particular, arrays are contiguous memory blocks, so large chunks of them will be loaded into the cache upon first access. This makes it comparatively quick to access future elements of the array. Linked lists on the other hand aren't necessarily in contiguous blocks of memory, and could lead to more cache misses, which increases the time it takes to access them.

Consider the following possible memory layouts for an array data and a linked list l_data of large structs:

Address      Contents       | Address      Contents
ffff 0000    data[0]        | ffff 1000    l_data
ffff 0040    data[1]        |   ....
ffff 0080    data[2]        | ffff 3460    l_data->next
ffff 00c0    data[3]        |   ....
ffff 0100    data[4]        | ffff 8dc0    l_data->next->next
                            | ffff 8e00    l_data->next->next->next
                            |   ....
                            | ffff 8f00    l_data->next->next->next->next

If we wanted to loop through this array, the first access to ffff 0000 would require us to go to memory to retrieve (a very slow operation in CPU cycles). However, after the first access the rest of the array would be in the cache, and subsequent accesses would be much quicker. With the linked list, the first access to ffff 1000 would also require us to go to memory. Unfortunately, the processor will cache the memory directly surrounding this location, say all the way up to ffff 2000. As you can see, this doesn't actually capture any of the other elements of the list, which means that when we go to access l_data->next, we will again have to go to memory.

brc answered Oct 15 '22 12:10