
What is the intuition behind cache oblivious data structures?

I understand what the expression "cache oblivious" means, but I was wondering whether there is an easy explanation of how data structures can be designed to use the cache optimally without knowing the size of the cache.

Can you please provide such an explanation, preferably with an (easy) example?

Muhammad Alkarouri asked Dec 28 '22 07:12


1 Answer

Even an algorithm as familiar as quicksort is somewhat cache oblivious (but not optimal). Recall that it works by partitioning the array, then recursing on each side of the partition. Eventually, it is operating on a sub-array which fits in cache, and so there will be no more cache misses until it finishes that sub-array and moves on to another one. That's the property we're looking for.
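To make that concrete, here is a minimal sketch of an in-place quicksort. It is one common variant (middle-element pivot, Hoare-style partition), chosen only for illustration. The thing to notice is that no cache size or block size appears anywhere: the recursion simply keeps shrinking the range it works on, and at some depth that range fits in whatever cache the machine happens to have.

```python
def quicksort(a, lo=0, hi=None):
    """Sort a[lo:hi] in place. No cache parameter appears anywhere."""
    if hi is None:
        hi = len(a)
    if hi - lo < 2:
        return
    pivot = a[(lo + hi) // 2]
    i, j = lo, hi - 1
    # Partition: one sequential sweep inwards from each end of the range.
    while i <= j:
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
    # Each recursive call touches only its own, smaller sub-array.
    # Once that sub-array fits in cache, the rest of the call runs
    # without further misses (in an idealised cache model).
    quicksort(a, lo, j + 1)
    quicksort(a, i, hi)
```

Calling `quicksort(data)` on a list sorts it in place; the same code runs unchanged whether the cache holds a thousand elements or a million.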

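Contrast this with insertion sort, which (to use a technical term) leaps all over the place all the time: each insertion can walk back across an arbitrarily large part of the already-sorted prefix, so its working set never shrinks to fit in cache. So quite aside from insertion sort's need to move O(n^2) items around, it also misses cache a lot when used on large arrays.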
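If you want to see the effect rather than take it on trust, a toy cache model is enough. The sketch below is an assumption-laden simulation, not a measurement: `LRUCacheModel` is a hypothetical fully associative LRU cache that counts block misses, and `insertion_sort_misses` is a helper name invented here that runs an ordinary insertion sort while reporting every array access to that model.

```python
from collections import OrderedDict


class LRUCacheModel:
    """Toy fully associative LRU cache that only counts block misses."""

    def __init__(self, block_size, num_blocks):
        self.block_size = block_size    # elements per cache block
        self.num_blocks = num_blocks    # blocks the cache can hold
        self.blocks = OrderedDict()     # resident blocks, oldest first
        self.misses = 0

    def touch(self, index):
        block = index // self.block_size
        if block in self.blocks:
            self.blocks.move_to_end(block)       # mark as most recently used
        else:
            self.misses += 1
            self.blocks[block] = True
            if len(self.blocks) > self.num_blocks:
                self.blocks.popitem(last=False)  # evict least recently used


def insertion_sort_misses(values, block_size, num_blocks):
    """Insertion-sort a copy of values; return (sorted_list, miss_count)."""
    a = list(values)
    cache = LRUCacheModel(block_size, num_blocks)
    for i in range(1, len(a)):
        cache.touch(i)
        key = a[i]
        j = i - 1
        # Walk back through the sorted prefix, shifting larger items right.
        while j >= 0:
            cache.touch(j)
            if a[j] <= key:
                break
            cache.touch(j + 1)
            a[j + 1] = a[j]
            j -= 1
        cache.touch(j + 1)
        a[j + 1] = key
    return a, cache.misses
```

With reversed input (the worst case), an array that fits in the modelled cache only pays for loading each block once, whereas an array several times larger than the cache reloads much of the prefix on every insertion. For example, compare `insertion_sort_misses(range(31, -1, -1), 8, 4)` with `insertion_sort_misses(range(255, -1, -1), 8, 4)`.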

Quicksort is some way from optimal, though. Each individual partition phase doesn't divide and recurse: it does a long sequential run through memory, churning the cache. Potentially this will happen several times before the sub-array size is small enough that we start winning, so we're not minimising the number of cache misses.
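A back-of-the-envelope estimate shows how many of those churning passes there might be. The sketch below assumes perfectly balanced partitions and an idealised cache, and the names `partition_passes`, `estimated_misses`, `cache_elems` and `block_elems` are made up for this illustration. Treat the result as a rough order of magnitude, not a prediction for any real machine.

```python
import math


def partition_passes(n, cache_elems):
    """Count levels of balanced partitioning before sub-arrays fit in cache."""
    passes = 0
    size = n
    while size > cache_elems:
        size = math.ceil(size / 2)   # assumption: each partition splits evenly
        passes += 1
    return passes


def estimated_misses(n, cache_elems, block_elems):
    """Rough miss count: one full sweep of the array per level, plus one
    final sweep to load each sub-array once it fits in cache."""
    misses_per_sweep = math.ceil(n / block_elems)
    return misses_per_sweep * (partition_passes(n, cache_elems) + 1)
```

For a million elements, a cache holding 8192 of them and 16 elements per block, `partition_passes(1_000_000, 8192)` gives 7 and `estimated_misses(1_000_000, 8192, 16)` gives 500000, i.e. the whole array is streamed through the cache about eight times under these assumptions.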

Steve Jessop answered Mar 07 '23 21:03