
Data structure for O(log N) find and update, considering small L1 cache

I'm currently working on an embedded device project where I'm running into performance problems. Profiling has located an O(N) operation that I'd like to eliminate.

I basically have two arrays, int A[N] and short B[N]. Entries in A are unique and ordered by external constraints. The most common operation is to check whether a particular value a appears in A[]. Less frequent, but still common, is a change to an element of A[]. The new value is unrelated to the previous one.

Since the most common operation is the find, that's where B[] comes in. It's a sorted array of indices in A[], such that A[B[i]] < A[B[j]] if and only if i<j. That means that I can find values in A using a binary search.
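For concreteness, the find is essentially this (a sketch, not my exact code):

    #include <algorithm>

    extern int   A[];   // values, ordered by external constraints
    extern short B[];   // indices into A, sorted so that A[B[i]] is ascending
    extern int   N;

    // O(log N): binary search over B, comparing through A.
    bool find(int a) {
        const short* end = B + N;
        const short* it = std::lower_bound(B, end, a,
            [](short idx, int value) { return A[idx] < value; });
        return it != end && A[*it] == a;
    }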

Of course, when I update A[k], I have to find k in B and move it to a new position to maintain the search order. Since I know the old and new values of A[k], that's just a memmove() of the subset of B[] between the old and new positions of k. This is the O(N) operation that I need to fix; since the old and new values of A[k] are essentially random, I'm moving on average about N/3 elements.
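The update I want to get rid of is essentially the following (again a sketch; oldPos/newPos are made-up names):

    #include <algorithm>
    #include <cstring>

    extern int   A[];
    extern short B[];
    extern int   N;

    // The O(N) part: store newValue into A[k], then slide a slice of B
    // with memmove() so that A[B[i]] stays ascending.
    void update(short k, int newValue) {
        auto cmp = [](short idx, int value) { return A[idx] < value; };
        // Values in A are unique, so lower_bound on the old value lands on k.
        short oldPos = short(std::lower_bound(B, B + N, A[k], cmp) - B);
        // Insertion point for the new value, computed before A[k] changes.
        short newPos = short(std::lower_bound(B, B + N, newValue, cmp) - B);
        A[k] = newValue;
        if (oldPos < newPos) {
            // k moves right: entries oldPos+1 .. newPos-1 shift left one slot.
            std::memmove(B + oldPos, B + oldPos + 1,
                         (newPos - 1 - oldPos) * sizeof(short));
            B[newPos - 1] = k;
        } else if (newPos < oldPos) {
            // k moves left: entries newPos .. oldPos-1 shift right one slot.
            std::memmove(B + newPos + 1, B + newPos,
                         (oldPos - newPos) * sizeof(short));
            B[newPos] = k;
        }
    }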

I looked into std::make_heap using [](int i, int j) { return A[i] < A[j]; } as the predicate. In that case I can easily make B[0] point to the smallest element of A, and updating B is now a cheap O(log N) rebalancing operation. However, I generally don't need the smallest value of A; I need to find whether a given value is present. And the heap order tells me almost nothing about where an arbitrary value lives (half of my N elements sit at heap depth log N, a quarter at depth (log N)-1, and so on), so the search degenerates to O(N), which is no improvement over a dumb O(N) search directly in A.

Considering that std::set has O(log N) insert and find, I'd say that it should be possible to get the same performance here for update and find. But how do I do that? Do I need another order for B? A different type?

B is currently a short[N] because A and B together are about the size of my CPU cache, and my main memory is a lot slower. Going from 6*N to 8*N bytes would not be nice, but would still be acceptable if both find and update become O(log N).

asked May 11 '12 by MSalters



2 Answers

If the only operations are (1) check whether a value a belongs to A and (2) update values in A, why don't you use a hash table in place of the sorted array B? Especially if A does not grow or shrink and only the values change, this would be a much better solution. A hash table does not require significantly more memory than an array. (Alternatively, B could be changed not to a heap but to a binary search tree, which could be self-balancing, e.g. a splay tree or a red-black tree. However, trees require extra memory for the left and right child pointers.)

A practical solution that grows memory use from 6N to 8N bytes is to aim for a hash table that is exactly 50% full, i.e. a hash table consisting of an array of 2N shorts. I would recommend implementing the cuckoo hashing mechanism (see http://en.wikipedia.org/wiki/Cuckoo_hashing). Reading the article further, you'll find that you can get load factors above 50% (i.e. push memory consumption down from 8N towards, say, 7N) by using more hash functions: "Using just three hash functions increases the load to 91%."

From Wikipedia:

A study by Zukowski et al. has shown that cuckoo hashing is much faster than chained hashing for small, cache-resident hash tables on modern processors. Kenneth Ross has shown bucketized versions of cuckoo hashing (variants that use buckets that contain more than one key) to be faster than conventional methods also for large hash tables, when space utilization is high. The performance of the bucketized cuckoo hash table was investigated further by Askitis, with its performance compared against alternative hashing schemes.
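To make the mechanism concrete, here is a minimal sketch of the 2N-slot, two-function variant, storing short indices into A and keyed through the values A[idx]. The hash functions, the EMPTY sentinel, the fixed table size and the eviction bound are my choices for illustration, not requirements:

    #include <cstdint>
    #include <utility>

    extern int A[];                 // the value array from the question

    const int   TABLE_SIZE = 2048;  // 2*N slots for N = 1024, i.e. 50% load
    const short EMPTY      = -1;
    static short T[TABLE_SIZE];     // call init() once to fill with EMPTY

    void init() { for (int i = 0; i < TABLE_SIZE; ++i) T[i] = EMPTY; }

    // Two independent-ish multiplicative hashes (any decent pair works).
    static int h1(int key) { return int(uint32_t(key) * 2654435761u % TABLE_SIZE); }
    static int h2(int key) { return int((uint32_t(key) * 40503u + 12345u) % TABLE_SIZE); }

    // O(1) worst case: a key can only live in one of its two slots.
    bool contains(int a) {
        int s1 = h1(a), s2 = h2(a);
        return (T[s1] != EMPTY && A[T[s1]] == a)
            || (T[s2] != EMPTY && A[T[s2]] == a);
    }

    // Expected O(1): insert index k, evicting occupants to their other slot.
    bool insert(short k) {
        int slot = h1(A[k]);
        for (int tries = 0; tries < 32; ++tries) {
            if (T[slot] == EMPTY) { T[slot] = k; return true; }
            std::swap(k, T[slot]);                           // kick out the occupant
            slot = (slot == h1(A[k])) ? h2(A[k]) : h1(A[k]); // send it to its other nest
        }
        return false;   // eviction cycle; a real implementation rehashes here
    }

    void erase(int a) {
        int s1 = h1(a), s2 = h2(a);
        if      (T[s1] != EMPTY && A[T[s1]] == a) T[s1] = EMPTY;
        else if (T[s2] != EMPTY && A[T[s2]] == a) T[s2] = EMPTY;
    }

An update to A[k] then becomes erase(A[k]); A[k] = newValue; insert(k); which is three expected O(1) steps instead of the O(N) memmove.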

answered by Antti Huima


std::set usually provides O(log N) insert and delete by using a binary search tree. Unfortunately this uses about 3*N words of space for most pointer-based implementations: assuming word-sized data, each node needs 1 word for the data and 2 words for the pointers to its left and right children.

If you have some constant N and can guarantee that ceil(log2(N)) is less than half the word size, you can use a fixed-length array of tree nodes taking 2*N words in total: each node uses 1 word for the data and 1 word for the indices of its two children, stored in the upper and lower halves of that word. Whether this lets you use some manner of self-balancing binary search tree depends on your N and word size: for a 16-bit word you only get N = 256, but for 32 bits it's 65536.
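For illustration, one possible node layout on a 32-bit word with 16-bit indices (the NIL sentinel and accessor names are mine):

    #include <cstdint>

    const uint16_t NIL = 0xFFFF;   // "no child" marker

    // Two words per node: one for the data, one for both child indices.
    struct Node {
        int32_t  data;
        uint32_t kids;   // left child index in the upper 16 bits, right in the lower

        uint16_t left()  const { return uint16_t(kids >> 16); }
        uint16_t right() const { return uint16_t(kids & 0xFFFFu); }
        void setLeft(uint16_t i)  { kids = (kids & 0x0000FFFFu) | (uint32_t(i) << 16); }
        void setRight(uint16_t i) { kids = (kids & 0xFFFF0000u) | i; }
    };

    static Node pool[1024];   // fixed-length array: 2*N words for N nodes

    // Plain BST search over the pool; with a self-balancing scheme layered
    // on top, this is the O(log N) find.
    bool contains(uint16_t root, int value) {
        for (uint16_t i = root; i != NIL; ) {
            if (pool[i].data == value) return true;
            i = (value < pool[i].data) ? pool[i].left() : pool[i].right();
        }
        return false;
    }

Note that reserving 0xFFFF as the sentinel costs one index, so the practical limit is N = 65535 rather than 65536.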

answered by Adam Douglass