
Design of a high-performance sorted data structure read by many threads and written by few

I have an interesting data structure design problem that is beyond my current expertise. I'm seeking data structure or algorithm answers about tackling this problem.

The requirements:

  • Store a reasonable number of (pointer address, size) pairs (effectively two numbers; the first is useful as a sorting key) in one location
  • In a highly threaded application, many threads will look up values to see whether a specific pointer falls within one of the (address, size) pairs; that is, each pair defines a memory range, and the lookup asks whether the pointer lies inside any range in the list. Threads will much more rarely add or remove entries from this list.
  • Reading or searching for values must be as fast as possible, happening hundreds of thousands to millions of times a second
  • Adding or removing values, ie mutating the list, happens much more rarely; performance is not as important
  • It is acceptable but not ideal for the list contents to be out of date, ie for a thread's lookup code to not find an entry that should exist, so long as at some point the entry will exist.

I am keen to avoid a naive implementation such as having a critical section to serialize access to a sorted list or tree. What data structures or algorithms might be suitable for this task?


Tagged with Delphi since I am using that language for this task. Language-agnostic answers are very welcome.

However, I probably cannot use any of the standard libraries in any language without a lot of care. The reason is that memory access (allocation, freeing, etc of objects and their internal memory, eg tree nodes, etc) is strictly controlled and must go through my own functions. My current code elsewhere in the same program uses red/black trees and a bit trie, and I've written these myself. Object and node allocation runs through custom memory allocation routines. It's beyond the scope of the question, but is mentioned here to avoid an answer like 'use STL structure foo.' I'm keen for an algorithmic or structure answer that, so long as I have the right references or textbooks, I can implement myself.

David asked Nov 28 '13 13:11



2 Answers

I would use a TDictionary<Pointer, Integer> (from Generics.Collections) combined with a TMREWSync (from SysUtils) for multi-read exclusive-write access. TMREWSync allows multiple readers simultaneous access to the dictionary, as long as no writer is active. The dictionary itself provides O(1) lookup of pointers.

If you don't want to use the RTL classes the answer becomes: use a hash map combined with a multi-read exclusive-write synchronization object.

EDIT: Just realized that your pairs really represent memory ranges, so a hash map does not work: a pointer inside a range is not equal to the range's start address, so it cannot be found by key. In this case you could use a sorted list (sorted by memory address) and then use binary search to quickly find a matching range. That makes the lookup O(log n) instead of O(1), though.
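The sorted-list idea can be sketched as follows. This is a minimal illustration in C++ rather than Delphi, and every name in it (`Range`, `LockedRangeList`, `startsAfter`) is hypothetical. It assumes the stored ranges never overlap, and it uses `std::shared_mutex` (C++17) only as a stand-in for a multi-read exclusive-write lock such as TMREWSync; a version for the asker's program would substitute its own containers and allocator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical type: one (address, size) pair.
struct Range {
    std::uintptr_t base;  // start address, used as the sort key
    std::size_t size;     // length of the range in bytes
};

// Sorted vector of non-overlapping ranges behind a reader/writer lock.
class LockedRangeList {
public:
    // Writer path: exclusive lock, insert while keeping the vector sorted.
    void add(std::uintptr_t base, std::size_t size) {
        assert(size > 0);
        std::unique_lock<std::shared_mutex> guard(mutex_);
        auto pos = std::upper_bound(ranges_.begin(), ranges_.end(), base, startsAfter);
        ranges_.insert(pos, Range{base, size});
    }

    // Writer path: exclusive lock, remove the range that starts at base.
    bool remove(std::uintptr_t base) {
        std::unique_lock<std::shared_mutex> guard(mutex_);
        auto pos = std::upper_bound(ranges_.begin(), ranges_.end(), base, startsAfter);
        if (pos == ranges_.begin() || (pos - 1)->base != base) return false;
        ranges_.erase(pos - 1);
        return true;
    }

    // Reader path: shared lock, then an O(log n) binary search.
    bool contains(std::uintptr_t p) const {
        std::shared_lock<std::shared_mutex> guard(mutex_);
        // First range whose base is greater than p; the candidate is the one before it.
        auto pos = std::upper_bound(ranges_.begin(), ranges_.end(), p, startsAfter);
        if (pos == ranges_.begin()) return false;  // p lies below every range
        const Range& r = *(pos - 1);               // last range with base <= p
        return p - r.base < r.size;
    }

private:
    static bool startsAfter(std::uintptr_t p, const Range& r) { return p < r.base; }

    mutable std::shared_mutex mutex_;
    std::vector<Range> ranges_;
};
```

Because the ranges are assumed not to overlap, only the range with the greatest start address not exceeding the pointer can contain it, so a single binary search is enough.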

jpfollenius answered Nov 06 '22 12:11


Exploring the replication idea a bit ...

From the correctness point of view, reader/writer locks will do the job. In practice, however, while readers can access the structure concurrently and in parallel, they will create heavy contention on the lock itself, for the simple reason that locking, even for read access, involves writing to the lock. This will kill performance on a multi-core system, and even more so on a multi-socket system.

The reason for the low performance is the cache line invalidation/transfer traffic between cores/sockets. (As a side note, here's a very recent and very interesting study on the subject: "Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask".)

Naturally, we can avoid the inter-core cache transfers triggered by readers by keeping a copy of the structure on each core and restricting reader threads to the copy local to the core they are currently executing on. This requires some mechanism for a thread to obtain its current core id. It also relies on the operating system scheduler not moving threads across cores gratuitously, i.e. maintaining core affinity to some extent. AFAICT, most current operating systems do this.

As for the writers, their job would be to update all the existing replicas, obtaining each replica's lock for writing in turn. Updating one tree at a time (apparently the structure should be some kind of tree) does mean a temporary inconsistency between replicas. From the problem description this seems acceptable. While a writer works, it blocks readers on a single core, but not all readers. The drawback is that a writer has to perform the same work many times: as many times as there are cores or sockets in the system.
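A rough sketch of the replication scheme, in C++ for illustration only. All names here (`Replica`, `ReplicatedRangeList`, the `coreId` parameter) are hypothetical. The sketch assumes 64-byte cache lines, non-overlapping ranges, and that the caller obtains `coreId` itself (for example from `sched_getcpu()` on Linux or `GetCurrentProcessorNumber()` on Windows). A sorted vector stands in for the tree to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

struct Range { std::uintptr_t base; std::size_t size; };

// One replica per core (or socket). alignas(64) is an assumption about the
// cache line size; it keeps two replicas' locks from sharing a line.
struct alignas(64) Replica {
    mutable std::shared_mutex lock;
    std::vector<Range> ranges;  // sorted by base, non-overlapping
};

class ReplicatedRangeList {
public:
    explicit ReplicatedRangeList(std::size_t replicaCount) : replicas_(replicaCount) {
        assert(replicaCount > 0);
    }

    // Reader: touches only the lock and data of its own core's replica.
    bool contains(std::size_t coreId, std::uintptr_t p) const {
        const Replica& r = replicas_[coreId % replicas_.size()];
        std::shared_lock<std::shared_mutex> guard(r.lock);
        auto pos = std::upper_bound(r.ranges.begin(), r.ranges.end(), p, startsAfter);
        if (pos == r.ranges.begin()) return false;
        return p - (pos - 1)->base < (pos - 1)->size;
    }

    // Writer: repeats the update on every replica, one lock at a time, so
    // replicas disagree briefly and only one core's readers block at once.
    void add(std::uintptr_t base, std::size_t size) {
        std::lock_guard<std::mutex> serialize(writers_);
        for (Replica& r : replicas_) {
            std::unique_lock<std::shared_mutex> guard(r.lock);
            auto pos = std::upper_bound(r.ranges.begin(), r.ranges.end(), base, startsAfter);
            r.ranges.insert(pos, Range{base, size});
        }
    }

    bool remove(std::uintptr_t base) {
        std::lock_guard<std::mutex> serialize(writers_);
        bool removed = false;
        for (Replica& r : replicas_) {
            std::unique_lock<std::shared_mutex> guard(r.lock);
            auto pos = std::upper_bound(r.ranges.begin(), r.ranges.end(), base, startsAfter);
            if (pos != r.ranges.begin() && (pos - 1)->base == base) {
                r.ranges.erase(pos - 1);
                removed = true;
            }
        }
        return removed;
    }

private:
    static bool startsAfter(std::uintptr_t p, const Range& r) { return p < r.base; }

    std::mutex writers_;             // keeps concurrent writers from interleaving
    std::vector<Replica> replicas_;  // sized once, never resized
};
```

The extra `writers_` mutex is a design choice of this sketch, not part of the description above: it ensures two writers apply their updates to the replicas in the same order. If a thread migrates between reading its core id and taking the lock, the lookup is still correct; it merely touches another core's cache line that one time.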

PS.

Maybe, just maybe, another alternative is some RCU-like approach, but I don't know this well, so I'll just stop after mentioning it :)
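For what it's worth, the general shape of an RCU-like (read-copy-update) scheme might look like the sketch below, again in C++ purely for illustration and with hypothetical names (`Snapshot`, `SnapshotRangeList`). Readers load a pointer to an immutable sorted snapshot and search it without taking any lock; a writer copies the snapshot, modifies the copy, and publishes it with a single atomic store. The big simplification, and an assumption to flag loudly: old snapshots are simply kept until the list is destroyed, whereas a real RCU implementation frees them once no reader can still be using them (a grace period). The ranges are assumed not to overlap.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

struct Range { std::uintptr_t base; std::size_t size; };
using Snapshot = std::vector<Range>;  // sorted by base; never modified once published

class SnapshotRangeList {
public:
    SnapshotRangeList() : current_(new Snapshot()) {}
    ~SnapshotRangeList() { delete current_.load(); }  // retired_ cleans up the rest
    SnapshotRangeList(const SnapshotRangeList&) = delete;
    SnapshotRangeList& operator=(const SnapshotRangeList&) = delete;

    // Reader: one atomic load, no lock, no write to shared memory.
    bool contains(std::uintptr_t p) const {
        const Snapshot* s = current_.load(std::memory_order_acquire);
        auto pos = std::upper_bound(s->begin(), s->end(), p, startsAfter);
        if (pos == s->begin()) return false;
        return p - (pos - 1)->base < (pos - 1)->size;
    }

    // Writer: copy the current snapshot, change the copy, publish it.
    void add(std::uintptr_t base, std::size_t size) {
        assert(size > 0);
        std::lock_guard<std::mutex> guard(writers_);
        const Snapshot* old = current_.load(std::memory_order_relaxed);
        std::unique_ptr<Snapshot> next(new Snapshot(*old));
        auto pos = std::upper_bound(next->begin(), next->end(), base, startsAfter);
        next->insert(pos, Range{base, size});
        publish(old, std::move(next));
    }

    bool remove(std::uintptr_t base) {
        std::lock_guard<std::mutex> guard(writers_);
        const Snapshot* old = current_.load(std::memory_order_relaxed);
        std::unique_ptr<Snapshot> next(new Snapshot(*old));
        auto pos = std::upper_bound(next->begin(), next->end(), base, startsAfter);
        if (pos == next->begin() || (pos - 1)->base != base) return false;
        next->erase(pos - 1);
        publish(old, std::move(next));
        return true;
    }

private:
    static bool startsAfter(std::uintptr_t p, const Range& r) { return p < r.base; }

    // Simplification: park the old snapshot instead of waiting for a grace
    // period, so memory grows with every write until destruction.
    void publish(const Snapshot* old, std::unique_ptr<Snapshot> next) {
        retired_.emplace_back(old);
        current_.store(next.release(), std::memory_order_release);
    }

    std::atomic<const Snapshot*> current_;
    std::mutex writers_;  // serializes writers only; readers never take it
    std::vector<std::unique_ptr<const Snapshot>> retired_;
};
```

A reader that loaded the old pointer just before a publish keeps searching the old snapshot, which is exactly the "briefly out of date" behaviour the question says is acceptable. Each write costs a full copy, which fits the stated workload of rare writes and very frequent reads.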

chill answered Nov 06 '22 12:11