Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multithreaded array of arrays?

I have a data structure which consists of 1,000 array elements, each array element is a smaller array of 8 ints:

std::array<std::array<int, 8>, 1000>

The data structure contains two "pointers", which track the largest and smallest populated array elements (within the "outer", 1000-element array). So for example they might be:

min = 247 
max = 842

How can I read and write to this data structure from multiple threads? I am worried about race conditions between pushing/popping elements and maintaining the two "pointers". My basic mode of operation is:

// Pop element from current index
// Calculate new index
// Write element to new index
// Update min and max "pointers"
like image 456
user997112 Avatar asked Oct 05 '15 11:10

user997112


1 Answers

You are correct that your current algorithm is not thread safe, there are a number of places where contention could occur.

This is impossible to optimize without more information though. You need to know where the slow-down is happening before you can improve it - and for that you need metrics. Profile your code and find out what bits are actually taking the time, because you can only gain by parallelizing those bits and even then you may find that it's actually memory or something else that is the limiting factor, not CPU.

The simplest approach will then be to just lock the entire structure for the full process. This will only work if the threads are doing a lot of other processing as well, if not you will actually lose performance compared to single threading.

After that you can consider having a separate lock for different sections of the data structure. You will need to properly analyse what you are using when and where and work out what would be useful to split. For example you might have chunks of the sub arrays with each chunk having its own lock.

Be careful of deadlocks in this situation though, you might have a thread claim 32 then want 79 while another thread already has 79 and then wants 32. Make sure you always claim locks in the same order.

The fastest option (if possible) may even be to give each thread it's own copy of the data structure, each processes 1/N of the work and then merge the results at the end. This way no synchronization is needed at all during processing.

But again it all comes back to the metrics and profiling. This is not a simple problem.

like image 135
Tim B Avatar answered Nov 01 '22 17:11

Tim B