I have a data structure which consists of 1,000 array elements; each element is a smaller array of 8 ints:
std::array<std::array<int, 8>, 1000>
The data structure contains two "pointers", which track the largest and smallest populated array elements (within the "outer", 1000-element array). So for example they might be:
min = 247
max = 842
How can I read and write to this data structure from multiple threads? I am worried about race conditions between pushing/popping elements and maintaining the two "pointers". My basic mode of operation is:
// Pop element from current index
// Calculate new index
// Write element to new index
// Update min and max "pointers"
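For concreteness, here is a minimal sketch of how that structure and operation might look. The struct, member, and function names are hypothetical placeholders, not taken from any real code:

    #include <array>
    #include <cstddef>

    // Hypothetical layout of the structure described above.
    struct Buckets {
        std::array<std::array<int, 8>, 1000> data{};
        std::size_t min = 247; // smallest populated outer index
        std::size_t max = 842; // largest populated outer index
    };

    // Placeholder for the application's index calculation.
    std::size_t compute_new_index(int value) {
        return static_cast<std::size_t>(value) % 1000;
    }

    // The unsynchronized operation: not safe as-is with multiple threads.
    void move_element(Buckets& b, std::size_t from, std::size_t slot) {
        int value = b.data[from][slot];            // pop element from current index
        std::size_t to = compute_new_index(value); // calculate new index
        b.data[to][slot] = value;                  // write element to new index
        if (to < b.min) b.min = to;                // update min and max "pointers"
        if (to > b.max) b.max = to;
    }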
You are correct that your current algorithm is not thread-safe; there are a number of places where race conditions could occur.
It is impossible to say how to optimize this without more information, though. You need to know where the slowdown is happening before you can improve it, and for that you need metrics. Profile your code and find out which parts are actually taking the time, because you can only gain by parallelizing those parts, and even then you may find that memory, or something else entirely, is the limiting factor rather than CPU.
The simplest approach will then be to lock the entire structure for the full pop/calculate/write/update sequence. This will only work well if the threads are doing a lot of other processing as well; if not, you will actually lose performance compared to single-threading.
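A minimal sketch of that coarse-grained approach, reusing the hypothetical Buckets type from the question: one mutex guards the data and both pointers, and the whole sequence runs under it.

    #include <mutex>

    std::mutex buckets_mtx; // single lock guarding data, min, and max together

    void move_element_locked(Buckets& b, std::size_t from, std::size_t slot) {
        // Holding the lock across pop/calculate/write/update makes the
        // whole sequence atomic with respect to the other threads.
        std::lock_guard<std::mutex> lock(buckets_mtx);
        int value = b.data[from][slot];
        std::size_t to = compute_new_index(value);
        b.data[to][slot] = value;
        if (to < b.min) b.min = to;
        if (to > b.max) b.max = to;
    }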
After that, you can consider having a separate lock for different sections of the data structure. You will need to analyse carefully what you are accessing, when, and where, and work out what would be useful to split. For example, you might divide the outer array into chunks of sub-arrays, with each chunk having its own lock.
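One possible sketch of that chunked approach: split the 1,000 outer elements into fixed-size stripes, each guarded by its own mutex. The chunk size of 50 here is an arbitrary choice for illustration, and note that the min/max pointers are global, so they would still need their own lock or atomics; deciding that split is exactly the analysis described above.

    #include <array>
    #include <cstddef>
    #include <mutex>

    constexpr std::size_t kOuter = 1000;
    constexpr std::size_t kChunk = 50;             // 20 stripes of 50 elements each
    constexpr std::size_t kStripes = kOuter / kChunk;

    struct StripedBuckets {
        std::array<std::array<int, 8>, kOuter> data{};
        std::array<std::mutex, kStripes> locks;    // one lock per stripe

        std::mutex& lock_for(std::size_t index) {  // map outer index -> stripe lock
            return locks[index / kChunk];
        }
    };

    void write_slot(StripedBuckets& b, std::size_t index, std::size_t slot, int value) {
        std::lock_guard<std::mutex> lock(b.lock_for(index));
        b.data[index][slot] = value;
    }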
Be careful of deadlocks in this situation, though: one thread might claim lock 32 and then want lock 79 while another thread already holds 79 and then wants 32. Make sure you always claim locks in the same order.
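A sketch of how that plays out when an element moves between two stripes of the StripedBuckets above. In C++17, std::scoped_lock acquires multiple mutexes using a deadlock-avoidance algorithm, which gives the same safety as enforcing a fixed locking order by hand:

    #include <mutex>

    void move_between_stripes(StripedBuckets& b,
                              std::size_t from, std::size_t to, std::size_t slot) {
        if (from / kChunk == to / kChunk) {
            // Same stripe: one lock covers both indices.
            std::lock_guard<std::mutex> lock(b.lock_for(from));
            b.data[to][slot] = b.data[from][slot];
            return;
        }
        // Different stripes: std::scoped_lock locks both mutexes without
        // deadlocking, whatever order other threads acquire them in.
        std::scoped_lock lock(b.lock_for(from), b.lock_for(to));
        b.data[to][slot] = b.data[from][slot];
    }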
The fastest option (if possible) may even be to give each thread its own copy of the data structure; each thread processes 1/N of the work, and the results are merged at the end. This way no synchronization is needed at all during processing.
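A sketch of that share-nothing variant, assuming the work can be partitioned up front and the merge is a simple element-wise combine; both assumptions depend on your actual workload, and do_work and merge_into are hypothetical placeholders:

    #include <array>
    #include <cstddef>
    #include <thread>
    #include <vector>

    using Outer = std::array<std::array<int, 8>, 1000>;

    // Placeholder: thread t fills every n-th outer element of its copy.
    void do_work(Outer& out, std::size_t t, std::size_t n) {
        for (std::size_t i = t; i < out.size(); i += n)
            out[i].fill(static_cast<int>(i));
    }

    // Placeholder merge: element-wise sum of a per-thread copy into the result.
    void merge_into(Outer& dst, const Outer& src) {
        for (std::size_t i = 0; i < dst.size(); ++i)
            for (std::size_t j = 0; j < 8; ++j)
                dst[i][j] += src[i][j];
    }

    Outer process_in_parallel(std::size_t num_threads) {
        std::vector<Outer> copies(num_threads);     // one private copy per thread
        std::vector<std::thread> workers;
        for (std::size_t t = 0; t < num_threads; ++t)
            workers.emplace_back([&copies, t, num_threads] {
                do_work(copies[t], t, num_threads); // no locks needed here
            });
        for (auto& w : workers) w.join();

        Outer result{};                             // merge at the end, single-threaded
        for (const auto& c : copies)
            merge_into(result, c);
        return result;
    }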
But again, it all comes back to metrics and profiling. This is not a simple problem.