Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can i lock a MUTEX for an element in the array, not for the complete array

Short version of the question: I have 2 functions that share the same array, when one is editing it, the other is reading it. However, the vector is long (5000 samples) and concurrent access rarely happens. but the Mutex contention on MUTEX1 is slowing down the program. '

How can I lock certain locations of the memory instead of the complete block in order to reduce contention?

EDIT: Note: I have to use updated G values whenever possible.

EDIT2: For example I have array G of length 5000. foo1 locks mutex1 to edit index 124. Although foo2 wants to edit index 2349, it cannot until foo1 releases mutex1.

is there a way I can move the contention of locking a mutex down to the element level? meaning: I want foo2 and foo1 to only contest on a the same mutex, only when they want to edit the same index. E.g: foo1 wants to edit index 3156, and foo2 wants to edit index 3156.

Long version with code explanation: I am writing a code for a complex mathematical function, and I am using pthreads to parallel the code and enhance the performance. The code is very complex and I can post it but I can post a model to the code.

Basically I have 2 arrays that I want to edit using 2 threads that run in parallel. One thread runs foo1 and the other runs foo2. However, they should run in a particular sequence and I use mutexes(_B,_A1, and _A2) to grantee the sequence. it goes as follows :

foo1 (first half)
foo2 (first half) and foo1 (second half) (in parallel)
foo1 (first half) and foo2 (second half) (in parallel)
...
foo2(second half)

then I would retrieve my results. In the first half of foo1 I will be using results in G1 that is might be edited at the same time by foo2. Therefore I use Mutex1 to protect it. same happens in foo2 for G. However, locking the complete vector for 1 value is very in efficient, they nearly never edit the same memory location at the same time. when I compare the results, it is almost always the same. I would like a way to lock one element at a time, so that they only contest the same element.

I will describe the code for people interested to know how it works:

#include <pthread.h>
#include <iostream>

using namespace std;

#define numThreads 2
#define Length 10000

pthread_t threads[numThreads];

pthread_mutex_t mutex1   = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t Mutex_B  = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t Mutex_A1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t Mutex_A2 = PTHREAD_MUTEX_INITIALIZER;

struct data_pointers
{
    double  *A;
    double  *B;
    double  *G;
    double  *L;
    int idxThread;
};

void foo1   (data_pointers &data);
void foo2   (data_pointers &data);

void *thread_func(void *arg){
    data_pointers data = *((data_pointers *) arg);
    if (data.idxThread==0)
        foo1 (data);
    else
        foo2 (data);
}

Up to here it is definitions and thread calling function, bare in mind that I define Length 10000 and numThreads 2

void foo1 ( data_pointers &data)
{
    double *A           = data.A;
    double *L           = data.L; 
    double *G           = data.G; 
    double U;

    for (int ijk =0;ijk<5;ijk++){
        /* here goes some definitions*/

        pthread_mutex_lock(&Mutex_A1);

        for (int k =0;k<Length;k++){
            pthread_mutex_lock(&mutex1); 
            U = G[k];
            pthread_mutex_unlock(&mutex1);
            /*U undergoes a lot of mathematical operations here


            */
        }

        pthread_mutex_lock(&Mutex_B);
        pthread_mutex_unlock(&Mutex_A2);
        for (int k =0;k<Length;k++){
            /*U another mathematical operations here


            */
            pthread_mutex_lock(&mutex1);
            L[k] = U;
            pthread_mutex_unlock(&mutex1);
            pthread_mutex_unlock(&Mutex_B);
        }
    }
}

in foo1 I lock mutexA1 and complete my work, then I lock MutexB and unlock MutexA2 so foo2 can start working. Note that main starts by locking MutexA2. This way I garantee foo1 started second half with mutexB locked, this way, foo2 cannot enter the second half of the function until foo1 unlocks mutexB

void foo2 (data_pointers &data)
{
    double *A           = data.A;
    double *L           = data.L; 
    double *G           = data.G; 
    double U;

    for (int ijk =0;ijk<5;ijk++){
        /* here goes some definitions*/

        pthread_mutex_lock(&Mutex_A1);

        for (int k =0;k<Length;k++){
            pthread_mutex_lock(&mutex1); 
            U = G[k];
            pthread_mutex_unlock(&mutex1);
            /*U undergoes a lot of mathematical operations here


            */
        }

        pthread_mutex_lock(&Mutex_B);
        pthread_mutex_unlock(&Mutex_A2);
        for (int k =0;k<Length;k++){        
            /*U another mathematical operations here


            */
            pthread_mutex_lock(&mutex1);
            L[k] = U;
            pthread_mutex_unlock(&mutex1);
            pthread_mutex_unlock(&Mutex_B);

        }
    }
}

Now, when foo1 unlocks mutexB it will have to wait for foo2 to unlock mutexA1 so it can work, foo2 will only unlock mutexA2 when it already unlocked mutexB.

this goes on and on 5 times.

int main(){
    double G1[Length];
    double G2[Length];
    double B1[Length];
    double B2[Length];
    double A2[Length];
    double A1[Length];
    data_pointers data[numThreads];

    data[0].L           = G2;
    data[0].G           = G1;   
    data[0].A           = A1;
    data[0].B           = B1;
    data[0].idxThread   = 0;

    data[1].L           = G1;
    data[1].G           = G2;   
    data[1].A           = A2;
    data[1].B           = B2;
    data[1].idxThread   = 1;

    pthread_mutex_lock(&Mutex_A2);

    pthread_create(&(threads[0]), NULL, thread_func, (void *) &(data[0]));
    pthread_create(&(threads[1]), NULL, thread_func, (void *) &(data[1]));
    pthread_join(threads[1], NULL);
    pthread_join(threads[0], NULL);

    pthread_mutex_unlock(&Mutex_A1);
    pthread_mutex_unlock(&Mutex_A2);

    return 0;
}

note this is only an example code. compiles and works as intended, but with no output.

LAST EDIT: Thank you all for the great ideas, I had a lot of experience, and fun following those suggestions. I will up vote all answers as they were useful, and pick the closest to the original question (atomicity)

like image 577
user304584 Avatar asked Sep 17 '15 16:09

user304584


2 Answers

If you do not resize your arrays, you do not need any mutexes on individual elements or whole array.

Read your values atomically, write your values atomically and stay calm.

like image 24
SergeyA Avatar answered Oct 27 '22 00:10

SergeyA


Sample code of using an atomic pointer to 'lock' certain locations in memory:

#include <vector>
#include <atomic>
#include <thread>

using container = std::vector<std::atomic<double>>;
using container_size_type = container::size_type;

container c(300);

std::atomic<container::pointer> p_busy_elem{ nullptr };

void editor()
{
    for (container_size_type i{ 0 }, sz{ c.size() }; i < sz; ++i)
    {
        p_busy_elem.exchange(&c[i]); // c[i] is busy
        // ... edit c[i] ... // E: calculate a value and assign it to c[i]
        p_busy_elem.exchange(nullptr); // c[i] is no longer busy
    }
}

void reader()
{
    for (container_size_type i{ 0 }, sz{ c.size() }; i < sz; ++i)
    {
        // A1: wait for editor thread to finish editing value
        while (p_busy_elem == &c[i])
        {
            // A2: room a better algorithm to prevent blocking/yielding
            std::this_thread::yield();
        }

        // B: if c[i] is updated in between A and B, this will load the latest value
        auto value = c[i].load();

        // C: c[i] might have changed by this time, but we had the most up to date value we could get without checking again
        // ... use value ...
    }
}

int main()
{
    std::thread t_editor{ editor };
    std::thread t_reader{ reader };
    t_editor.join();
    t_reader.join();
}

In the editor thread, the busy pointer is set to indicate that that memory location is currently being edited (E). If thread B attempts to read that value after the busy pointer is set, it will wait until the editing is done before proceeding (A1).

Note on A2: A better system could be placed here. A list of nodes that were busy when a read was attempted could be kept, we would then add i to that list and attempt to process the list at a later time. Benefit: the loop could be told to execute a continue and indices past the currently being edited i would be read.

A copy of the value to read is made (B) in order to use it (C) however needed. This is the last time we can check for the latest value at c[i].

like image 147
bku_drytt Avatar answered Oct 27 '22 00:10

bku_drytt