 

How to parallelize do-while and while loops in OpenMP?

I'm trying to learn parallel programming with OpenMP, and I'm interested in parallelizing the following do-while loop, which has several while loops inside it:

do {
        while(left < (length - 1) && data[left] <= pivot) left++;
        while(right > 0 && data[right] >= pivot) right--;

        /* swap elements */
        if(left < right){
            temp = data[left];
            data[left] = data[right];
            data[right] = temp;
        }

    } while(left < right);

I haven't figured out how to parallelize while and do-while loops, and I couldn't find any resource that specifically describes how to do so. I have found instructions for for loops, but I couldn't infer anything about while and do-while loops from them. So, could you please describe how I can parallelize the loops I provided here?

EDIT

I have transformed the do-while loop into the following code, which uses only for loops.

for(i = 1; i<length-1; i++)
{
    if(data[left] > pivot)
    {
        i = length;
    }
    else
    {
        left = i;
    }

}

for(j=length-1; j > 0; j--)
{
    if(data[right] < pivot)
    {
        j = 0;
    }
    else
    {
        right = j;
    }
}

/* swap elements */
if(left < right)
{
    temp = data[left];
    data[left] = data[right];
    data[right] = temp;
}

int leftCopy = left;
int rightCopy = right;

for(int leftCopy = left; leftCopy<right;leftCopy++)
{
    for(int new_i = left; new_i<length-1; new_i++)
    {
        if(data[left] > pivot)
        {
            new_i = length;
        }
        else
        {
            left = new_i;
        }
    }

    for(int new_j=right; new_j > 0; new_j--)
    {
        if(data[right] < pivot)
        {
            new_j = 0;
        }
        else
        {
            right = new_j;
        }
    }
    leftCopy = left;
    /* swap elements */
    if(left < right)
    {
        temp = data[left];
        data[left] = data[right];
        data[right] = temp;
    }
}

This code works fine and produces the correct result, but when I tried to parallelize parts of it by changing the first two for loops to the following:

#pragma omp parallel default(none) firstprivate(left) private(i,tid) shared(length, pivot, data)
    {
#pragma omp for
        for(i = 1; i<length-1; i++)
        {
            if(data[left] > pivot)
            {
                i = length;
            }
            else
            {
                left = i;
            }
        }
    }


#pragma omp parallel default(none) firstprivate(right) private(j) shared(length, pivot, data)
    {
#pragma omp for
        for(j=length-1; j > 0; j--)
        {
            if(data[right] < pivot)
            {
                j = 0;
            }
            else
            {
                right = j;
            }
        }
    }

the parallel version runs slower than the non-parallelized code. Please help me identify the problem.

Thanks

asked Nov 03 '14 by the_naive

1 Answer

First of all, sorting algorithms are very hard to parallelize with OpenMP parallel loops. This is because the loop trip count is not known in advance; it depends on the input values that are read in every iteration.

I don't think loop conditions such as data[left] <= pivot are going to work well, since the OpenMP runtime cannot know how to partition the iteration space among the threads.

If you are still interested in parallel sorting algorithms, I suggest you read the literature first, to see which algorithms are really worth implementing for their scalability. If you just want to learn OpenMP, I suggest you start with simpler algorithms such as bucket sort, where the number of buckets is well known and does not change frequently.
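As an illustration of why bucket sort is friendlier to OpenMP: once elements are scattered into buckets, each bucket can be sorted independently, so the per-bucket loop has a fixed trip count. Here is a minimal sketch; the function name, the bucket-index formula, and the use of qsort per bucket are my own choices, not from the question:

```c
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int cmp_int(const void *a, const void *b)
{
    return (*(const int *)a - *(const int *)b);
}

/* Sort n integers in the range [0, max) using nbuckets buckets.
   Only the per-bucket sorting phase is parallelized, because the
   buckets are independent and their count is fixed up front. */
void parallel_bucket_sort(int *data, int n, int max, int nbuckets)
{
    int *counts  = calloc(nbuckets, sizeof(int));
    int *offsets = malloc((nbuckets + 1) * sizeof(int));
    int *scratch = malloc(n * sizeof(int));
    int *next    = malloc(nbuckets * sizeof(int));

    /* Count how many elements fall into each bucket. */
    for (int i = 0; i < n; i++)
        counts[data[i] * nbuckets / max]++;

    /* Prefix sum gives each bucket's start offset in scratch. */
    offsets[0] = 0;
    for (int b = 0; b < nbuckets; b++)
        offsets[b + 1] = offsets[b] + counts[b];
    memcpy(next, offsets, nbuckets * sizeof(int));

    /* Scatter elements into their buckets (sequential: next[] is shared). */
    for (int i = 0; i < n; i++) {
        int b = data[i] * nbuckets / max;
        scratch[next[b]++] = data[i];
    }

    /* Each bucket is independent, so this loop parallelizes cleanly. */
    #pragma omp parallel for
    for (int b = 0; b < nbuckets; b++)
        qsort(scratch + offsets[b], counts[b], sizeof(int), cmp_int);

    memcpy(data, scratch, n * sizeof(int));
    free(counts); free(offsets); free(scratch); free(next);
}
```

Because the bucket index is monotone in the value, concatenating the sorted buckets yields a fully sorted array; no merge step is needed.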

Regarding the example you are trying to parallelize: while loops are not directly supported by OpenMP because the number of iterations (the loop trip count) is not known in advance (otherwise, they are easy to transform into for loops). Therefore, it is not possible to distribute the iterations among the threads. In addition, it is common for a while loop to check a condition using the previous iteration's result. This is called a read-after-write, or true, dependency and cannot be parallelized.
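To make the distinction concrete: a while loop whose trip count is known before it starts can be mechanically rewritten as a for loop and then parallelized. This sketch (the function names are made up for illustration) sums an array both ways; the reduction clause is what prevents a data race on the accumulator:

```c
#include <omp.h>

/* The sequential form: a while loop, but its trip count (n) is
   already known before the loop begins. */
long sum_while(const int *data, int n)
{
    long total = 0;
    int i = 0;
    while (i < n) {
        total += data[i];
        i++;
    }
    return total;
}

/* The same loop rewritten as a for loop, which OpenMP can split
   among threads; reduction(+:total) merges the partial sums. */
long sum_omp(const int *data, int n)
{
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i++)
        total += data[i];
    return total;
}
```

Contrast this with your loops: their exit depends on data[left] and data[right] values computed as the loop runs, so no such rewrite exists for them.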

Your slowdown problem might be alleviated if you minimize the number of omp parallel regions and move them out of your loops. These regions create and join the threads used in the parallel parts of the code, which is expensive.

You can still synchronize threads inside a parallel block, so the outcome is the same. In fact, all threads wait for each other at the end of an omp for construct by default, which makes this even easier:

#pragma omp parallel default(none) firstprivate(right,left) private(i,j) shared(length, pivot, data)
{
    #pragma omp for
    for(i = 1; i<length-1; i++)
    {
        if(data[left] > pivot)
        {
            i = length;
        }
        else
        {
            left = i;
        }
    }

    #pragma omp for 
    for(j=length-1; j > 0; j--)
    {
        if(data[right] < pivot)
        {
            j = 0;
        }
        else
        {
            right = j;
        }
    }
} // end omp parallel 
answered Oct 13 '22 by Jorge Bellon