
Use OpenMP in C++11 to find the maximum of the calculated values

Tags: c++, c++11, openmp

I want to find the maximum of the values calculated inside a for loop and also store the corresponding index (max_calc_value and i_max in the pseudo code below). Is it possible to do some kind of reduction here?

double max_calc_value = -DBL_MAX; // minimum double value
#pragma omp parallel for
for (int i = 20; i < 1000; i++) {
    this_value = my_slow_function(large_double_vector_array, param1*i, .., param5+i);
    if (this_value > max_calc_value){
        max_calc_value = this_value;
        i_max = i;
    }
}
asked Dec 24 '22 00:12 by sap



2 Answers

You can define a custom reduction and use it in the parallel construct. For your specific example, that might make the code a bit more cumbersome than simply using a critical section. It pays off, however, if the rest of your code can also use the same reduction function, not only for the final parallel reduction but also for the local ones. In case that applies to you, here is an example of how it works:

#include <iostream>
#include <omp.h>

struct dbl_int {
    double val;
    int idx;
};

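// Returns whichever pair holds the larger value
const dbl_int& max( const dbl_int& a, const dbl_int& b) {
    return a.val > b.val ? a : b;
}

// Custom reduction "maxVal" on dbl_int: merges a partial result (omp_in) into omp_out
#pragma omp declare reduction( maxVal: dbl_int: omp_out=max( omp_out, omp_in ) )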

int main() {
    dbl_int di = { -100., -1 };
    #pragma omp parallel num_threads( 10 ) reduction( maxVal: di )
    {
        di.val = omp_get_thread_num() % 7;
        di.idx = omp_get_thread_num();
    }
    std::cout << "Upon exit, value=" << di.val << " and index=" << di.idx << std::endl;
    return 0;
}

For me, this gives:

~/tmp $ g++ -fopenmp myred.cc -o myred
~/tmp $ ./myred
Upon exit, value=6 and index=6
answered Dec 26 '22 14:12 by Gilles



The best way to handle it is to define a custom reduction operation as shown in Gilles' answer. If your compiler only supports OpenMP 3.1 or earlier (custom reduction operations were introduced in OpenMP 4.0), then the proper solution is to perform a local reduction in each thread and then combine the local results sequentially:

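double max_calc_value = -DBL_MAX; // lowest finite double (DBL_MAX is in <cfloat>)
int i_max = -1;                   // index of the maximum (shared result)
#pragma omp parallel
{
    // Per-thread running maximum and its index
    int my_i_max = -1;
    double my_value = -DBL_MAX;

    #pragma omp for
    for (int i = 20; i < 1000; i++) {
        // Declared inside the loop so that it is private to each thread;
        // a variable declared outside the parallel region would be shared and raced on
        const double this_value = my_slow_function(large_double_vector_array, param1*i, .., param5+i);
        if (this_value > my_value){
            my_value = this_value;
            my_i_max = i;
        }
    }

    // Each thread merges its local result into the shared one, one thread at a time
    #pragma omp critical
    {
        if (my_value > max_calc_value) {
            max_calc_value = my_value;
            i_max = my_i_max;
        }
    }
}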

This minimises the synchronisation overhead of the critical construct, because each thread enters it only once instead of on every iteration, and it shows in a simplified way how the reduction clause is actually implemented.

answered Dec 26 '22 14:12 by Hristo Iliev
