I want to find the maximum of the values calculated inside a for loop and also store the corresponding index (max_calc_value and i_max below). My pseudocode is shown here. Is it possible to do some kind of reduction for this?
double max_calc_value = -DBL_MAX; // minimum double value
#pragma omp parallel for
for (int i = 20; i < 1000; i++) {
this_value = my_slow_function(large_double_vector_array, param1*i, .., param5+i);
if (this_value > max_calc_value){
max_calc_value = this_value;
i_max = i;
}
}
If you feel like it, you can define a custom reduction and use it in the parallel region. In your specific example, that might just make the code a bit more cumbersome than simply using a critical section. However, it can pay off if your actual code benefits from the custom reduction function throughout, not only for the final parallel reduction but also for the local ones.
So in case it applies to you, here is an example of how it works:
#include <iostream>
#include <omp.h>
struct dbl_int {
double val;
int idx;
};
const dbl_int& max( const dbl_int& a, const dbl_int& b) {
return a.val > b.val ? a : b;
}
#pragma omp declare reduction( maxVal: dbl_int: omp_out=max( omp_out, omp_in ) )
int main() {
dbl_int di = { -100., -1 };
#pragma omp parallel num_threads( 10 ) reduction( maxVal: di )
{
di.val = omp_get_thread_num() % 7;
di.idx = omp_get_thread_num();
}
std::cout << "Upon exit, value=" << di.val << " and index=" << di.idx << std::endl;
return 0;
}
This gives me:
~/tmp $ g++ -fopenmp myred.cc -o myred
~/tmp $ ./myred
Upon exit, value=6 and index=6
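Applied to a loop like the one in the question, the same idea might look like the sketch below. It is only an illustration under a few assumptions: `stand_in_function` is a hypothetical placeholder for `my_slow_function`, `max_pair` and `find_max` are names made up for this example, and the `initializer` clause is added so that each thread's private copy starts from the original value instead of being left uninitialized.

```cpp
#include <cfloat>

// Value/index pair carried through the reduction.
struct dbl_int {
    double val;
    int idx;
};

// Combiner (hypothetical name): keep a unless b is strictly larger.
// If several iterations share the maximum value, which index wins can
// depend on how the iterations are split among the threads.
inline dbl_int max_pair(const dbl_int& a, const dbl_int& b) {
    return b.val > a.val ? b : a;
}

// Each private copy starts from the value the variable had before the region.
#pragma omp declare reduction(maxVal : dbl_int : omp_out = max_pair(omp_out, omp_in)) initializer(omp_priv = omp_orig)

// Hypothetical stand-in for my_slow_function: peaks at i == 321.
double stand_in_function(int i) {
    double d = i - 321;
    return 100.0 - d * d;
}

// Returns the largest value over [first, last) together with its index.
dbl_int find_max(int first, int last) {
    dbl_int best = { -DBL_MAX, -1 };
    #pragma omp parallel for reduction(maxVal : best)
    for (int i = first; i < last; i++) {
        dbl_int cand = { stand_in_function(i), i };
        best = max_pair(best, cand);  // updates the thread-private copy
    }
    return best;
}
```

With this stand-in, `find_max(20, 1000)` should return the value 100 at index 321. Compiled without `-fopenmp`, the pragmas are ignored and the loop simply runs serially with the same result.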
The best way to handle it is to define a custom reduction operation as shown in Gilles' answer. If your compiler only supports OpenMP 3.1 or earlier (custom reduction operations were introduced in OpenMP 4.0), then the proper solution is to perform local reduction in each thread and then sequentially combine the local reductions:
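double max_calc_value = -DBL_MAX; // lowest finite double; DBL_MAX comes from <cfloat>
int i_max = -1;
#pragma omp parallel
{
    // Per-thread running maximum and its index
    int my_i_max = -1;
    double my_value = -DBL_MAX;
    #pragma omp for
    for (int i = 20; i < 1000; i++) {
        // Declared inside the loop so that every thread has its own copy
        double this_value = my_slow_function(large_double_vector_array, param1*i, .., param5+i);
        if (this_value > my_value){
            my_value = this_value;
            my_i_max = i;
        }
    }
    // Each thread merges its local result into the shared one exactly once
    #pragma omp critical
    {
        if (my_value > max_calc_value) {
            max_calc_value = my_value;
            i_max = my_i_max;
        }
    }
}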
This minimises the synchronisation overhead from the critical construct, since each thread enters it only once instead of once per iteration, and in a simplified way shows how the reduction clause is actually implemented.