I am experimenting with OpenMP. I wrote some code to check its performance. On a single 4-core Intel CPU running Kubuntu 11.04, the following program compiled with OpenMP is around 20 times slower than the same program compiled without OpenMP. Why?
I compiled it with g++ -g -O2 -funroll-loops -fomit-frame-pointer -march=native -fopenmp
#include <math.h>
#include <iostream>
using namespace std;

int main()
{
    long double i = 0;
    long double k = 0.7;
    #pragma omp parallel for reduction(+:i)
    for (int t = 1; t < 300000000; t++) {
        for (int n = 1; n < 16; n++) {
            i = i + pow(k, n);
        }
    }
    cout << i << "\t";
    return 0;
}
The problem is that the variable k is treated as a shared variable, so it has to be kept in sync between the threads. A possible solution is to declare k inside the loop, which makes it private to each thread:
#include <math.h>
#include <iostream>
using namespace std;

int main()
{
    long double i = 0;
    #pragma omp parallel for reduction(+:i)
    for (int t = 1; t < 300000000; t++) {
        long double k = 0.7;
        for (int n = 1; n < 16; n++) {
            i = i + pow(k, n);
        }
    }
    cout << i << "\t";
    return 0;
}
Following Martin Beckett's hint in the comments, instead of declaring k inside the loop you can also declare it const and outside the loop.
Otherwise, ejd is correct: the problem here does not seem to be bad parallelization, but bad optimization when the code is parallelized. Remember that gcc's OpenMP implementation is still fairly young and far from optimal.
Fastest code:
for (int i = 0; i < 100000000; i ++) {;}
Slightly slower code:
#pragma omp parallel for num_threads(1)
for (int i = 0; i < 100000000; i ++) {;}
2-3 times slower code:
#pragma omp parallel for
for (int i = 0; i < 100000000; i ++) {;}
no matter what is in between { and }. A simple ; or a more complex computation gives the same results. I compiled under Ubuntu 13.10 64-bit, using both gcc and g++ with various parameters (-ansi -pedantic-errors -Wall -Wextra -O3), and ran on an Intel quad-core at 3.5 GHz.
I guess thread management overhead is at fault? It doesn't seem smart for OpenMP to create a thread every time you need one and destroy it afterwards. I thought there would be four (or eight) threads either running whenever needed or sleeping.