OpenMP and C parallel for loop: why does my code slow down when using OpenMP?

Tags: c, gcc, openmp

I'm new here and a beginner-level C programmer. I'm having a problem using OpenMP to speed up a for loop. Below is a simple example:

#include <stdlib.h>
#include <stdio.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>   /* declares gsl_ran_gamma_mt */
#include <omp.h>

gsl_rng *rng;

int main(void)
{
    int i, M = 100000000;
    double tmp;

    /* initialize RNG */
    gsl_rng_env_setup();
    rng = gsl_rng_alloc(gsl_rng_taus);
    gsl_rng_set(rng, (unsigned long int)791526599);

    // option 1: parallel
    #pragma omp parallel for default(shared) private(i, tmp) schedule(dynamic)
    for (i = 0; i <= M - 1; i++) {
        tmp = gsl_ran_gamma_mt(rng, 4, 1./3);
    }

    // option 2: sequential
    for (i = 0; i <= M - 1; i++) {
        tmp = gsl_ran_gamma_mt(rng, 4, 1./3);
    }

    gsl_rng_free(rng);
    return 0;
}

The code draws from a gamma distribution M times. The parallel version with OpenMP (option 1) takes about 1 minute, while the sequential version (option 2) takes only 20 seconds. While the OpenMP version is running, CPU usage is 800% (the server I'm using has 8 CPUs). The system is Linux with GCC 4.1.3. The compile command I'm using is gcc -fopenmp -lgsl -lgslcblas -lm (I'm using GSL).

Am I doing something wrong? Please help me! Thanks!

P.S. As some users pointed out, this might be caused by rng. But even if I replace

tmp=gsl_ran_gamma_mt(rng, 4, 1./3 );

with, say,

tmp=1000*10000;

the problem is still there.

user1620200 asked Aug 23 '12 15:08




3 Answers

gsl_ran_gamma_mt probably locks on rng to prevent concurrency issues (if it doesn't, your parallel code probably contains a race condition and thus yields wrong results). The solution would then be to give each thread its own rng instance, which avoids the locking.

Konrad Rudolph answered Oct 05 '22 14:10


Your rng variable is shared, so the threads are spending all their time waiting to be able to use the random number generator. Give each thread a separate instance of the RNG. This will probably mean making the RNG initialization code run in parallel as well.

Greg Inozemtsev answered Oct 05 '22 14:10
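
The per-thread generator idea from the two answers above might look roughly like the sketch below. To keep it self-contained, it uses a small stand-in generator (xorshift64*) instead of GSL. The names next_uniform and mean_of_draws and the seeding scheme are hypothetical and only for illustration; the seeding is not guaranteed to give statistically independent streams. With GSL, the private state would presumably be a gsl_rng * allocated with gsl_rng_alloc and seeded with gsl_rng_set inside the parallel region, then released with gsl_rng_free.

```c
#include <stdint.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Stand-in generator (xorshift64*), NOT the GSL Tausworthe generator.
   Its whole state is one 64-bit word owned by the calling thread. */
static double next_uniform(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    *state = x;
    /* Keep the top 53 bits of the scrambled value and scale to [0, 1). */
    return (double)((x * 0x2545F4914F6CDD1DULL) >> 11) / 9007199254740992.0;
}

/* Hypothetical helper: mean of n uniform draws, one generator per thread. */
double mean_of_draws(long n, uint64_t base_seed)
{
    double sum = 0.0;

    #pragma omp parallel reduction(+:sum)
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        /* Private state, seeded differently in each thread (illustrative
           seeding only). With GSL, gsl_rng_alloc() and gsl_rng_set()
           would be called here instead. */
        uint64_t state = base_seed + 0x9E3779B97F4A7C15ULL * (uint64_t)(tid + 1);
        if (state == 0)
            state = 1;  /* xorshift state must be nonzero */

        long i;
        #pragma omp for schedule(static)
        for (i = 0; i < n; i++)
            sum += next_uniform(&state);  /* touches only this thread's state */

        /* With GSL, gsl_rng_free() would go here. */
    }
    return n > 0 ? sum / (double)n : 0.0;
}
```

Because each thread only reads and writes its own state, no thread waits on or overwrites another thread's generator. The _OPENMP guards let the same file build without -fopenmp, in which case it simply runs on one thread.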


Again thanks everyone for helping. I just found out that if I get rid of

schedule(dynamic)

in the code, the problem disappears. But why is that?

user1620200 answered Oct 05 '22 13:10
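
One likely explanation for the schedule(dynamic) observation, not confirmed anywhere in this thread: when no chunk size is given, schedule(dynamic) hands out iterations one at a time, so each of the 100000000 iterations requires a synchronized request for more work. When the loop body is as cheap as a single assignment, that bookkeeping can easily cost more than the work itself. A static schedule splits the range once, and a dynamic schedule with a large chunk makes far fewer requests. The sketch below shows both variants; the function names and the chunk size used in the note after it are hypothetical.

```c
/* Sum 0..n-1 with the iteration range split among threads once, up front. */
long long sum_static(long n)
{
    long long sum = 0;
    long i;
    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (i = 0; i < n; i++)
        sum += i;
    return sum;
}

/* Same loop, but each work request hands a thread `chunk` iterations
   instead of the default of 1. `chunk` must be positive. */
long long sum_dynamic_chunked(long n, int chunk)
{
    long long sum = 0;
    long i;
    #pragma omp parallel for schedule(dynamic, chunk) reduction(+:sum)
    for (i = 0; i < n; i++)
        sum += i;
    return sum;
}
```

Calling sum_dynamic_chunked(n, 10000), for example, cuts the number of scheduling requests by a factor of 10000 compared with a plain schedule(dynamic). Dynamic scheduling tends to pay off only when iterations vary a lot in cost, which is not the case in the loop from the question.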