Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

OpenMP recursive tasks

Consider following Program calculating Fibonacci Numbers.
It uses OpenMP Tasks for parallelisation.

#include <iostream> 
#include <omp.h>

using namespace std;

int fib(int n)
{
    if(n == 0 || n == 1)
        return n;

    int res, a, b;
    #pragma omp parallel
    {
        #pragma omp single 
        {
            #pragma omp task shared(a)
            a = fib(n-1);
            #pragma omp task shared(b)
            b = fib(n-2);
            #pragma omp taskwait
            res = a+b;
        } 

    }
    return res;
  }

int main()
{  
    cout << fib(40);    
}

I use gcc version 4.8.2 and Fedora 20.
When compiling the above program with g++ -fopenmp name_of_program.cpp -Wall and running it, I see when looking into htop that only two (sometimes 3) threads are running. The machine I'm running this program on has 8 logical CPUs. My question is, what do I need to do to offload the work onto 8 Threads. I tried export OMP_NESTED=TRUE, but this leads to following error while running the Program:
libgomp: Thread creation failed: Resource temporarily unavailable
The point of my program is not to efficiently compute Fibonacci Numbers, but to use tasks or something similar in OpenMP.

like image 711
user3457151 Avatar asked Mar 24 '14 21:03

user3457151


1 Answers

With OMP_NESTED=FALSE, a team of threads is assigned to the top-level parallel region, and no extra threads at each nested level, so at most two threads will be doing useful work.

With OMP_NESTED=TRUE, a team of threads is assigned at each level. On your system there are 8 logical CPUs, so the team size is likely 8. The team includes one thread from outside the region, so only 7 new threads are launched. The recursion tree for fib(n) has about fib(n) nodes. (A nice self-referential property of fib!) Thus the code might create 7*fib(n) threads, which can quickly exhaust resources.

The fix is to use a single parallel region around the entire task tree. Move the omp parallel and omp single logic to main, outside of fib. That way a single thread team will work on the entire task tree.

The general point is to distinguish potential parallelism from actual parallelism. The task directives specify potential parallelism, which might or might not actually be used during an execution. An omp parallel (for all practical purposes) specifies actual parallelism. Usually you want the actual parallelism to match the available hardware, so as not to swamp the machine, but have the potential parallelism be much larger, so that the run-time can balance load.

like image 148
Arch D. Robison Avatar answered Oct 25 '22 19:10

Arch D. Robison