OpenMP: Get total number of running threads

I need to know the total number of threads that my application has spawned via OpenMP. Unfortunately, the omp_get_num_threads() function does not work here since it only yields the number of threads in the current team.

However, my code runs recursively (divide and conquer, basically) and I want to spawn new threads as long as there are still idle processors, but no more.

Is there a way to get around the limitations of omp_get_num_threads and get the total number of running threads?

If more detail is required, consider the following pseudo-code that models my workflow quite closely:

function divide_and_conquer(Job job, int total_num_threads):
  if job.is_leaf(): # Recurrence base case.
    job.process()
    return

  left, right = job.divide()

  current_num_threads = omp_get_num_threads()
  if current_num_threads < total_num_threads: # (1)
    #pragma omp parallel num_threads(2)
      #pragma omp section
        divide_and_conquer(left, total_num_threads)
      #pragma omp section
        divide_and_conquer(right, total_num_threads)

  else:
    divide_and_conquer(left, total_num_threads)
    divide_and_conquer(right, total_num_threads)

  job = merge(left, right)

If I call this code with a total_num_threads value of 4, the conditional annotated with (1) will always evaluate to true (because each thread team will contain at most two threads) and thus the code will always spawn two new threads, no matter how many threads are already running at a higher level.

I am searching for a platform-independent way of determining the total number of threads that are currently running in my application.

Konrad Rudolph asked Jan 16 '11 16:01



2 Answers

I don't think there is any such routine, at least in OpenMP 3; and if there were, I'm not sure it would help, as there's obviously a huge race condition between counting the number of threads and forking. You could end up overshooting your target number of threads by almost a factor of 2 if every thread sees that there's room for one more thread and then every one of them spawns a thread.

If this really is the structure of your program, though, and you just want to limit the total number of threads, there are options (all of these are OpenMP 3.0):

  1. Use the OMP_THREAD_LIMIT environment variable to limit the total number of OpenMP threads
  2. Use OMP_MAX_ACTIVE_LEVELS, or omp_set_max_active_levels(), or test against omp_get_level(), to limit how deeply nested your threads are; if you only want 16 threads, limit to 4 levels of nesting
  3. If you want finer control than powers of two, you can use omp_get_level() to find your level, and call omp_get_ancestor_thread_num(int level) at various levels to find out which thread was your parent, grandparent, etc., and from that (using this simple left-right forking) determine a global thread ID. (I think in this case it would go something like $\sum_{l=0}^{L-1} a_l \, 2^{L-l}$, where l is the level number starting at 0 and a_l is the ancestor thread number at that level.) This would let you (say) allow threads 0-3 to fork but not 4-7, so that you'd end up with 12 rather than 16 threads. I think this only works in such a regular situation; if each parent thread forked a different number of child threads, I don't think you could determine a unique global thread ID because it looks like you can only query your direct ancestors.
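
As a rough illustration of option 2, here is a minimal C sketch, assuming the "job" is simply summing an array. The names sum_range and parallel_sum and the leaf size of 4 are made up for this example and are not from the question. It forks two sections per level only while omp_get_level() is below a chosen depth, so with this binary forking max_levels = 2 should allow at most 4 threads. The #ifdef fallbacks are only there so that the sketch also builds serially without OpenMP.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP support. */
static int omp_get_level(void) { return 0; }
static void omp_set_max_active_levels(int levels) { (void)levels; }
#endif

/* Hypothetical stand-in for the job: sum the half-open range data[lo, hi). */
static long sum_range(const int *data, size_t lo, size_t hi, int max_levels)
{
    assert(lo <= hi);

    if (hi - lo <= 4) {                 /* recurrence base case (assumed leaf size) */
        long s = 0;
        for (size_t i = lo; i < hi; ++i)
            s += data[i];
        return s;
    }

    size_t mid = lo + (hi - lo) / 2;    /* divide */
    long left = 0, right = 0;

    /* omp_get_level() counts the enclosing parallel regions, so this
       forks only while we are fewer than max_levels deep. */
    if (omp_get_level() < max_levels) {
        #pragma omp parallel sections num_threads(2)
        {
            #pragma omp section
            left = sum_range(data, lo, mid, max_levels);
            #pragma omp section
            right = sum_range(data, mid, hi, max_levels);
        }
    } else {
        left = sum_range(data, lo, mid, max_levels);
        right = sum_range(data, mid, hi, max_levels);
    }

    return left + right;                /* merge */
}

/* Entry point: allow nesting up to max_levels, then recurse. */
static long parallel_sum(const int *data, size_t n, int max_levels)
{
    omp_set_max_active_levels(max_levels);
    return sum_range(data, 0, n, max_levels);
}
```

Calling parallel_sum(data, n, 2) would then cap the recursion at two nested levels. On older runtimes, nested parallelism may additionally need to be switched on (for example with OMP_NESTED=true or omp_set_nested(1)); if it is not, the inner regions just run with one thread each and the result should still be correct.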
Jonathan Dursi answered Oct 06 '22 00:10


The code you have shown has a problem: an "omp section" has to be within the lexical scope of an "omp sections" construct. I am assuming that you meant the "omp parallel" to be an "omp parallel sections". The other way to do this is to use "omp task", and then you don't have to keep count of the number of threads. You would just give the parallel region its threads once and let the OpenMP implementation assign the tasks to those threads.
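
A minimal sketch of the task-based variant, again assuming the job is just summing an array (sum_tasks, parallel_sum_tasks, and the leaf size of 4 are hypothetical names and values for illustration): one parallel region creates the team, a single thread starts the recursion, and each half becomes a task that any idle thread in the team may pick up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the job: sum the half-open range data[lo, hi). */
static long sum_tasks(const int *data, size_t lo, size_t hi)
{
    assert(lo <= hi);

    if (hi - lo <= 4) {                 /* recurrence base case (assumed leaf size) */
        long s = 0;
        for (size_t i = lo; i < hi; ++i)
            s += data[i];
        return s;
    }

    size_t mid = lo + (hi - lo) / 2;    /* divide */
    long left = 0, right = 0;

    /* left and right are locals, so they would be firstprivate in the
       tasks by default; shared() lets each task write its result back. */
    #pragma omp task shared(left)
    left = sum_tasks(data, lo, mid);

    #pragma omp task shared(right)
    right = sum_tasks(data, mid, hi);

    /* Wait for both child tasks before reading their results. */
    #pragma omp taskwait

    return left + right;                /* merge */
}

/* Entry point: one team of threads, one thread seeds the task tree. */
static long parallel_sum_tasks(const int *data, size_t n)
{
    long total = 0;

    #pragma omp parallel
    {
        #pragma omp single
        total = sum_tasks(data, 0, n);
    }

    return total;
}
```

The size of the team is then whatever the parallel region gets (for example via OMP_NUM_THREADS), and no thread counting is needed inside the recursion. Without OpenMP enabled the pragmas are ignored and the code simply runs serially.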

ejd answered Oct 06 '22 01:10