I have two functions, do_step_one(i) and do_step_two(i), for i from 0 to N-1.
Currently, I have this (sequential) code:
for(unsigned int i=0; i<N; i++) {
    do_step_one(i);
}
for(unsigned int i=0; i<N; i++) {
    do_step_two(i);
}
The calls to do_step_one() can run in any order and in parallel, and so can the calls to do_step_two(), but no do_step_two() may start until every do_step_one() has finished (it uses the do_step_one() results).
I tried the following:
#pragma omp parallel for
for(unsigned int i=0; i<N; i++) {
    do_step_one(i);
    #pragma omp barrier
    do_step_two(i);
}
But gcc complains:
convolve_slices.c:21: warning: barrier region may not be closely nested inside of work-sharing, critical, ordered, master or explicit task region.
What am I misunderstanding? How can I solve this?
Yes, "There is an implicit barrier at the end of the parallel construct."
When run, an OpenMP program will use one thread (in the sequential sections), and several threads (in the parallel sections). There is one thread that runs from the beginning to the end, and it's called the master thread. The parallel sections of the program will cause additional threads to fork.
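To make the fork-join behaviour concrete, here is a minimal sketch that reports the team size outside and inside a parallel region. The helper names (team_size, threads_in_sequential_part, threads_in_parallel_region, check_fork_join) are made up for this illustration; only omp_get_num_threads() comes from the OpenMP runtime. The `#ifdef _OPENMP` guard is there so the file still builds, as a single-threaded program, when OpenMP is not enabled.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: size of the current thread team
   (always 1 when the file is built without OpenMP). */
static int team_size(void) {
#ifdef _OPENMP
    return omp_get_num_threads();
#else
    return 1;
#endif
}

/* Called from sequential code: only the master thread is running. */
int threads_in_sequential_part(void) {
    return team_size();
}

/* Forks a team, lets exactly one thread record the team size,
   then joins again before returning. */
int threads_in_parallel_region(void) {
    int n = 1;
    #pragma omp parallel
    {
        #pragma omp single
        n = team_size();
    }   /* implicit barrier: the team joins here */
    return n;
}

/* Sanity check of the fork-join model described above. */
void check_fork_join(void) {
    assert(threads_in_sequential_part() == 1);
    assert(threads_in_parallel_region() >= 1);
}
```

Built with something like gcc -fopenmp, the second function typically returns the number of available cores, although the exact value depends on the machine and on settings such as OMP_NUM_THREADS.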
#pragma omp parallel spawns a group of threads, while #pragma omp for divides loop iterations between the spawned threads. You can do both things at once with the fused #pragma omp parallel for directive.
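As a small illustration of that equivalence, the sketch below fills an array once with the fused directive and once with the two directives written separately, then compares the results. The function names, the array size N and the value stored (2 * i) are arbitrary choices for this example, not anything from the question.

```c
#include <assert.h>

#define N 1000  /* arbitrary size chosen for the illustration */

/* Fused form: spawn the team and share the loop in one directive. */
void fill_fused(int *out) {
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        out[i] = 2 * i;
    }
}

/* Split form: spawn the team first, then share the loop among it. */
void fill_split(int *out) {
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < N; i++) {
            out[i] = 2 * i;
        }
    }
}

/* Returns the number of positions where the two forms disagree. */
int count_differences(void) {
    static int x[N], y[N];
    int diff = 0;
    fill_fused(x);
    fill_split(y);
    for (int i = 0; i < N; i++) {
        if (x[i] != y[i]) {
            diff++;
        }
    }
    return diff;
}
```

Both forms should give the same array. The split form only starts to matter once several worksharing loops go inside the same parallel region.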
As a side note, if you want to make sure the threads are not recreated between the two loops, separate the parallel directive from the for directives:
#pragma omp parallel
{
    #pragma omp for
    for(unsigned int i=0; i<N; i++){
        do_step_one(i);
    }
    // implicit barrier here: every do_step_one() call has finished
    #pragma omp for
    for(unsigned int i=0; i<N; i++){
        do_step_two(i);
    }
}
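For anyone who wants to check the ordering guarantee, here is a self-contained sketch of the same pattern. The bodies of do_step_one() and do_step_two() are invented for the test (the question does not show them): step one writes a[i], and step two deliberately reads an element written by a different iteration, so it would only see correct data if all of step one has completed. N, the arrays a and b, run_two_phases() and count_errors() are likewise assumptions made for this example.

```c
#include <assert.h>

#define N 1000  /* arbitrary problem size for the illustration */

static int a[N];  /* results of step one */
static int b[N];  /* results of step two */

/* Hypothetical step one: produce one value per index. */
static void do_step_one(unsigned int i) {
    a[i] = (int)i + 1;
}

/* Hypothetical step two: reads a value produced by ANOTHER iteration
   of step one, so it depends on step one being completely finished. */
static void do_step_two(unsigned int i) {
    b[i] = a[i] + a[N - 1 - i];
}

void run_two_phases(void) {
    #pragma omp parallel
    {
        #pragma omp for
        for (unsigned int i = 0; i < N; i++) {
            do_step_one(i);
        }
        /* implicit barrier at the end of the omp for construct */
        #pragma omp for
        for (unsigned int i = 0; i < N; i++) {
            do_step_two(i);
        }
    }
}

/* Every b[i] should be (i + 1) + (N - i) = N + 1. */
int count_errors(void) {
    int errors = 0;
    for (unsigned int i = 0; i < N; i++) {
        if (b[i] != N + 1) {
            errors++;
        }
    }
    return errors;
}
```

Calling run_two_phases() and then count_errors() from main should report zero errors when built with gcc -fopenmp. Note that adding nowait to the first `#pragma omp for` would remove the implicit barrier and make the result unreliable.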