
Parallelizing Fortran 2008 `do concurrent` systematically, possibly with OpenMP

The Fortran 2008 `do concurrent` construct is a `do` loop that tells the compiler that no iteration affects any other. It can thus be parallelized safely.

A valid example:

program main
  implicit none
  integer :: i
  integer, dimension(10) :: array
  do concurrent (i = 1:10)
    array(i) = i
  end do
end program main

where iterations can be done in any order. You can read more about it here.

To my knowledge, gfortran does not automatically parallelize these `do concurrent` loops, although I remember a mail on the gfortran mailing list about doing it (here). It just transforms them into classical `do` loops.

My question: do you know a way to systematically parallelize `do concurrent` loops? For instance, with a systematic OpenMP syntax?

max asked Jul 18 '12



2 Answers

It is not that easy to do automatically. The DO CONCURRENT construct has a forall-header, which means that it can accept multiple loops, index variable definitions, and a mask. Basically, you need to replace:

DO CONCURRENT([<type-spec> :: ]<forall-triplet-spec 1>, <forall-triplet-spec 2>, ...[, <scalar-mask-expression>])
  <block>
END DO

with:

[BLOCK
    <type-spec> :: <indexes>]

!$omp parallel do
DO <forall-triplet-spec 1>
  DO <forall-triplet-spec 2>
    ...
    [IF (<scalar-mask-expression>) THEN]
      <block>
    [END IF]
    ...
  END DO
END DO
!$omp end parallel do

[END BLOCK]

(things in square brackets are optional, based on the presence of the corresponding parts in the forall-header)

Note that this would not be as effective as parallelising one big loop with <iters 1>*<iters 2>*... independent iterations, which is what DO CONCURRENT is expected to do. Note also that the forall-header permits a type-spec that allows one to define loop indexes inside the header; in that case you need to surround the whole thing in a BLOCK ... END BLOCK construct to preserve the semantics. You also need to check whether a scalar-mask-expr exists at the end of the forall-header, and if it does, put that IF ... END IF inside the innermost loop.
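As a concrete illustration of that transformation, here is a sketch with hypothetical arrays `a`, `b`, `c` and a mask array `keep` (none of these names are from the question), showing a DO CONCURRENT with both a type-spec and a mask, and its OpenMP equivalent:

```fortran
! Original loop with a type-spec and a scalar-mask-expression:
!
!   do concurrent (integer :: i = 1:n, j = 1:m, keep(i,j))
!     a(i,j) = b(i,j) + c(i,j)
!   end do
!
! Transformed equivalent:
block
  integer :: i, j              ! indexes moved out of the type-spec
  !$omp parallel do private(j)
  do i = 1, n
    do j = 1, m
      if (keep(i,j)) then      ! the mask becomes an IF in the innermost loop
        a(i,j) = b(i,j) + c(i,j)
      end if
    end do
  end do
  !$omp end parallel do
end block
```

Only the outer loop is work-shared here, which is exactly the loss of parallelism (compared with one flat iteration space) mentioned above.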

If you only have array assignments inside the body of the DO CONCURRENT, you could also transform it into FORALL and use the OpenMP workshare directive. That would be much easier than the above.

DO CONCURRENT <forall-header>
  <block>
END DO

would become:

!$omp parallel workshare
FORALL <forall-header>
  <block>
END FORALL
!$omp end parallel workshare

Given all the above, the only systematic way I can think of is to go through your source code, search for DO CONCURRENT, and replace each occurrence with one of the transformed constructs above, based on the content of the forall-header and the loop body.

Edit: Use of the OpenMP workshare directive is currently discouraged. It turns out that at least the Intel Fortran Compiler and GCC serialise FORALL statements and constructs inside OpenMP workshare directives by surrounding them with an OpenMP single directive during compilation, which brings no speedup whatsoever. Other compilers might implement it differently, but it is better to avoid workshare if portable performance is to be achieved.
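Given that workshare may be serialised, a more portable sketch for the pure array-assignment case is an explicit loop nest with the collapse clause (available since OpenMP 3.0), which work-shares the whole flattened iteration space. The array names here are hypothetical:

```fortran
! Portable alternative to workshare for a 2-D array assignment;
! a, b, c are assumed to be n-by-m arrays with independent elements.
!$omp parallel do collapse(2)
do j = 1, m
  do i = 1, n
    a(i,j) = b(i,j) + c(i,j)
  end do
end do
!$omp end parallel do
```

Unlike the nested transformation shown earlier, collapse(2) distributes all n*m iterations among the threads, which is closer to what DO CONCURRENT promises.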

Hristo Iliev answered Sep 19 '22


I'm not sure what you mean by "a way to systematically parallelize do concurrent loops". However, to simply parallelise an ordinary do loop with OpenMP, you could use something like:

!$omp parallel private (i)
!$omp do
do i = 1,10
    array(i) = i
end do
!$omp end do
!$omp end parallel

Is this what you are after?

Chris answered Sep 18 '22