I'm currently using gfortran 4.9.2 and I was wondering if the compiler actually know hows to take advantage of the DO CONCURRENT construct (Fortran 2008). I know that the compiler "supports" it, but it is not clear what that entails. For example, if automatic parallelization is turned on (with some number of threads specified), does the compiler know how to parallelize a do concurrent loop?
Edit: As mentioned in the comment, this previous question on SO is very similar to mine, but it is from 2012, and only very recent versions of gfortran have implemented the newest features of modern Fortran, so I thought it was worth asking about the current state of the compiler in 2015.
With the DO CONCURRENT construct, you can specify that individual loop iterations have no interdependencies. The execution order of the iterations can be indeterminate at the beginning of execution of the DO CONCURRENT construct.
utilize do concurrent including reductions. Can Fortran's do concurrent replace directives for accelerated computing? answer is YES, and with no (or minimal) loss of performance.
Gfortran is the name of the GNU Fortran project, developing a free Fortran 95/2003/2008/2018 compiler for GCC, the GNU Compiler Collection.
Since GCC 11, OpenMP 4.5 is fully supported for Fortran and OpenMP 5.0 support has been extended for C, C++ and Fortran.
Rather than explicitly enabling some new functionality, DO CONCURRENT
in gfortran seems to put restrictions on the programmer in order to implicitly allow parallelization of the loop when required (using the option -ftree-parallelize-loops=NPROC
).
While a DO
loop can contain any function call, the content of DO CONCURRENT
is restricted to PURE
functions (i.e., having no side effects). So when one attempts to use, e.g., RANDOM_NUMBER
(which is not PURE
as it needs to maintain the state of the generator) in DO CONCURRENT
, gfortran will protest:
prog.f90:25:29:
25 | call random_number(x)
| 1
Error: Subroutine call to intrinsic ‘random_number’ in DO CONCURRENT block at (1) is not PURE
Otherwise, DO CONCURRENT
behaves as normal DO
. It only enforces use of parallelizable code, so that -ftree-parallelize-loops=NPROC
succeeds. For instance, with gfortran 9.1 and -fopenmp -Ofast -ftree-parallelize-loops=4
, both the standard DO
and the F08 DO CONCURRENT
loops in the following program run in 4 threads and with virtually identical timing:
program test_do
use omp_lib, only: omp_get_wtime
integer, parameter :: n = 1000000, m = 10000
real, allocatable :: q(:)
integer :: i
real :: x, t0
allocate(q(n))
t0 = omp_get_wtime()
do i = 1, n
q(i) = i
do j = 1, m
q(i) = 0.5 * (q(i) + i / q(i))
end do
end do
print *, omp_get_wtime() - t0
t0 = omp_get_wtime()
do concurrent (i = 1:n)
q(i) = i
do j = 1, m
q(i) = 0.5 * (q(i) + i / q(i))
end do
end do
print *, omp_get_wtime() - t0
end program test_do
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With