I have a "masked array" that I would like to add to another array. In other words, I have 3 arrays: A, B and mask. My question is: what is the most efficient way (in terms of execution time) to store the mask, as a logical array or as a real array of ones and zeros?
EDIT
Here's a toy program you can play around with (if you have mpif77):
program main
  implicit None
  include 'mpif.h'
  integer, parameter :: ntry=10000
  integer, parameter :: asize=1000000
  real,dimension(asize) :: A,B,maskr
  logical,dimension(asize) :: mask
  real*8 :: dd,dt,dtave,dtbest
  integer i

  ! Build both masks. Note: maskr is 1 for odd i while mask is .True. for
  ! even i, so they select complementary halves (same density, same cost).
  do i=1,asize
    maskr(i)=mod(i,2)
    mask(i)=.False.
    if(mod(i,2).eq.0) mask(i)=.True.
  enddo

  ! Variant 1: logical mask tested inside an explicit loop
  A=1.0; B=1.0
  dtbest=1d33
  dtave=0.0
  do i=1,ntry
    dt=mpi_wtime()
    call add_arrays_logical(asize,A,B,mask)
    dt=mpi_wtime()-dt
    dtbest=min(dt,dtbest)
    dtave=dtave+dt
  enddo
  print*,"==== logical ==="
  print*,"Average",dtave/ntry
  print*,"Best",dtbest

  ! Variant 2: real mask of ones and zeros used as a multiplier
  A=1.0; B=1.0
  dtbest=1d33
  dtave=0.0
  do i=1,ntry
    dt=mpi_wtime()
    call add_arrays_real(asize,A,B,maskr)
    dt=mpi_wtime()-dt
    dtbest=min(dt,dtbest)
    dtave=dtave+dt
  enddo
  print*,"==== Real ==="
  print*,"Average",dtave/ntry
  print*,"Best",dtbest

  ! Variant 3: logical mask with the WHERE statement
  A=1.0; B=1.0
  dtbest=1d33
  dtave=0.0
  do i=1,ntry
    dt=mpi_wtime()
    where(mask) A=A+B
    dt=mpi_wtime()-dt
    dtbest=min(dt,dtbest)
    dtave=dtave+dt
  enddo
  print*,"==== Where ===="
  print*,"Average",dtave/ntry
  print*,"Best",dtbest
end
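subroutine add_arrays_logical(n,A,B,mask)
  implicit none
  integer n
  real A(n),B(n)
  logical mask(n)
  integer i
  ! Add B to A only where the logical mask is true
  do i=1,n
    if(mask(i))then
      A(i)=A(i)+B(i)
    endif
  enddo
end

subroutine add_arrays_real(n,A,B,mask)
  implicit none
  integer n
  real A(n),B(n),mask(n)
  integer i
  ! Branch-free version: mask holds 1.0 (add) or 0.0 (skip)
  do i=1,n
    A(i)=A(i)+mask(i)*B(i)
  enddo
end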
My results:
(gfortran -O2)
==== logical ===
Average 1.52590200901031483E-003
Best 1.48987770080566406E-003
==== Real ===
Average 1.78022863864898680E-003
Best 1.74498558044433594E-003
==== Where ====
Average 1.48216445446014400E-003
Best 1.44505500793457031E-003
(gfortran -O3 -funroll-loops -ffast-math)
==== logical ===
Average 1.47997992038726811E-003
Best 1.44982337951660156E-003
==== Real ===
Average 1.40655457973480223E-003
Best 1.37186050415039063E-003
==== Where ====
Average 1.48403010368347165E-003
Best 1.45006179809570313E-003
(pgf90 -fast) -- on a very old machine
==== logical ===
Average 5.4871437072753909E-003
Best 5.4519176483154297E-003
==== Real ===
Average 4.6096980571746831E-003
Best 4.5847892761230469E-003
==== Where ====
Average 5.3572671413421634E-003
Best 5.3288936614990234E-003
(pgf90 -O2) -- on a very old machine
==== logical ===
Average 5.4929971456527714E-003
Best 5.4569244384765625E-003
==== Real ===
Average 5.5974062204360965E-003
Best 5.5701732635498047E-003
==== Where ====
Average 5.3811835527420044E-003
Best 5.3341388702392578E-003
Of course, there are a few things that could influence this (the compiler's ability to vectorize the loops, for instance), so is there a rule of thumb about how something like this should be done?
Why not use "where"?
where (mask) A = A + B
Probably using the mask is fastest but the only way to know for sure is to measure.
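If you don't have MPI handy, a minimal sketch of such a measurement can use the standard system_clock intrinsic instead of mpi_wtime. The program name time_where and the reduced repeat count ntry=100 are my own assumptions for a quick run, not part of the question's program:

```fortran
program time_where
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer, parameter :: asize = 1000000   ! same array size as the toy program
  integer, parameter :: ntry = 100        ! assumption: fewer repeats for a quick run
  real, allocatable :: A(:), B(:)
  logical, allocatable :: mask(:)
  integer(int64) :: t0, t1, rate
  real(real64) :: dt, dtbest
  integer :: i, k

  allocate(A(asize), B(asize), mask(asize))
  A = 1.0
  B = 1.0
  do i = 1, asize
    mask(i) = (mod(i, 2) == 0)            ! every second element is selected
  end do

  call system_clock(count_rate=rate)      ! clock ticks per second
  dtbest = huge(dtbest)
  do k = 1, ntry
    call system_clock(t0)
    where (mask) A = A + B
    call system_clock(t1)
    dt = real(t1 - t0, real64) / real(rate, real64)
    dtbest = min(dt, dtbest)
  end do

  print *, "Best (where):", dtbest
  ! Printing part of A makes it harder for the optimizer to discard the work
  print *, "Checksum:", A(1), A(2)
end program time_where
```

The timer resolution depends on the compiler and platform, so treat very small differences between variants with suspicion.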
If by flops you mean floating point operations, then the first option (testing a logical mask) is obviously better: you perform 1 flop per loop iteration, and only where mask(i) is .true., whereas the second option (multiplying by a real mask) performs 2 flops per iteration regardless of the mask value.
OTOH, if you're interested in minimizing the time spent executing this function, why don't you try both versions on your data and test which is faster?
You might also want to test a version where you use the Fortran 90+ WHERE construct
where(mask) A = A + B
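Before timing the variants it is worth checking that they really compute the same thing. Here is a small sketch that compares the if-loop, the real-mask multiply and WHERE on an 8-element array. The program name mask_variants, the array names and the extra merge variant are my own additions for illustration; merge is a standard intrinsic, but nobody in this thread timed it:

```fortran
program mask_variants
  implicit none
  integer, parameter :: n = 8
  real :: B(n), A_if(n), A_mul(n), A_where(n), A_merge(n), maskr(n)
  logical :: mask(n)
  integer :: i

  do i = 1, n
    mask(i) = (mod(i, 2) == 0)     ! select even indices
    B(i) = real(i)
  end do
  maskr = merge(1.0, 0.0, mask)    ! real mask consistent with the logical one

  A_if = 1.0; A_mul = 1.0; A_where = 1.0; A_merge = 1.0

  ! Explicit loop with a branch on the logical mask
  do i = 1, n
    if (mask(i)) A_if(i) = A_if(i) + B(i)
  end do

  ! Branch-free multiply by a real mask of ones and zeros
  A_mul = A_mul + maskr*B

  ! WHERE statement
  where (mask) A_where = A_where + B

  ! MERGE intrinsic: pick the updated value where mask is true
  A_merge = merge(A_merge + B, A_merge, mask)

  if (any(A_if /= A_mul) .or. any(A_if /= A_where) .or. any(A_if /= A_merge)) then
    error stop "variants disagree"
  end if
  print *, A_if
end program mask_variants
```

One caveat with the real-mask version: if B can contain NaN or Inf in the masked-out positions, 0.0*B(i) is NaN there and A gets polluted, which the logical-mask versions avoid.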