
Alignment of multi-dimensional array for omp simd

If I understand the aligned clause of the omp simd construct correctly, it refers to the alignment of the whole array.

How is it used for multi-dimensional arrays? Assume

ni = 131; nj = 137; nk = 127

! allocates arr(1:131,1:137,1:127) aligned to 64 bytes
call somehow_allocate_aligned(arr, [ni,nj,nk], 64)

!$omp parallel do collapse(2)
do k = 1, nk
  do j = 1, nj

    call some_complicated_subroutine(arr(:,j,k))

    !$omp simd aligned(arr:64)
    do i = 1, ni
      arr(i,j,k) = some arithmetic expression involving arr(i,j,k)
    end do
  end do
end do
!$omp end parallel do

Is this the correct way to indicate the alignment of the array although the iteration of the inner loop starts at arr(1,j,k)?

How does the compiler use that information to infer anything about the alignment of the inner loop subarray?

Does it matter for performance if the run-time sizes are nicer (say 128, 128, 128)?

Vladimir F Героям слава asked Oct 27 '15 15:10


1 Answer

This is explained in slides 160-165 of http://irpf90.ups-tlse.fr/files/parallel_programming.pdf

You should:

1) Align the array.

2) Use padding to force all columns to be aligned: the first dimension (as specified in the allocate statement) should be a multiple of the number of elements that fit in a 16-, 32- or 64-byte block, depending on the instruction set.

For example, for a 99x29x200 array with the AVX instruction set (32-byte alignment) in double precision (8 bytes/element), you should do:

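double precision, allocatable :: A(:,:,:)
! Intel-specific directive; it belongs in the declaration part, next to A
!DIR$ ATTRIBUTES ALIGN : 32 :: A

n = 99
l = 29
m = 200

! 32/8 = 4 double precision elements fit in one 32-byte block
delta_n = mod(n,32/8)
if (delta_n == 0) then
  n_pad = n
else
  ! round n up to the next multiple of 4 (here 99 -> 100)
  n_pad = n-delta_n+32/8
end if

allocate( A(n_pad,l,m) )

do k=1,m
  do j=1,l
    !$OMP SIMD
    do i=1,n
      A(i,j,k) = ...
    end do
  end do
end do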
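
To see why padding the first dimension is enough, here is a minimal sketch that checks the byte offset of every column start A(1,j,k) relative to A(1,1,1). The program name, the use of c_loc with transfer to obtain addresses, and the round-up formula written with integer division are illustration choices, not part of the example above. It only checks offsets relative to the base, so it does not depend on how the base itself was aligned.

```fortran
program column_offsets
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
  implicit none
  integer, parameter :: n = 99, l = 29, m = 200
  ! Assumption: AVX (32-byte blocks) and double precision (8 bytes/element)
  integer, parameter :: per_block = 32 / 8
  ! Round n up to the next multiple of per_block
  integer, parameter :: n_pad = ((n + per_block - 1) / per_block) * per_block
  double precision, allocatable, target :: A(:,:,:)
  integer(c_intptr_t) :: base, addr
  integer :: j, k
  logical :: ok

  allocate( A(n_pad, l, m) )

  ! Address of the first element, as an integer
  base = transfer(c_loc(A(1,1,1)), base)

  ok = .true.
  do k = 1, m
    do j = 1, l
      ! Address of the first element of column (j,k)
      addr = transfer(c_loc(A(1,j,k)), addr)
      if (mod(addr - base, 32_c_intptr_t) /= 0) ok = .false.
    end do
  end do

  print *, 'n_pad =', n_pad
  print *, 'every column offset is a multiple of 32 bytes:', ok
end program column_offsets
```

With n_pad = 100, each column is 800 bytes long, which is a multiple of 32, so every column start inherits the alignment of the array base. With the unpadded extent 99, the column length would be 792 bytes and most columns would start off the 32-byte boundary.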

You can use the C preprocessor to make the code portable by replacing the 32 and 8 in the previous example with macros.
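
As a rough sketch of that idea, the two constants can be turned into macros with default values that a build system may override. The macro names ALIGN_BYTES and ELEM_BYTES are hypothetical, and the sketch assumes the file is run through the preprocessor (for example a .F90 extension, or the -cpp option of gfortran). The compiler-specific alignment directive is left out here, since whether macros are expanded inside directive lines depends on the preprocessor.

```fortran
! Hypothetical macro names; override with e.g. -DALIGN_BYTES=64
#ifndef ALIGN_BYTES
#define ALIGN_BYTES 32
#endif
#ifndef ELEM_BYTES
#define ELEM_BYTES 8
#endif

program pad_with_macros
  implicit none
  integer, parameter :: n = 99, l = 29, m = 200
  ! Number of elements that fit in one alignment block
  integer, parameter :: per_block = ALIGN_BYTES / ELEM_BYTES
  integer :: delta_n, n_pad
  double precision, allocatable :: A(:,:,:)

  ! Same padding logic as before, with the literals replaced by per_block
  delta_n = mod(n, per_block)
  if (delta_n == 0) then
    n_pad = n
  else
    n_pad = n - delta_n + per_block
  end if

  allocate( A(n_pad, l, m) )
  print *, 'elements per block =', per_block
  print *, 'padded first dimension =', n_pad
end program pad_with_macros
```

Compiling with -DALIGN_BYTES=64 would then pad to multiples of 8 elements without touching the source.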

Note: be careful with whole-array statements such as B=A, as the physical dimensions no longer correspond to the logical dimensions. It is good practice to state the bounds explicitly, as in B(1:n,1:l,1:m) = A(1:n,1:l,1:m), since this still works if you change the physical dimensions.
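
A small sketch of this note, assuming a padded A(100,29,200) and an unpadded B(99,29,200) (the choice to leave B unpadded is only for illustration): the two arrays have different shapes, so the whole-array statement B=A is not a valid copy here, while the explicit section assignment is.

```fortran
program padded_copy
  implicit none
  integer, parameter :: n = 99, l = 29, m = 200
  integer, parameter :: n_pad = 100   ! padded first dimension (AVX, double precision)
  double precision, allocatable :: A(:,:,:), B(:,:,:)

  allocate( A(n_pad, l, m) )   ! physical dimensions include the padding
  allocate( B(n, l, m) )       ! logical dimensions only

  A = 0.d0                     ! also defines the padding elements
  A(1:n, 1:l, 1:m) = 1.d0      ! fill the logical part

  ! Copy only the logical extent; valid whatever the physical dimensions are
  B(1:n, 1:l, 1:m) = A(1:n, 1:l, 1:m)

  print *, 'shape(A) =', shape(A)
  print *, 'shape(B) =', shape(B)
  print *, 'logical part copied:', all(B == 1.d0)
end program padded_copy
```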

Anthony Scemama answered Oct 09 '22 13:10