Fortran OpenMP where will the array be allocated

I have a question about Fortran-OpenMP and allocatable arrays. It's simple: Where will the space be allocated? If I have something like

!$omp parallel default(shared) private(arr)
!$omp critical
  allocate( arr(BIGNUMBER) )
!$omp end critical

!do calculations with many arr accesses

!$omp critical
  deallocate( arr )
!$omp end critical
!$omp end parallel

will the space be allocated on the stack, or the heap? If it's on the heap, is there a difference between the code above and something like this

allocate( arr(BIGNUMBER, nThread) )
!$omp parallel default(shared) private(localArr, iThread)
  ! omp_get_thread_num() counts from 0, the array columns from 1
  iThread = omp_get_thread_num() + 1

  localArr => arr(:, iThread)

  !do calculations with many localArr accesses
!$omp end parallel

deallocate( arr )
  • In the first code, there are two critical regions. I would assume that they slow the execution down and do not scale very well. (I'm actually not sure whether I could just leave them out, because the allocate is thread-safe?) But if the array were allocated on the stack, then it should be faster, because of faster access.
  • In the second code I am sure to have the array on the heap, which means slower access. But if the array in the first code is allocated on the heap as well, then I save the critical regions, and there is only one allocate/deallocate. Should that be faster?
  • Does the size of the array play any role in this?
  • If it were to be allocated on the heap, is there a way to force an allocation on the stack?

The short question is basically: Which would seem to be the optimal solution for the problem?

Cabadath asked Jan 27 '26 18:01


2 Answers

Fortran compilers with OpenMP tend to allocate automatic variables (including arrays) on the stack. Arrays that you allocate explicitly usually end up on the heap, but note that the Fortran standard does not mention the stack or the heap at all; it is up to the compiler. In your first example I would leave the critical sections out, because you are allocating private variables. Regarding the size: overly large automatic arrays sometimes cause stack overflows, but that is probably not your case. Which approach is fastest, I don't know.

This program allocates arrays on the heap in my compiler

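use omp_lib
implicit none
integer,parameter :: BIGNUMBER = 100000000
real,dimension(:),allocatable :: arr
integer :: iThread

allocate( arr(BIGNUMBER) )

! iThread must be private, otherwise all threads write the same variable
!$omp parallel default(shared) private(arr, iThread)
  iThread = omp_get_thread_num()

  arr = 5

  print *, arr

!$omp end parallel
deallocate( arr )


end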

and this one on the stack (and then it crashes)

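use omp_lib
implicit none
integer,parameter :: BIGNUMBER = 100000000
real arr(BIGNUMBER)
integer :: iThread

! iThread must be private, otherwise all threads write the same variable
!$omp parallel default(shared) private(arr, iThread)
  iThread = omp_get_thread_num()

  arr = 5

  print *, arr

!$omp end parallel


end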
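
For completeness, here is a minimal sketch of what the first variant from the question might look like with the critical sections left out. It is not from the original post: the program name, the size n (standing in for BIGNUMBER), the stat= check and the reduction variable total are all assumptions made for illustration. Where the private array ends up is still up to the compiler.

```fortran
program private_alloc_sketch
  use omp_lib
  implicit none
  integer, parameter :: n = 1000000   ! hypothetical size, stands in for BIGNUMBER
  real, allocatable :: arr(:)
  integer :: ierr
  double precision :: total

  total = 0.0d0

  !$omp parallel default(shared) private(arr, ierr) reduction(+:total)
  ! arr is private and not allocated outside the region,
  ! so every thread allocates its own copy; no critical section is used
  allocate( arr(n), stat=ierr )
  if (ierr /= 0) stop 'allocation failed'

  ! stand-in for the real calculations
  arr = real(omp_get_thread_num() + 1)
  total = total + sum(dble(arr))

  deallocate( arr )
  !$omp end parallel

  print *, 'sum over all threads:', total
end program private_alloc_sketch
```

With gfortran this would be built with something like gfortran -fopenmp; other compilers use their own OpenMP switch.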
Vladimir F Героям слава answered Jan 31 '26 03:01


OK, Vladimir says most of what I would have said (Not mentioned in the standard, it's totally up to the implementation, why are you using criticals to protect your privates?)

But ... you give the impression that you think that access to memory allocated on the stack is somehow faster than that on the heap. On any typical implementation this is not the case - the access time is the same. Allocation of memory on the stack is usually quicker than on the heap, but once it is allocated the time to access it is the same. So I would cut the criticals and go route 1 - it's simpler, keeping things private is good, pointers are bad, and if memory allocation time is your limiting step then you've almost certainly not got enough work in the parallel region to make parallelising it worthwhile.
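
One way to check the last point on a given machine is to time the allocation and the work separately inside the parallel region. The following is only a rough sketch under assumptions of mine: the size n, the sweep count nsweeps and the dummy update loop are hypothetical placeholders for the real calculation, and the critical section is there only to keep the printed lines from interleaving.

```fortran
program alloc_vs_work_sketch
  use omp_lib
  implicit none
  integer, parameter :: n = 10000000   ! hypothetical array size
  integer, parameter :: nsweeps = 20   ! hypothetical amount of work
  real, allocatable :: arr(:)
  double precision :: t0, t1, t2
  integer :: i, k

  !$omp parallel default(shared) private(arr, t0, t1, t2, i, k)
  t0 = omp_get_wtime()
  allocate( arr(n) )                   ! private array, no critical section
  t1 = omp_get_wtime()

  ! stand-in for "calculations with many arr accesses"
  arr = 1.0
  do k = 1, nsweeps
     do i = 1, n
        arr(i) = 0.5 * arr(i) + 1.0
     end do
  end do
  t2 = omp_get_wtime()

  ! printing arr(n) keeps the compiler from discarding the loop
  !$omp critical
  print *, 'thread', omp_get_thread_num(), ' allocate:', t1 - t0, &
           ' work:', t2 - t1, ' check:', arr(n)
  !$omp end critical

  deallocate( arr )
  !$omp end parallel
end program alloc_vs_work_sketch
```

Note that on many systems the memory pages are only mapped when they are first touched, so part of the allocation cost may show up in the first pass over the array rather than in the allocate statement itself.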

Ian Bush answered Jan 31 '26 04:01