I have the following question: What is the real overhead of allocate/deallocate statements in Fortran 90 and later? For example, suppose several medium-sized arrays are allocated inside a loop, like
    do i = 1, 1000
        allocate(tmp(20))
        tmp(1:20) = 1d0
        call foo(tmp)
        deallocate(tmp)
    end do
Is it worth allocating a single work array based on the maximal size in this case?
I have found that dynamic array allocation within tight loops can really slow down the execution of my code, with valgrind showing that a large percentage of cycles is taken up by malloc and free. So if foo is a very quick function, then it would be worth statically allocating this array. It is easy to see this overhead by profiling using valgrind's callgrind functionality (it may be worth reducing the size of your problem, as the profiled execution can be at least 10 times slower).
In Fortran 2008 there is a nicer solution to this type of problem. You can declare your variables inside a block construct with a size determined at run time. This should make it much easier for the compiler to allocate the variable on the stack. However, I haven't used this personally and I'm not sure which compilers support it.
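As a rough illustration of the block approach, here is a minimal sketch. The routine run, the accumulator total, and the body of foo are made up for the example (the question does not show foo), and whether tmp actually ends up on the stack depends on the compiler and its options.

```fortran
program block_demo
    implicit none
    integer, parameter :: dp = kind(1d0)
    real(dp) :: total

    total = 0.0_dp
    call run(20, total)      ! 20 matches the array size used in the question
    print *, total

contains

    ! Hypothetical driver: n is only known at run time.
    subroutine run(n, acc)
        integer,  intent(in)    :: n
        real(dp), intent(inout) :: acc
        integer :: i

        do i = 1, 1000
            block
                ! tmp is an automatic array local to this block: no
                ! allocate/deallocate statement is needed, and it ceases
                ! to exist when the block ends.
                real(dp) :: tmp(n)
                tmp = 1.0_dp
                call foo(tmp, acc)
            end block
        end do
    end subroutine run

    ! Stand-in for the question's foo: just accumulates the sum.
    subroutine foo(a, acc)
        real(dp), intent(in)    :: a(:)
        real(dp), intent(inout) :: acc
        acc = acc + sum(a)
    end subroutine foo

end program block_demo
```

Note that large automatic arrays can overflow the stack, so this is best suited to small or medium-sized temporaries like the one in the question.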
The overhead of using ALLOCATE and DEALLOCATE is the same as the overhead of using malloc() and free() in C. Actually most Fortran compilers implement (DE)ALLOCATE as wrappers around malloc()/free() with some added bookkeeping, inherent to all Fortran 90 arrays.
Usually it is better to preallocate a big enough scratch array and use it in tight loops instead of constantly allocating and freeing memory. It also keeps the heap from getting fragmented, which could lead to allocation problems later on (a very rare situation, but it happens, especially with 32-bit codes).
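A minimal sketch of the preallocated scratch array, for comparison with the loop in the question. The names work, nmax, and total, the per-iteration size n, and the body of foo are assumptions made for the example; nmax stands for whatever the largest size needed in the loop turns out to be.

```fortran
program workspace_demo
    implicit none
    integer, parameter :: dp = kind(1d0)
    integer, parameter :: nmax = 20     ! assumed maximal size needed in the loop
    real(dp), allocatable :: work(:)
    real(dp) :: total
    integer :: i, n

    allocate(work(nmax))                ! single allocation, outside the loop
    total = 0.0_dp

    do i = 1, 1000
        n = 1 + mod(i, nmax)            ! hypothetical per-iteration size, 1..nmax
        work(1:n) = 1.0_dp
        call foo(work(1:n), total)      ! pass only the part that is in use
    end do

    deallocate(work)
    print *, total

contains

    ! Stand-in for the question's foo: just accumulates the sum.
    subroutine foo(a, acc)
        real(dp), intent(in)    :: a(:)
        real(dp), intent(inout) :: acc
        acc = acc + sum(a)
    end subroutine foo

end program workspace_demo
```

Because foo takes an assumed-shape argument and work(1:n) is a contiguous section, the call should not need a temporary copy. If foo used an explicit-shape or assumed-size dummy instead, passing work together with n would work as well.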