Given the minimal working example below, do you know why the memory allocation error does not occur at the memory allocation step? When I run the code under valgrind, or add the parameter source=0.0 to the memory allocation statement, I get the memory allocation error, as expected.
Update: I reproduced the issue with a minimal working example:
    program memory_test
      implicit none
      double precision, dimension(:,:,:,:), allocatable :: sensitivity
      double precision, allocatable :: sa(:)
      double precision, allocatable :: sa2(:)
      integer :: ierr, nnz
      integer :: nx, ny, nz, ndata
      nx = 50
      ny = 50
      nz = 100
      ndata = 1600
      allocate(sensitivity(nx,ny,nz,ndata), stat=ierr)
      sensitivity = 1.0
      nnz = 100000000
      !allocate(sa(nnz), source=dble(0.0), stat=ierr)
      allocate(sa(nnz), stat=ierr)
      if (ierr /= 0) print *, 'Memory error!'
      !allocate(sa2(nnz), source=dble(0.0), stat=ierr)
      allocate(sa2(nnz), stat=ierr)
      if (ierr /= 0) print *, 'Memory error!'
      print *, 'Start initialization'
      sa = 0.0
      sa2 = 0.0
      print *, 'End initialization'
    end program memory_test
When I run it, the message 'Memory error!' is not printed; 'Start initialization' is printed and then the program is killed by the OS. Only if I allocate with the 'source' parameter (as in the commented-out lines) do I get the 'Memory error!' message.
For memory statistics, the 'free' command gives me this output:
                 total       used       free     shared    buffers     cached
    Mem:       8169952    3630284    4539668      46240       1684     124888
    -/+ buffers/cache:    3503712    4666240
    Swap:            0          0          0
You are seeing the behavior of the memory allocation strategy Linux uses. When you allocate memory but have not yet written to it, it exists only as virtual memory (this may also be affected by the particular Fortran runtime library, but I'm not sure). The memory is part of your process's virtual address space, but it is not backed by any physical memory pages. Only when you write to the memory are physical pages allocated, and only enough to satisfy the write.
Consider the following program:
    program test
      implicit none
      real, allocatable :: array(:)
      allocate(array(1000000000)) ! 4 GB array
      print *, 'Check memory - 4 GB allocated'
      read *
      array(1:1000000) = 1.0
      print *, 'Check memory - 4 MB assigned'
      read *
      array(1000000:100000000) = 2.0
      print *, 'Check memory - 400 MB assigned'
      read *
      array = 5.0
      print *, 'Check memory - 4 GB assigned'
      read *
    end program
This program allocates 4 GB of memory then writes to a 4 MB array section, a 396 MB array section (total writes = 400 MB) and finally writes to the full array (total writes = 4 GB). The program pauses between each write so you can take a look at memory usage.
After the allocate, before the first write:
      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    29192 casey     20   0 3921188   1176   1052 S   0.0  0.0   0:00.00 fortranalloc
All of the memory is virtual (VIRT); only a small bit is backed by physical memory (RES).
After the 4 MB write:
    29192 casey     20   0 3921188   5992   1984 S   0.0  0.0   0:00.00 fortranalloc
After the 396 MB write:
    29192 casey     20   0 3921188 392752   1984 S   0.0  1.6   0:00.18 fortranalloc
And after the 4 GB write:
    29192 casey     20   0 3921188 3.727g   1984 S  56.6 15.8   0:01.88 fortranalloc
Note that after each write the resident memory increases to satisfy the write. This shows that physical memory is only allocated on write, not merely on allocation, so a plain allocate() has no way to detect the error. When you add the source parameter to allocate, a write occurs, which forces full physical allocation of the memory, and if this fails, the error can be detected.
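The same growth in resident memory can be watched from inside the program instead of from top. The following is a minimal, Linux-only sketch: it assumes /proc/self/status is readable and contains a VmRSS line in kB, and the names rss_probe, resident_kb and touch_pages are made up for this illustration.

```fortran
module rss_probe
  implicit none
contains
  ! Resident set size in kB as reported by Linux in /proc/self/status,
  ! or -1 if the file or the VmRSS line cannot be read (e.g. not Linux).
  function resident_kb() result(kb)
    integer :: kb
    character(len=256) :: line
    integer :: u, ios
    kb = -1
    open(newunit=u, file='/proc/self/status', status='old', &
         action='read', iostat=ios)
    if (ios /= 0) return
    do
      read(u, '(A)', iostat=ios) line
      if (ios /= 0) exit
      if (line(1:6) == 'VmRSS:') then
        ! The line looks like "VmRSS:      1176 kB"; read the number.
        read(line(7:), *, iostat=ios) kb
        if (ios /= 0) kb = -1
        exit
      end if
    end do
    close(u)
  end function resident_kb
end module rss_probe

program touch_pages
  use rss_probe
  implicit none
  integer, parameter :: n = 16*1024*1024   ! 16 Mi reals, roughly 64 MiB
  real, allocatable :: a(:)
  integer :: ierr, rss

  allocate(a(n), stat=ierr)
  if (ierr /= 0) error stop 'allocate failed'
  rss = resident_kb()
  print *, 'RSS after allocate      (kB):', rss

  a(1:n/2) = 1.0                           ! touch the first half only
  rss = resident_kb()
  print *, 'RSS after writing half  (kB):', rss

  a = 2.0                                  ! touch every element
  rss = resident_kb()
  print *, 'RSS after writing all   (kB):', rss

  ! Use the data so the writes cannot be optimized away.
  print *, 'checksum:', a(1) + a(n)
end program touch_pages
```

The array is kept small on purpose so the sketch is safe to run; the exact figures depend on the system and on the Fortran runtime library.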
What you are likely seeing is the Linux OOM Killer, which is invoked when memory is exhausted. When this occurs, the OOM Killer uses an algorithm to determine what to kill to free up memory, and the behavior of your code makes it a very likely candidate to be killed. When your write causes a physical allocation that cannot be met, your process is killed by the kernel. You see it on write (caused by assignment) but not on allocation because of the behavior detailed above.
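As a rough check against the numbers in the question, the sketch below adds up what the three arrays need once every element has been written and compares that with the 'free' column. It assumes that 'free' reports kB and that double precision occupies 8 bytes (typical, but not guaranteed); mem_estimate and array_kb are hypothetical names.

```fortran
module mem_estimate
  use iso_fortran_env, only: int64
  implicit none
  private
  public :: array_kb
contains
  ! Size in kB (1024 bytes) of an array of n elements, each elem_bytes wide.
  pure function array_kb(n, elem_bytes) result(kb)
    integer(int64), intent(in) :: n, elem_bytes
    integer(int64) :: kb
    kb = (n * elem_bytes) / 1024_int64
  end function array_kb
end module mem_estimate

program estimate_demand
  use iso_fortran_env, only: int64
  use mem_estimate
  implicit none
  ! Width of a double precision value in bytes on this platform.
  integer(int64), parameter :: dp_bytes = storage_size(1.0d0) / 8
  ! 'free' column from the question, assumed to be in kB.
  integer(int64), parameter :: free_kb = 4539668_int64
  integer(int64) :: sens_kb, sa_kb, total_kb

  sens_kb  = array_kb(50_int64 * 50 * 100 * 1600, dp_bytes)  ! sensitivity
  sa_kb    = array_kb(100000000_int64, dp_bytes)             ! sa, and sa2
  total_kb = sens_kb + 2 * sa_kb

  print *, 'sensitivity (kB):', sens_kb
  print *, 'sa + sa2    (kB):', 2 * sa_kb
  print *, 'total       (kB):', total_kb
  print *, 'free        (kB):', free_kb
  if (total_kb > free_kb) print *, 'Demand exceeds free memory.'
end program estimate_demand
```

With 8-byte elements this comes to 3,125,000 kB for sensitivity plus 781,250 kB each for sa and sa2, 4,687,500 kB in total. That is more than the 4,539,668 kB reported as free, with no swap to fall back on, which fits the process being killed partway through the assignments rather than at allocate.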
Extended comment rather than an answer:
In Fortran, initialization has a specific meaning: it refers to setting a variable's value at declaration. So this
    real :: myvar = 0.0
is initialization, whereas this
    real :: myvar
    ....
    myvar = 0.0
is not. Now, perhaps more relevant to the issue you report, this statement
    isensit%sa(:) = 0.0
assigns the value 0.0 to every element of the array section isensit%sa(:). This is very different (once you get used to it) from what I think you meant to write, which is:
    isensit%sa = 0.0
This version assigns the value 0.0 to every element of the array isensit%sa. Because an array section, even one comprising every element of the array, is not the array, a Fortran compiler may temporarily allocate space for the section while it processes the assignment. This probably makes sense when you think about a more general array section.
I'm not sure I understand why you think the space isn't allocated when the allocate statement executes, but I suggest you sort out the assignment, then think again. And I guess that the temporary allocation of space for the array section, which will be as much space as the array itself consumes, might tip your program over the edge and cause the behaviour you report.
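One place where the distinction between the section and the whole array is visible is reallocation on assignment. The sketch below is only an illustration with made-up names (section_demo, fill_section, assign_whole); it does not show whether a given compiler creates a temporary, which is an implementation detail.

```fortran
module section_demo
  implicit none
contains
  ! Assignment through an array section: the size of a never changes.
  subroutine fill_section(a, v)
    real, allocatable, intent(inout) :: a(:)
    real, intent(in) :: v
    a(:) = v
  end subroutine fill_section

  ! Whole-array assignment from an array expression: under Fortran 2003
  ! rules, a is reallocated if its shape differs from that of src.
  subroutine assign_whole(a, src)
    real, allocatable, intent(inout) :: a(:)
    real, intent(in) :: src(:)
    a = src
  end subroutine assign_whole
end module section_demo

program section_vs_whole
  use section_demo
  implicit none
  real, allocatable :: a(:)

  allocate(a(5))
  call fill_section(a, 0.0)
  print *, 'after a(:) = 0.0, size =', size(a)

  call assign_whole(a, [1.0, 2.0, 3.0])
  print *, 'after a = src,    size =', size(a)
end program section_vs_whole
```

With Fortran 2003 reallocation semantics enabled (the default in recent gfortran, as far as I know), the second size should come out as 3; compilers or flags that disable reallocation on assignment behave differently.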
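Incidentally, you might try the statement
    allocate(isensit%sa(isensit%nnz),source=0.0,stat=ierr)
which should, if your compiler is bang up to date, do the allocation and set the values in the array in one statement. The source expression has to match the type and kind of the array, so write 0.0d0 instead if sa is double precision.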
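A minimal sketch of that one-statement form, with both the status and the error message checked. It assumes sa is double precision, as in the question's example, hence the 0.0_dp source; safe_alloc and alloc_zeroed are hypothetical names.

```fortran
module safe_alloc
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Allocate a(1:n) and set every element to zero in a single statement.
  ! ierr is zero on success; msg holds the runtime's message otherwise.
  subroutine alloc_zeroed(a, n, ierr, msg)
    real(dp), allocatable, intent(out) :: a(:)
    integer, intent(in) :: n
    integer, intent(out) :: ierr
    character(len=*), intent(out) :: msg
    msg = ''
    allocate(a(n), source=0.0_dp, stat=ierr, errmsg=msg)
  end subroutine alloc_zeroed
end module safe_alloc

program alloc_with_source
  use safe_alloc
  implicit none
  real(dp), allocatable :: sa(:)
  integer :: ierr
  character(len=128) :: msg

  call alloc_zeroed(sa, 1000, ierr, msg)
  if (ierr /= 0) then
    print *, 'Memory error: ', trim(msg)
  else
    print *, 'allocated', size(sa), 'elements, sum =', sum(sa)
  end if
end program alloc_with_source
```

If the kernel kills the process while the source values are being written, no status is returned at all, so stat is best treated as a best-effort check on a system that overcommits memory.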
Oh, and an entirely gratuitous remark: prefer 'use mpi' (or 'use mpi_mod', or whatever your installation prefers) to 'include mpif.h'. This will forestall (many) errors which might arise from mismatching calls to MPI routines with their requirements. Use association of the routines means that the compiler can check argument matching; inclusion of a header file does not.