Fortran memory allocation does not give an error, but the program is killed by OS at initialization

Given the minimal working example below, why does the memory allocation error not occur at the allocation step? When I run the code under valgrind, or add source=0.0 to the allocate statement, I get the memory allocation error as expected.

Update: I reproduced the issue with minimal working example:

 program memory_test

  implicit none

  double precision, dimension(:,:,:,:), allocatable :: sensitivity
  double precision, allocatable :: sa(:)
  double precision, allocatable :: sa2(:)

  integer :: ierr,nnz
  integer :: nx,ny,nz,ndata

  nx = 50
  ny = 50
  nz = 100
  ndata = 1600

  allocate(sensitivity(nx,ny,nz,ndata),stat=ierr)

  sensitivity = 1.0

  nnz = 100000000

  !allocate(sa(nnz),source=dble(0.0),stat=ierr)
  allocate(sa(nnz),stat=ierr)
  if(ierr /= 0) print*, 'Memory error!'

  !allocate(sa2(nnz),source=dble(0.0),stat=ierr)
  allocate(sa2(nnz),stat=ierr)
  if(ierr /= 0) print*, 'Memory error!'

  print*, 'Start initialization'

  sa = 0.0
  sa2 = 0.0

  print*, 'End initialization'

end program memory_test

When I run it, the message 'Memory error!' is never printed. I do see 'Start initialization', and then the program is killed by the OS. Only if I allocate with the 'source' parameter (as in the commented-out lines) do I get the 'Memory error!' message.

For memory statistics, the 'free' command gives me this output:

             total       used       free     shared    buffers     cached
Mem:       8169952    3630284    4539668      46240       1684     124888
-/+ buffers/cache:    3503712    4666240
Swap:            0          0          0

asked Dec 20 '22 03:12 by Vitaliy


2 Answers

You are seeing the behavior of the memory allocation strategy Linux uses. When you allocate memory but have not yet written to it, it exists only as virtual memory (this may also be affected by the particular Fortran runtime library, but I'm not sure). The memory is part of your process's virtual address space, but it is not backed by any physical memory pages. Only when you write to the memory are physical pages allocated, and only enough of them to satisfy the write.

Consider the following program:

program test
   implicit none
   real,allocatable :: array(:) 

   allocate(array(1000000000)) !4 gb array

   print *,'Check memory - 4 GB allocated'
   read *

   array(1:1000000) = 1.0

   print *,'Check memory - 4 MB assigned'
   read *

   array(1000000:100000000) = 2.0

   print *,'Check memory - 400 MB assigned'
   read *

   array = 5.0

   print *,'Check memory - 4 GB assigned'
   read *

end program

This program allocates 4 GB of memory then writes to a 4 MB array section, a 396 MB array section (total writes = 400 MB) and finally writes to the full array (total writes = 4 GB). The program pauses between each write so you can take a look at memory usage.

After the allocate, before the first write:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                           
29192 casey     20   0 3921188   1176   1052 S   0.0  0.0   0:00.00 fortranalloc

All of the memory is virtual (VIRT); only a small amount is backed by physical memory (RES).

After the 4 MB write:

29192 casey     20   0 3921188   5992   1984 S   0.0  0.0   0:00.00 fortranalloc

After the 396 MB write:

29192 casey     20   0 3921188 392752   1984 S   0.0  1.6   0:00.18 fortranalloc

After the 4 GB write:

29192 casey     20   0 3921188 3.727g   1984 S  56.6 15.8   0:01.88 fortranalloc 

Note that after each write the resident memory increases to satisfy the write. This shows that physical memory is only allocated on write, not on allocation, so a plain allocate() has no way to detect the error. When you add the source parameter to allocate, a write occurs, which forces full physical allocation of the memory, and if that fails the error can be detected.
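The same growth in resident memory can also be watched from inside the program instead of from top. The following is a minimal sketch, assuming a Linux system with /proc mounted; the module name rss_probe and the function resident_kb are hypothetical helpers written for this illustration, and the 64 MiB array size is an arbitrary, deliberately small choice.

```fortran
module rss_probe
   implicit none
   private
   public :: resident_kb
contains
   ! Return the VmRSS value (in kB) from /proc/self/status,
   ! or -1 if the file or the field is not available.
   function resident_kb() result(kb)
      integer :: kb
      integer :: unit, ios, i
      character(len=256) :: line

      kb = -1
      open(newunit=unit, file='/proc/self/status', status='old', &
           action='read', iostat=ios)
      if (ios /= 0) return
      do
         read(unit, '(A)', iostat=ios) line
         if (ios /= 0) exit
         if (line(1:6) == 'VmRSS:') then
            ! The field is tab-separated; turn tabs into blanks so the
            ! list-directed read below does not depend on the runtime.
            do i = 7, len(line)
               if (line(i:i) == achar(9)) line(i:i) = ' '
            end do
            read(line(7:), *, iostat=ios) kb
            if (ios /= 0) kb = -1
            exit
         end if
      end do
      close(unit)
   end function resident_kb
end module rss_probe

program rss_demo
   use rss_probe, only: resident_kb
   implicit none
   integer, parameter :: n = 16*1024*1024   ! 16 Mi default reals, about 64 MiB
   real, allocatable :: array(:)
   integer :: ierr, before_kb, after_kb

   allocate(array(n), stat=ierr)
   if (ierr /= 0) error stop 'allocate failed'
   before_kb = resident_kb()     ! pages reserved but not yet touched

   array = 1.0                   ! the write is what makes pages resident
   after_kb = resident_kb()

   print *, 'RSS after allocate (kB):', before_kb
   print *, 'RSS after write    (kB):', after_kb
   print *, 'checksum:', array(1) + array(n)
end program rss_demo
```

On a typical Linux system the second figure should be larger than the first by roughly the size of the array, although the exact numbers depend on the allocator and the runtime library.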

What you are likely seeing is the Linux OOM killer, which is invoked when memory is exhausted. When this happens, the OOM killer uses an algorithm to decide which process to kill to free up memory, and the behavior of your code makes it a very likely candidate. When your write triggers a physical allocation that cannot be met, your process is killed by the kernel. You see this on the write (the assignment) rather than on the allocation because of the behavior detailed above.
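Whether an untouched allocation is allowed to succeed like this depends on the kernel's overcommit policy, which Linux exposes in /proc/sys/vm/overcommit_memory. The sketch below only reads and reports that setting; it assumes a Linux system with /proc mounted, and the program name is made up for illustration.

```fortran
program overcommit_mode
   implicit none
   integer :: unit, ios, mode

   mode = -1
   open(newunit=unit, file='/proc/sys/vm/overcommit_memory', &
        status='old', action='read', iostat=ios)
   if (ios /= 0) then
      print *, 'Cannot open /proc/sys/vm/overcommit_memory (not Linux?)'
   else
      read(unit, *, iostat=ios) mode
      close(unit)
      if (ios /= 0) mode = -1
   end if

   select case (mode)
   case (0)
      print *, '0: heuristic overcommit (the usual default)'
   case (1)
      print *, '1: always overcommit'
   case (2)
      print *, '2: strict accounting, no overcommit'
   case default
      print *, 'overcommit policy could not be determined'
   end select
end program overcommit_mode
```

With policy 2 the kernel refuses requests beyond its commit limit at allocation time, so in that configuration a failure is more likely to surface through stat= on the allocate statement than as a kill during the later write.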

answered Jan 18 '23 23:01 by casey


Extended comment rather than an answer:

In Fortran, initialization has a specific meaning: it refers to setting a variable's value at declaration. So this

real :: myvar = 0.0

is initialization. While these

real :: myvar
....
myvar = 0.0

are not. Now, perhaps more relevant to the issue you report, this statement

isensit%sa(:) = 0.0

assigns the value 0.0 to every element of the array section isensit%sa(:). This is (once you get used to it) quite different from what I think you meant to write, which is:

isensit%sa = 0.0

This version assigns the value 0.0 to every element of the array isensit%sa. Because an array section, even one comprising every element of the array, is not the array, Fortran compilers may temporarily allocate space for the section while processing the assignment. This probably makes sense when you think about a more general array section.
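The distinction between the two forms can be made visible with a small test. This is only a sketch: the type sensit_t is a hypothetical stand-in for whatever isensit really is, and the last step assumes a compiler that implements Fortran 2003 reallocation on assignment (gfortran does so by default; other compilers may need a flag).

```fortran
program section_vs_whole
   implicit none

   ! Hypothetical stand-in for the derived type behind isensit.
   type :: sensit_t
      integer :: nnz = 0
      double precision, allocatable :: sa(:)
   end type sensit_t

   type(sensit_t) :: isensit
   integer :: ierr

   isensit%nnz = 5
   allocate(isensit%sa(isensit%nnz), stat=ierr)
   if (ierr /= 0) error stop 'allocate failed'

   isensit%sa    = 0.0d0     ! whole-array assignment
   isensit%sa(:) = 1.0d0     ! array-section assignment, same elements
   print *, 'size and sum after scalar assignments:', &
            size(isensit%sa), sum(isensit%sa)

   ! Only the whole-array form may change the allocation: assigning an
   ! array of a different shape reallocates the left-hand side.
   isensit%sa = [1.0d0, 2.0d0, 3.0d0]
   print *, 'size after whole-array assignment:', size(isensit%sa)
end program section_vs_whole
```

With a scalar on the right-hand side both forms set the same elements; whether the section form actually costs a temporary is up to the compiler and cannot be seen from the results alone.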

I'm not sure I understand why you think the space isn't allocated when the allocate statement executes but I suggest you sort out the assignment, then think again. And I guess that the temporary allocation of space for the array section, which will be as much space as the array itself consumes, might tip your program over the edge and cause the behaviour you report.

Incidentally, you might try the statement

allocate(isensit%sa(isensit%nnz),source=0.0,stat=ierr)

which should, if your compiler is bang up to date, do the allocation and set the values in the array in one statement.
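As a minimal sketch of that form, assuming sa is a double precision array as in the question's test program: the kind of the source expression has to match the kind of the array, so the example uses 0.0d0 rather than 0.0. The errmsg= specifier and the small value of nnz are additions made for this illustration.

```fortran
program alloc_with_source
   implicit none
   integer, parameter :: nnz = 1000     ! small on purpose for the sketch
   double precision, allocatable :: sa(:)
   integer :: ierr
   character(len=256) :: msg

   msg = ''
   ! Allocate and fill in one statement; stat= and errmsg= report failure
   ! instead of letting the runtime abort the program.
   allocate(sa(nnz), source=0.0d0, stat=ierr, errmsg=msg)
   if (ierr /= 0) then
      print *, 'Memory error: ', trim(msg)
      error stop 1
   end if

   print *, 'allocated', size(sa), 'elements; all zero:', all(sa == 0.0d0)
end program alloc_with_source
```

Checking stat= after every allocate, including the large first one in the question's program, is what makes a failure visible at the point where it is reported.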

Oh, and an entirely gratuitous remark: prefer use mpi (or use mpi_mod, or whatever your installation prefers) to include mpif.h. This will forestall (many) errors which might arise from mismatching calls to mpi routines with their requirements. Use-association of the routines means that the compiler can check argument matching; inclusion of a header file does not.

answered Jan 19 '23 00:01 by High Performance Mark