
Segmentation fault for array, but only if a component of a derived type

Pretty simple setup, using gfortran 4.8.5 on Linux (Red Hat):

  • I get a segfault if my array of reals (inside a derived type) has size > 2,000,000. This seems to be a standard stack/heap issue, as my stack size is 8 MB according to ulimit.

  • There is no problem if the array is NOT inside a derived type

  • Note that as @francescalus guesses, removing the initial value = 0.0 eliminates the problem

Edit to add: Note that I have posted a follow-up question, "Segmentation fault related to component of derived type", that represents a more realistic use case and further narrows down the conditions under which this seems to occur.

program main

    call sub1     ! seg fault  if col size >   2,100,000
    call sub2     ! works fine at col size = 100,000,000  

end program main

subroutine sub1

    type table
        real :: col(2100000) = 0.0     ! works if "= 0.0" removed
    end type table

    type(table) :: table1
    table1%col = 1.0

end subroutine sub1

subroutine sub2
    real :: col(100000000) = 0.0
    col = 1.0
end subroutine sub2

Some obvious questions here:

  • Is this expected behavior, or some bug that was fixed in newer versions of gfortran?

  • Am I following standard Fortran operating procedures here, or doing something wrong?

  • What is the recommended way to avoid this (please assume that I am unable to update to a newer version of gfortran in the near term)? I will almost certainly solve this with an allocatable array component for reasons not specific to this question, but that might not be an ideal general solution and I would like to know all the good options I have here.

  • In particular, is initializing the components of a derived type bad practice?

Asked by JohnE, Aug 23 '18 12:08


1 Answer

This is likely to be a runtime issue due to insufficient stack, rather than a bug with gfortran.

Gfortran uses the stack to store automatic arrays and other initialization data. When code runs fine while such an array is small but segfaults once the array grows, running out of stack is a likely cause.

The issue seems to be the same in more recent versions of gfortran. I compiled and ran your program with gfortran 4.8.4, 4.9.3, 5.5.0, 6.4.0, 7.3.0 and 8.2.0. In all cases I obtained a segmentation fault with the default stack size, but no error when the stack size was slightly increased.

$  ./sfa
Segmentation fault
$ ulimit -s
8192
$ ulimit -s 8256 
$ ./sfa && echo "DONE"
DONE
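
The numbers in the transcript above are consistent with simple arithmetic on the array size. The following is only a rough sketch of that arithmetic, not a description of how gfortran actually lays out the stack: it assumes the default real kind occupies 4 bytes (the usual case, but compiler dependent) and ignores everything else that lives on the stack. The program name and variable names are illustrative.

```fortran
program stack_budget
    implicit none
    integer, parameter :: n = 2100000   ! elements in table%col from the question
    integer, parameter :: kib = 1024    ! ulimit -s reports kilobytes of 1024 bytes
    integer :: bytes_per_real
    integer :: array_bytes

    ! storage_size returns bits; the default real is assumed to be 32 bits here
    bytes_per_real = storage_size(1.0) / 8
    array_bytes = n * bytes_per_real

    print *, 'array bytes        :', array_bytes
    print *, 'stack at 8192 KiB  :', 8192 * kib
    print *, 'stack at 8256 KiB  :', 8256 * kib
    print *, 'fits in 8192 KiB?  :', array_bytes <= 8192 * kib
    print *, 'fits in 8256 KiB?  :', array_bytes <= 8256 * kib
end program stack_budget
```

With a 4-byte real the array needs 8,400,000 bytes, which is just above the 8,388,608 bytes of an 8192 KiB stack and just below the 8,454,144 bytes of an 8256 KiB stack. That would explain why a small increase is enough, if a single copy of the array is what ends up on the stack.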

Your problem may be solved by running

$ ulimit -s unlimited

before executing your binary. I am not aware of any particular penalty for doing this, but programmers more aware of the fine details of memory management, such as compiler developers, may think otherwise.

Initializing the components of a derived type is not bad practice, but as you can see, it can cause stack problems when the component is a big array, whether because of the storage of the component itself or because of the temporary storage needed for the right-hand side of the assignment. If the component is made allocatable and allocated in a subroutine, the array is stored on the heap rather than on the stack, and this issue is usually avoided. In that case, the values of the array are set dynamically in a subroutine rather than through a default initializer at compile time. It may be less elegant, but I think it's worth it: it is a typical example of development work that prevents avoidable, environment-related errors when executing the binary.
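
A minimal sketch of the allocatable approach could look like the following. The module name table_mod, the subroutine init_table and the size of 10,000,000 elements are made up for illustration and are not taken from the question; the type and component names follow the question's code.

```fortran
module table_mod
    implicit none

    type table
        ! No default initializer: storage is obtained on the heap by allocate
        real, allocatable :: col(:)
    end type table

contains

    ! Hypothetical helper: allocates the component and sets its initial values
    subroutine init_table(t, n)
        type(table), intent(out) :: t
        integer, intent(in) :: n

        allocate(t%col(n))
        t%col = 0.0
    end subroutine init_table

end module table_mod

program main
    use table_mod
    implicit none
    type(table) :: table1

    ! About 40 MB with a 4-byte real, well beyond an 8 MB stack
    call init_table(table1, 10000000)
    table1%col = 1.0

    print *, size(table1%col), table1%col(1)
end program main
```

Putting the type and its initializer in a module also gives the subroutine an explicit interface, which is needed anyway for a dummy argument of a type with allocatable components.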

Your code above is standards compliant. As explained in the comments, lack of explicit interfaces for subroutines is not good practice, but for these simple subroutines it's not against the rules.

Some compilers have flags that allow you to change where some objects are allocated in memory. While such a flag may fix a particular issue, flags are compiler dependent and usually not equivalent across compilers. Using dynamic memory via allocatables is a more robust solution, in my experience.

Finally, note that if you are using OpenMP, the ulimit command above only affects the master thread: you need to set the stack size of each of the other threads via the environment variable OMP_STACKSIZE, which cannot be unlimited. Bear in mind that a non-master thread running out of stack is much harder to diagnose, since the binary may stop without a proper Segmentation fault error.
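
To make the OpenMP point concrete, here is a small sketch of the kind of code where per-thread stack matters. It assumes that gfortran places each thread's private copy of a fixed-size array on that thread's stack, which is the usual behavior but is not guaranteed here; the program name and the array size are hypothetical.

```fortran
program omp_stack_sketch
    implicit none
    integer, parameter :: n = 100000   ! hypothetical size, about 400 KB per thread
    real :: work(n)
    real :: total

    total = 0.0

    ! Each thread works on its own private copy of work. Those copies are
    ! assumed to live on the per-thread stacks, so for non-master threads
    ! their size counts against OMP_STACKSIZE rather than ulimit -s.
    !$omp parallel private(work) reduction(+:total)
    work = 1.0
    total = total + sum(work)
    !$omp end parallel

    print *, 'sum over all threads:', total
end program omp_stack_sketch
```

Compiled with gfortran -fopenmp, a much larger n would be expected to need a larger OMP_STACKSIZE (for example a value such as 64M) before running. Without -fopenmp the directives are treated as comments and the program runs serially.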

Answered by jme52, Dec 28 '22 15:12