I am starting to use MPI-IO and tried to write a very simple example of the things I'd like to do with it; however, even though it is a simple code and I took some inspiration from examples I read here and there, I get a segmentation fault I do not understand.
The logic of the piece of code is very simple: each process handles a local array which is part of a global array I want to write. I create a subarray type using MPI_Type_Create_Subarray to describe its block of the file. Then I just open the file, set the view and try to write the data. The segmentation fault occurs during the MPI_File_Write_All call.
Here is the code:
program test
  implicit none

  include "mpif.h"

  integer :: myrank, nproc, fhandle, ierr
  integer :: xpos, ypos
  integer, parameter :: loc_x = 10, loc_y = 10
  integer :: loc_dim
  integer :: nx = 2, ny = 2
  real(8), dimension(loc_x, loc_y) :: data
  integer :: written_arr
  integer, dimension(2) :: wa_size, wa_subsize, wa_start

  call MPI_Init(ierr)
  call MPI_Comm_Rank(MPI_COMM_WORLD, myrank, ierr)
  call MPI_Comm_Size(MPI_COMM_WORLD, nproc, ierr)

  xpos = mod(myrank, nx)
  ypos = mod(myrank/nx, ny)
  data = myrank
  loc_dim = loc_x*loc_y

  wa_size = (/ nx*loc_x, ny*loc_y /)
  wa_subsize = (/ loc_x, loc_y /)
  wa_start = (/ xpos, ypos /)*wa_subsize

  call MPI_Type_Create_Subarray(2, wa_size, wa_subsize, wa_start &
       , MPI_ORDER_FORTRAN, MPI_DOUBLE_PRECISION, written_arr, ierr)
  call MPI_Type_Commit(written_arr, ierr)

  call MPI_File_Open(MPI_COMM_WORLD, "file.dat" &
       , MPI_MODE_WRONLY + MPI_MODE_CREATE, MPI_INFO_NULL, fhandle, ierr)
  call MPI_File_Set_View(fhandle, 0, MPI_DOUBLE_PRECISION, written_arr &
       , "native", MPI_INFO_NULL, ierr)
  call MPI_File_Write_All(fhandle, data, loc_dim, MPI_DOUBLE_PRECISION &
       , MPI_INFO_NULL, ierr)
  call MPI_File_Close(fhandle, ierr)

  call MPI_Finalize(ierr)

end program test
Any help would be highly appreciated!
The last argument to MPI_FILE_WRITE_ALL before the error output argument is an MPI status object, not an MPI info object, so making the call with MPI_INFO_NULL is erroneous. If you are not interested in the status of the write operation, you should pass MPI_STATUS_IGNORE instead.
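Applied to the code in the question, the corrected collective write (only the status argument changes) would look like:

call MPI_File_Write_All(fhandle, data, loc_dim, MPI_DOUBLE_PRECISION &
     , MPI_STATUS_IGNORE, ierr)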
Making the call with MPI_INFO_NULL might happen to work in some MPI implementations because of the specifics of how both constants are defined, but it fails in others.
For example, in Open MPI MPI_INFO_NULL is declared as:
parameter (MPI_INFO_NULL=0)
When passed instead of MPI_STATUS_IGNORE, it causes the C implementation of MPI_File_write_all to be called with the status argument pointing to a constant (read-only) memory location that holds the value of MPI_INFO_NULL (that is how Fortran implements passing constants by address). When the C function is about to finish, it tries to fill the status object, which results in an attempt to write to that constant memory and ultimately leads to the segmentation fault.
When writing new Fortran programs it is advisable not to use the very old mpif.h interface, as it does not provide any error checking. Rather, one should use the mpi module, or even mpi_f08 when more MPI implementations become MPI-3.0 compliant. The beginning of your program should therefore look like:
program test
  use mpi
  implicit none
  ...
end program test
Once you use the mpi module instead of mpif.h, the compiler is able to perform parameter type checking for some MPI calls, including MPI_FILE_SET_VIEW, and spot an error:
test.f90(34): error #6285: There is no matching specific subroutine for this generic subroutine call. [MPI_FILE_SET_VIEW]
call MPI_File_Set_View(fhandle, 0, MPI_DOUBLE_PRECISION, written_arr &
-------^
compilation aborted for test.f90 (code 1)
The reason is that the second argument to MPI_FILE_SET_VIEW is of type INTEGER(KIND=MPI_OFFSET_KIND), which is 64-bit on most modern platforms. The constant 0 is simply of type INTEGER and is therefore 32-bit on most platforms. What happens with mpif.h is that the compiler passes a pointer to an INTEGER constant with a value of 0, but the subroutine interprets it as a pointer to a larger integer and treats the neighbouring memory as part of the constant's value. Thus the zero that you pass as an offset inside the file ends up being a non-zero value.
Replace the 0 in the MPI_FILE_SET_VIEW call with 0_MPI_OFFSET_KIND, or declare a constant of type INTEGER(KIND=MPI_OFFSET_KIND) with a value of zero and then pass it:
call MPI_File_Set_View(fhandle, 0_MPI_OFFSET_KIND, MPI_DOUBLE_PRECISION, ...
or
integer(kind=MPI_OFFSET_KIND), parameter :: zero_off = 0
...
call MPI_File_Set_View(fhandle, zero_off, MPI_DOUBLE_PRECISION, ...
Both methods lead to an output file of size 3200 bytes (as expected).
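For completeness, here is a sketch of the program with both fixes applied (the mpi module, MPI_STATUS_IGNORE as the status argument, and a 64-bit zero offset in MPI_FILE_SET_VIEW); run on four processes it should produce the same 3200-byte file:

program test
  use mpi
  implicit none

  integer :: myrank, nproc, fhandle, ierr
  integer :: xpos, ypos
  integer, parameter :: loc_x = 10, loc_y = 10
  integer :: loc_dim
  integer :: nx = 2, ny = 2
  real(8), dimension(loc_x, loc_y) :: data
  integer :: written_arr
  integer, dimension(2) :: wa_size, wa_subsize, wa_start

  call MPI_Init(ierr)
  call MPI_Comm_Rank(MPI_COMM_WORLD, myrank, ierr)
  call MPI_Comm_Size(MPI_COMM_WORLD, nproc, ierr)

  xpos = mod(myrank, nx)
  ypos = mod(myrank/nx, ny)
  data = myrank
  loc_dim = loc_x*loc_y

  wa_size = (/ nx*loc_x, ny*loc_y /)
  wa_subsize = (/ loc_x, loc_y /)
  wa_start = (/ xpos, ypos /)*wa_subsize

  call MPI_Type_Create_Subarray(2, wa_size, wa_subsize, wa_start, &
       MPI_ORDER_FORTRAN, MPI_DOUBLE_PRECISION, written_arr, ierr)
  call MPI_Type_Commit(written_arr, ierr)

  call MPI_File_Open(MPI_COMM_WORLD, "file.dat", &
       MPI_MODE_WRONLY + MPI_MODE_CREATE, MPI_INFO_NULL, fhandle, ierr)
  ! 64-bit zero offset instead of the plain INTEGER constant 0
  call MPI_File_Set_View(fhandle, 0_MPI_OFFSET_KIND, MPI_DOUBLE_PRECISION, &
       written_arr, "native", MPI_INFO_NULL, ierr)
  ! MPI_STATUS_IGNORE instead of MPI_INFO_NULL as the status argument
  call MPI_File_Write_All(fhandle, data, loc_dim, MPI_DOUBLE_PRECISION, &
       MPI_STATUS_IGNORE, ierr)
  call MPI_File_Close(fhandle, ierr)

  call MPI_Finalize(ierr)

end program test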