I have a Fortran MPI code that solves a flow field.
At the start I want to read data from a file and distribute it to the participating processes.
The data consists of several 3-D arrays (velocities in x, y, z).
Every process stores only a part of the array.
If every process simply reads the file (the easiest way, I think), it will not work: each process will store only the first part of the file, as much as its local arrays can hold.
Can MPI_Bcast work for 3-D arrays? But then things become complex.
Or is there an easier way?
You have, broadly speaking, two or three choices, depending on your platform.
- One process reads the input data and sends (parts of) it to the other processes. I wouldn't usually use broadcast for this, since it is a collective operation in which all processes have to take part and all receive the same data. I'd usually just send the necessary information to each process. If it is convenient (and not a memory issue) you could certainly broadcast all the input data to all the processes; it's just not a pattern of operation that I use or see much.
- All processes read the data that they require. This may involve a process reading an entire input file and storing only those parts it requires. But if you have very large input files, you can write routines to read only the necessary part into each process's memory space. This approach may involve processes competing for disk access, which is only slow in a relative sense: if you are running large-scale, long-running parallel computations, waiting a few seconds while all the processes get their data is not much of an overhead.
- If you have a parallel file system then you can use MPI's parallel I/O routines so that each process reads only those parts of the input data that it requires.
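As a rough illustration of the first option, here is a minimal sketch in which rank 0 reads the file one slab at a time and sends each slab to its owner with plain MPI_Send/MPI_Recv. Everything specific in it is an assumption made for the example, not something from your code: the array sizes `nx`, `ny`, `nz`, the file name `velocity_u.bin`, a raw binary (stream) file holding a single double-precision array in Fortran column-major order, and a 1-D decomposition into slabs along z. Slabs along the last dimension are contiguous both in memory and in the file, which is what makes the plain sends work; a decomposition in x or y would need derived datatypes or packing.

```fortran
program distribute_slabs
  use mpi
  implicit none
  ! Hypothetical global sizes and file name: replace with your own.
  integer, parameter :: nx = 64, ny = 64, nz = 64
  character(len=*), parameter :: fname = 'velocity_u.bin'
  integer :: ierr, rank, nprocs, p, unit_in, nz_loc, nz_p
  integer :: status(MPI_STATUS_SIZE)
  double precision, allocatable :: u_loc(:,:,:), buf(:,:,:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! Each process allocates only its own slab of z-planes.
  nz_loc = slab_size(rank)
  allocate(u_loc(nx, ny, nz_loc))

  if (rank == 0) then
    open(newunit=unit_in, file=fname, access='stream', &
         form='unformatted', status='old', action='read')
    ! Slabs are stored in rank order, so rank 0's slab comes first.
    read(unit_in) u_loc
    do p = 1, nprocs - 1
      nz_p = slab_size(p)
      allocate(buf(nx, ny, nz_p))
      read(unit_in) buf          ! next slab in the file belongs to rank p
      call MPI_Send(buf, nx*ny*nz_p, MPI_DOUBLE_PRECISION, p, 0, &
                    MPI_COMM_WORLD, ierr)
      deallocate(buf)
    end do
    close(unit_in)
  else
    call MPI_Recv(u_loc, nx*ny*nz_loc, MPI_DOUBLE_PRECISION, 0, 0, &
                  MPI_COMM_WORLD, status, ierr)
  end if

  call MPI_Finalize(ierr)

contains

  ! Number of z-planes owned by rank r; the first mod(nz, nprocs)
  ! ranks get one extra plane when nz does not divide evenly.
  integer function slab_size(r)
    integer, intent(in) :: r
    slab_size = nz / nprocs
    if (r < mod(nz, nprocs)) slab_size = slab_size + 1
  end function slab_size

end program distribute_slabs
```

Because rank 0 holds only one slab at a time, the full array never has to fit in a single process's memory. For several arrays (u, v, w) you would repeat the read-and-send loop once per array, provided the file stores them one after another.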