I use an MPI (mpi4py) script (on a single node) that works with a very large object. To let all processes have access to the object, I distribute it through comm.bcast(). This copies the object to every process and consumes a lot of memory, especially during the copying itself. Therefore, I would like to share something like a pointer instead of the object itself. I found memoryview useful to speed up work with the object inside a process. Also, the object's real memory address is accessible through the memoryview object's string representation and can be distributed like this:
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank:
    content_pointer = comm.bcast(root=0)
    print(rank, content_pointer)
else:
    content = ''.join(['a' for i in range(100000000)]).encode()
    mv = memoryview(content)
    print(mv)
    comm.bcast(str(mv).split()[-1][:-1], root=0)
This prints:
<memory at 0x7f362a405048>
1 0x7f362a405048
2 0x7f362a405048
...
That's why I believe there must be a way to reconstitute the object in another process. However, I cannot find a clue in the documentation on how to do it.
In short, my question is: is it possible to share an object between processes on the same node in mpi4py?
Here's a simple example of shared memory using MPI, very slightly modified from https://groups.google.com/d/msg/mpi4py/Fme1n9niNwQ/lk3VJ54WAQAJ
You can run it with: mpirun -n 2 python3 shared_memory_test.py
(assuming you saved it as shared_memory_test.py)
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD

# create a shared array of size 1000 elements of type double
size = 1000
itemsize = MPI.DOUBLE.Get_size()
if comm.Get_rank() == 0:
    nbytes = size * itemsize
else:
    nbytes = 0

# on rank 0, create the shared block
# on rank 1 get a handle to it (known as a window in MPI speak)
win = MPI.Win.Allocate_shared(nbytes, itemsize, comm=comm)

# create a numpy array whose data points to the shared mem
buf, itemsize = win.Shared_query(0)
assert itemsize == MPI.DOUBLE.Get_size()
ary = np.ndarray(buffer=buf, dtype='d', shape=(size,))

# in process rank 1:
# write the numbers 0.0, 1.0, ..., 4.0 to the first 5 elements of the array
if comm.rank == 1:
    ary[:5] = np.arange(5)

# wait in process rank 0 until process 1 has written to the array
comm.Barrier()

# check that the array is actually shared and process 0 can see
# the changes made in the array by process 1
if comm.rank == 0:
    print(ary[:10])
Should output this (printed from process rank 0):
[0. 1. 2. 3. 4. 0. 0. 0. 0. 0.]
I don't really know much about mpi4py, but from the MPI point of view this should not be possible. MPI stands for Message Passing Interface, which means exactly that: passing messages between processes. You could try MPI one-sided communication to emulate something like globally accessible memory, but otherwise a process's memory is unavailable to other processes.
If you need to rely on a large block of shared memory, you need to utilize something like OpenMP or threads, which you absolutely could use on a single node. A hybrid parallelization with MPI and some shared-memory parallelization would give you one shared memory block per node while still keeping the option to utilize many nodes.