I haven't worked with distributed computing before, but I'm trying to integrate mpi4py into a program in order to parallelize a for loop on a compute cluster.
This is a pseudocode of what I want to do:
for file in directory:
Initialize a class
Run class methods
Conglomerate results
I've looked all over stack overflow and I can't find any solution to this. Is there any way to do this simply with mpi4py, or is there another tool that can do it with easy installation and setup?
In order to achieve parallelism of a for loop with MPI4Py check the code example below. Its just a for loop to add some numbers. The for loop will execute in every node. Every node will get a different chunk of data to work with (range in for loop). In the end Node with rank zero will add the results from all the nodes.
#!/usr/bin/python
import numpy
from mpi4py import MPI
import time
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
a = 1
b = 1000000
perrank = b//size
summ = numpy.zeros(1)
comm.Barrier()
start_time = time.time()
temp = 0
for i in range(a + rank*perrank, a + (rank+1)*perrank):
temp = temp + i
summ[0] = temp
if rank == 0:
total = numpy.zeros(1)
else:
total = None
comm.Barrier()
#collect the partial results and add to the total sum
comm.Reduce(summ, total, op=MPI.SUM, root=0)
stop_time = time.time()
if rank == 0:
#add the rest numbers to 1 000 000
for i in range(a + (size)*perrank, b+1):
total[0] = total[0] + i
print ("The sum of numbers from 1 to 1 000 000: ", int(total[0]))
print ("time spent with ", size, " threads in milliseconds")
print ("-----", int((time.time()-start_time)*1000), "-----")
In order to execute the code above you should run it like this:
$ qsub -q qexp -l select=4:ncpus=16:mpiprocs=16:ompthreads=1 -I # Salomon: ncpus=24:mpiprocs=24
$ ml Python
$ ml OpenMPI
$ mpiexec -bycore -bind-to-core python hello_world.py
In this example, we run MPI4Py enabled code on 4 nodes, 16 cores per node (total of 64 processes), each python process is bound to a different core.
Sources that may help you:
Submit job with python code (mpi4py) on HPC cluster
https://github.com/JordiCorbilla/mpi4py-examples/tree/master/src/examples/matrix%20multiplication
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With