Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calling a subprocess within a script using mpi4py

I’m having trouble calling an external program from my python script in which I want to use mpi4py to distribute the workload among different processors.

Basically, I want to use my script such that each core prepares some input files for calculations in separate folders, then starts an external program in this folder, waits for the output, and then, finally, reads the results and collects them.

However, I simply cannot get the external program call to work. On my search for a solution to this problem I've found that the problems I'm facing seem to be quite fundamental. The following simple example makes this clear:

#!/usr/bin/env python
import subprocess

subprocess.call(“EXTERNAL_PROGRAM”, shell=True)
subprocess.call(“echo test”, shell=True)

./script.py works fine (both calls work), while mpirun -np 1 ./script.py only outputs test. Is there any workaround for this situation? The program is definitely in my PATH, but it also fails if I use the abolute path for the call.

This SO question seems to be related, sadly there are no answers...

EDIT:

In the original version of my question I’ve not included any code using mpi4py, even though I mention this module in the title. So here is a more elaborate example of the code:

#!/usr/bin/env python

import os
import subprocess

from mpi4py import MPI


def worker(parameter=None):
    """Make new folder, cd into it, prepare the config files and execute the
    external program."""

    cwd = os.getcwd()
    dir = "_calculation_" + parameter
    dir = os.path.join(cwd, dir)
    os.makedirs(dir)
    os.chdir(dir)

    # Write input for simulation & execute
    subprocess.call("echo {} > input.cfg".format(parameter), shell=True)
    subprocess.call("EXTERNAL_PROGRAM", shell=True)

    # After the program is finished, do something here with the output files
    # and return the data. I'm using the input parameter as a dummy variable
    # for the processed output.
    data = parameter

    os.chdir(cwd)

    return data


def run_parallel():
    """Iterate over job_args in parallel."""

    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    rank = comm.Get_rank()

    if rank == 0:
        # Here should normally be a list with many more entries, subdivided
        # among all the available cores. I'll keep it simple here, so one has
        # to run this script with mpirun -np 2 ./script.py
        job_args = ["a", "b"]
    else:
        job_args = None

    job_arg = comm.scatter(job_args, root=0)
    res = worker(parameter=job_arg)
    results = comm.gather(res, root=0)

    print res
    print results

if __name__ == '__main__':
    run_parallel()

Unfortunately I cannot provide more details of the external executable EXTERNAL_PROGRAM other than that it is a C++ application which is MPI enabled. As written in the comment section below, I suspect that this is the reason (or one of the resons) why my external program call is basically ignored.

Please note that I’m aware of the fact that in this situation, nobody can reproduce my exact situation. Still, however, I was hoping that someone here already ran into similar problems and might be able to help.

For completeness, the OS is Ubuntu 14.04 and I’m using OpenMPI 1.6.5.

like image 569
nilfisque Avatar asked Nov 10 '22 21:11

nilfisque


1 Answers

In your first example you might be able to do this:

#!/usr/bin/env python
import subprocess

subprocess.call(“EXTERNAL_PROGRAM && echo test”, shell=True)

The python script is only facilitating the MPI call. You could just as well write a bash script with command “EXTERNAL_PROGRAM && echo test” and mpirun the bash script; it would be equivalent to mpirunning the python script.

The second example will not work if EXTERNAL_PROGRAM is MPI enabled. When using mpi4py it will initialize the MPI. You cannot spawn another MPI program once you initialized the MPI environment in such a manner. You could spawn using MPI_Comm_spawn or MPI_Comm_spawn_multiple and -up option to mpirun. For mpi4py refer to Compute PI example for spawning (use MPI.COMM_SELF.Spawn).

like image 145
Al Conrad Avatar answered Nov 14 '22 22:11

Al Conrad