
mpi4py or multiprocessing in Python?

I am writing a machine learning toolkit that runs an algorithm with different settings in parallel (each process runs the algorithm for one setting). I am deciding between mpi4py and Python's built-in multiprocessing.

These are the pros and cons I am considering.

  1. Ease of use:

    • mpi4py: there seem to be more concepts to learn and a few more tricks needed to make it work well
    • multiprocessing: a fairly easy and clean API
  2. Speed:

    • mpi4py: people say it is lower level, so I expect it could be faster than Python's multiprocessing?
    • multiprocessing: much slower than mpi4py?
  3. Clean and short code:

    • mpi4py: seems to require more code
    • multiprocessing: preferred, thanks to the easy-to-use API

The working context is that I mainly aim to run the code on a single computer or a GPU server. I am not really targeting multiple machines on a network (which only MPI can do).

Since the main goal is machine learning, the parallelization does not need to be optimal. What I want is a code base that is easy, clean, and quick to maintain, while still getting the benefits of parallelization.

Given this background, is multiprocessing good enough? Or is there a strong reason to use mpi4py?

Xingdong asked Jun 10 '18 19:06



1 Answer

With mpi4py you can divide the task across multiple processes (MPI ranks are processes, not threads), but on a single computer with limited performance or few cores the benefit will be limited. However, you might still find it handy during training.
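For comparison, here is a minimal sketch of the single-machine case from the question using only the standard library's multiprocessing, with one task per setting. The `run_setting` function, its placeholder score, and the `(lr, depth)` grid are hypothetical stand-ins for the real algorithm, not part of the original question.

```python
from itertools import product
from multiprocessing import Pool


def run_setting(setting):
    """Hypothetical stand-in for one run of the algorithm with one setting."""
    lr, depth = setting
    # Placeholder "score"; a real toolkit would train and evaluate a model here.
    score = 1.0 / (1.0 + lr * depth)
    return {"lr": lr, "depth": depth, "score": score}


def run_all(settings, processes=None):
    """Run every setting in a pool of worker processes (default: one per core)."""
    with Pool(processes=processes) as pool:
        # map() hands the settings out to the workers and keeps the input order.
        return pool.map(run_setting, settings)


if __name__ == "__main__":
    # Assumed example grid: 2 learning rates x 2 depths = 4 independent runs.
    grid = list(product([0.01, 0.1], [2, 4]))
    for result in run_all(grid):
        print(result)
```

Since the runs are independent and only the final results need to come back, a pool like this is probably all the coordination the toolkit needs on one machine.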

mpi4py is constructed on top of the MPI-1/2 specifications and provides an object-oriented interface which closely follows the MPI-2 C++ bindings.

MPI for Python provides MPI bindings for the Python language, allowing programmers to exploit multiple-processor computing systems. MPI for Python supports convenient, pickle-based communication of generic Python objects as well as fast, near C-speed, direct array data communication of buffer-provider objects.
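As a rough illustration of the pickle-based side of that description, the sketch below spreads a list of settings over MPI ranks with the lowercase `scatter` and `gather` methods, which accept generic Python objects. The `train` function, the settings dictionaries, and the script name are made up for the example, and it assumes mpi4py plus an MPI runtime are installed.

```python
def train(setting):
    """Hypothetical stand-in for one run of the algorithm."""
    return {"setting": setting, "score": setting["lr"] * setting["epochs"]}


def run_settings(comm, settings):
    """Split `settings` across ranks, run them, and collect results on rank 0."""
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Only the root rank builds the chunks: one list of settings per rank.
    chunks = [settings[i::size] for i in range(size)] if rank == 0 else None

    # Lowercase scatter/gather pickle arbitrary Python objects.
    my_settings = comm.scatter(chunks, root=0)
    my_results = [train(s) for s in my_settings]
    gathered = comm.gather(my_results, root=0)

    if rank != 0:
        return None
    # Flatten the per-rank result lists into a single list.
    return [r for part in gathered for r in part]


if __name__ == "__main__":
    try:
        from mpi4py import MPI  # third-party; needs an MPI installation
    except ImportError:
        print("mpi4py is not installed; skipping the demo")
    else:
        # Assumed example settings, not taken from the question.
        settings = [{"lr": lr, "epochs": 10} for lr in (0.01, 0.1, 0.5, 1.0)]
        results = run_settings(MPI.COMM_WORLD, settings)
        if results is not None:
            print(results)
```

Assuming the file is saved as `sweep.py`, it would typically be launched with something like `mpiexec -n 4 python sweep.py`. The uppercase methods (for example `Send`, `Recv`, `Bcast`) are the ones that take buffer-provider objects such as NumPy arrays and avoid the pickling overhead.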

Ansif_Muhammed answered Sep 28 '22 06:09
