Using Python's multiprocessing.Process class

This is a newbie question:

A class is an object, so I can create a class called pippo and add functions and parameters inside it. What I don't understand is whether the functions inside pippo are executed from top to bottom when I assign x = pippo(), or whether I must call them as x.dosomething() from outside pippo.

Working with Python's multiprocessing package, is it better to define one big function and create the object using the target argument in the call to Process(), or to create your own process class by inheriting from the Process class?

user2239318 asked Jun 18 '13 15:06



1 Answer

I often wondered why Python's doc page on multiprocessing only shows the "functional" approach (using the target parameter). Probably because terse, succinct code snippets are best for illustration purposes. For small tasks that fit in one function, I can see how that is the preferred way, à la:

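from multiprocessing import Process

def f():
    print('hello')

if __name__ == '__main__':
    # The guard keeps child processes from re-running this block when the
    # module is re-imported (spawn start method, e.g. on Windows).
    p = Process(target=f)
    p.start()
    p.join()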
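
The functional style also covers functions that take arguments, since Process accepts args and kwargs alongside target. Here is a minimal sketch of that; greet and run_greeting are made-up names for illustration, not anything from the question:

```python
from multiprocessing import Process


def greet(name, punctuation='!'):
    # Hypothetical worker function; it runs inside the child process.
    print('hello ' + name + punctuation)


def run_greeting(name):
    # args is a tuple of positional arguments, kwargs a dict of keyword ones.
    p = Process(target=greet, args=(name,), kwargs={'punctuation': '?'})
    p.start()
    p.join()
    # exitcode is 0 when the target returned without raising.
    return p.exitcode


if __name__ == '__main__':
    print('exit code:', run_greeting('world'))
```

Note the trailing comma in args=(name,): args must be a tuple even when there is only one argument.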

But when you need greater code organization (for complex tasks), making your own class is the way to go:

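from multiprocessing import Process

class P(Process):
    def __init__(self):
        # Python 3 form; super(P, self).__init__() is equivalent.
        super().__init__()

    def run(self):
        # start() arranges for run() to be called in the new process.
        print('hello')

if __name__ == '__main__':
    p = P()
    p.start()
    p.join()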

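Bear in mind that each spawned process starts out with its own copy of the master process's state: under the fork start method it inherits a copy of the master's memory, and under spawn (the default on Windows and macOS) the Process object is pickled and rebuilt in a fresh interpreter. Also, the constructor code (i.e. stuff inside __init__()) is executed in the master process; only code inside run() executes in the separate process.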
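
One way to see this for yourself is to record os.getpid() in both places. The following is only a sketch: the results queue, the init_pid attribute, and the where_did_it_run helper are my own additions for illustration. It also touches on the original question: creating the object with P(...) runs only __init__(), and run() is not executed until start() is called.

```python
import os
from multiprocessing import Process, Queue


class P(Process):
    def __init__(self, results):
        super().__init__()
        self.results = results
        # Executed by whoever calls P(...), i.e. the master process.
        self.init_pid = os.getpid()

    def run(self):
        # Executed only after start(), inside the new process.
        self.results.put((self.init_pid, os.getpid()))


def where_did_it_run():
    results = Queue()
    p = P(results)          # only __init__() has run at this point
    p.start()               # now run() is executed in a separate process
    init_pid, run_pid = results.get(timeout=10)
    p.join()
    return init_pid, run_pid


if __name__ == '__main__':
    init_pid, run_pid = where_did_it_run()
    print('master pid:         ', os.getpid())
    print('__init__() ran in:  ', init_pid)
    print('run() ran in:       ', run_pid)
```

The first two numbers should match, and the third should differ.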

Therefore, if a process (master or spawned) changes one of its member variables, the change will not be reflected in other processes. This, of course, holds only for ordinary Python objects, like bool, str, list, etc. You can, however, import "special" data structures from the multiprocessing module, such as Value and Array, which are then transparently shared between processes (see "Sharing state between processes" in the multiprocessing documentation). Or you can set up your own channels of IPC (inter-process communication), such as multiprocessing.Pipe and multiprocessing.Queue.
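
To make the difference concrete, here is a rough sketch that puts an ordinary attribute, a multiprocessing.Value, and a multiprocessing.Queue side by side. The Counter class and the demo function are hypothetical names I made up for this illustration:

```python
from multiprocessing import Process, Queue, Value


class Counter(Process):
    def __init__(self, shared, messages):
        super().__init__()
        self.plain = 0            # ordinary attribute: every process has its own copy
        self.shared = shared      # multiprocessing.Value: lives in shared memory
        self.messages = messages  # multiprocessing.Queue: explicit IPC channel

    def run(self):
        self.plain += 1                # changes the child's copy only
        with self.shared.get_lock():   # guard the read-modify-write
            self.shared.value += 1     # the master sees this change
        self.messages.put(self.plain)  # send the child's value back explicitly


def demo():
    shared = Value('i', 0)  # 'i' is the typecode for a C int
    messages = Queue()
    c = Counter(shared, messages)
    c.start()
    child_plain = messages.get(timeout=10)
    c.join()
    # c.plain is the master's copy and was never touched by the child.
    return c.plain, shared.value, child_plain


if __name__ == '__main__':
    print(demo())
```

demo() returns (0, 1, 1): the master's plain attribute is still 0, the shared Value has become 1, and the 1 that the child computed for its own plain copy only reaches the master because it was sent through the Queue.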

Velimir Mlaker answered Oct 27 '22 22:10