
Python how to do multiprocessing inside of a class?

I have a code structure that looks like this:

    Class A:
      def __init__(self):
        processes = []
        for i in range(1000):
          p = Process(target=self.RunProcess, args=i)
          processes.append[p]

        # Start all processes
        [x.start() for x in processes]

      def RunProcess(self, i):
        do something with i...

Main script:

    myA = A()

I can't seem to get this to run. I get a runtime error "An attempt has been made to start a new process before the current process has finished its bootstrapping phase."

How do I get multiprocessing working for this? If I use threading, it works fine, but it is as slow as running sequentially. I'm also afraid that multiprocessing will be slow too, because creating each process takes time.

Any good tips? Many thanks in advance.

asked Mar 12 '15 12:03 by TheBear




1 Answer

There are a couple of issues that I can see in your code:

  • args in Process expects a tuple, but you pass an integer. Please change line 5 to:

    p = Process(target=self.RunProcess, args=(i,))

  • list.append is a method, so its argument must be enclosed in (), not []. Please change line 6 to:

    processes.append(p)
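To see why the trailing comma matters, here is a minimal sketch in plain Python. The names show and call_like_process are hypothetical, made up for illustration. call_like_process only imitates the way a process unpacks its args when it calls the target; it is not part of multiprocessing.

```python
def show(i):
    # Stand-in for the real work done in the target function.
    return 'got %s' % i


def call_like_process(target, args):
    # Hypothetical helper: unpacks args into positional arguments,
    # which is why args has to be a tuple (or another iterable).
    return target(*args)


i = 5
grouped = (i)      # parentheses alone only group: this is still the int 5
one_tuple = (i,)   # the trailing comma is what makes a one-element tuple

print(type(grouped).__name__)              # int
print(type(one_tuple).__name__)            # tuple
print(call_like_process(show, one_tuple))  # got 5
```

Passing grouped instead of one_tuple to call_like_process fails with a TypeError, because an int cannot be unpacked with *.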

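As @qarma points out, it's not good practice to start the processes in the class constructor. The bootstrapping error itself appears when processes are started at import time without an if __name__ == '__main__': guard, which is required on platforms that spawn a fresh interpreter for each child (such as Windows). I would structure the code as follows (adapting your example):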

    import multiprocessing as mp
    from time import sleep


    class A(object):
        def __init__(self, *args, **kwargs):
            # do other stuff
            pass

        def do_something(self, i):
            sleep(0.2)
            print('%s * %s = %s' % (i, i, i*i))

        def run(self):
            processes = []

            for i in range(1000):
                p = mp.Process(target=self.do_something, args=(i,))
                processes.append(p)

            # Start all processes, then wait for every one to finish
            for p in processes:
                p.start()
            for p in processes:
                p.join()


    # The guard keeps child processes from re-running this block on import
    if __name__ == '__main__':
        a = A()
        a.run()
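On the worry about process creation cost: one common alternative is a pool with a fixed number of workers, so that processes are created once and reused for all tasks. The following is only a sketch under assumptions. The squaring workload, the return value, and the n and workers parameters are placeholders that are not in the original question, and the right number of workers depends on your machine.

```python
import multiprocessing as mp


class A(object):
    def do_something(self, i):
        # Placeholder workload (an assumption): return the square of i.
        return i * i

    def run(self, n=1000, workers=4):
        # The pool creates `workers` processes once and reuses them
        # for all n tasks instead of starting one process per task.
        with mp.Pool(processes=workers) as pool:
            # map blocks until every task is done and keeps input order.
            return pool.map(self.do_something, range(n))


if __name__ == '__main__':
    a = A()
    results = a.run(n=10, workers=2)
    print(results[:5])  # [0, 1, 4, 9, 16]
```

Because the bound method self.do_something is sent to the workers, the instance has to be picklable. If your class holds open files, sockets or locks, a module-level function is usually the simpler target.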
answered Sep 20 '22 21:09 by Haleemur Ali