 

Wait for all multiprocessing jobs to finish before continuing

I want to run a bunch of jobs in parallel and then continue once all the jobs are finished. I've got something like

# based on example code from https://pymotw.com/2/multiprocessing/basics.html
import multiprocessing
import random
import time

def worker(num):
    """A job that runs for a random amount of time between 5 and 10 seconds."""
    time.sleep(random.randrange(5, 11))
    print('Worker:' + str(num) + ' finished')
    return

if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        jobs.append(p)
        p.start()

    # Iterate through the list of jobs, removing the ones that have
    # finished, checking every second.
    while len(jobs) > 0:
        jobs = [job for job in jobs if job.is_alive()]
        time.sleep(1)

    print('*** All jobs finished ***')

it works, but I'm sure there must be a better way to wait for all the jobs to finish than iterating over them again and again until they are done.

asked Nov 26 '15 by Hybrid

People also ask

How do I stop all processes in Python multiprocessing?

You can kill all child processes by first getting a list of all active child processes via the multiprocessing.active_children() function, then calling either terminate() or kill() on each process instance.
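A minimal sketch of that approach (the 60-second sleep is just a stand-in for a long-running job):

import multiprocessing
import time

def worker():
    time.sleep(60)  # stand-in for a long-running job

if __name__ == '__main__':
    for _ in range(3):
        multiprocessing.Process(target=worker).start()

    time.sleep(1)  # give the children a moment to start
    # Terminate every child process that is still running.
    for child in multiprocessing.active_children():
        child.terminate()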

How does multiprocessing process work?

multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads.
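As a rough sketch of how the two APIs mirror each other (greet is just an example target function):

from threading import Thread
from multiprocessing import Process

def greet(name):
    print('hello', name)

if __name__ == '__main__':
    # Swap Thread for Process and the start()/join() pattern is identical,
    # but the Process runs in a separate interpreter process.
    t = Thread(target=greet, args=('thread',))
    p = Process(target=greet, args=('process',))
    t.start()
    p.start()
    t.join()
    p.join()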

How do you end a process in multiprocessing?

We can kill or terminate a process immediately by using the terminate() method. We can use this method to terminate a child process immediately, before it completes its execution.
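A minimal sketch of terminating a child before it completes, where task is just an example of a function that never returns on its own:

import multiprocessing
import time

def task():
    while True:
        time.sleep(0.1)  # never finishes on its own

if __name__ == '__main__':
    p = multiprocessing.Process(target=task)
    p.start()
    time.sleep(1)
    p.terminate()  # kill the child immediately
    p.join()       # reap the terminated process
    print(p.exitcode)  # negative on Unix: killed by a signal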

Why multi process is slow?

A multiprocessing version can be slower because it needs to reload the model on every map call, since the mapped functions are assumed to be stateless. Note that in some cases it is possible to avoid this repeated loading by using the initializer argument to multiprocessing.Pool.
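A sketch of that initializer pattern, where load_model() and model.predict() are hypothetical placeholders for an expensive load and its use:

import multiprocessing

_model = None  # per-worker global, populated once by the initializer

def init_worker():
    # Runs once in each worker process, so the expensive load
    # happens once per worker instead of once per map call.
    global _model
    _model = load_model()  # hypothetical expensive load

def predict(item):
    return _model.predict(item)  # reuse the already-loaded model

if __name__ == '__main__':
    with multiprocessing.Pool(processes=4, initializer=init_worker) as pool:
        results = pool.map(predict, range(100))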


2 Answers

What about?

for job in jobs:
    job.join()

This blocks on the first process until it finishes, then on the next one, and so on. Since the jobs all run in parallel, the loop completes once every job has finished, and the total wait is just the duration of the longest-running job. See the documentation for join() for more details.

answered Sep 29 '22 by jayant


You can make use of join(). It lets you wait for another process to end.

from multiprocessing import Process

def f(name):  # example target function
    print('hello', name)

if __name__ == '__main__':
    x = 'alice'  # placeholder argument

    t1 = Process(target=f, args=(x,))
    t2 = Process(target=f, args=('bob',))

    t1.start()
    t2.start()

    t1.join()  # blocks until t1 has finished
    t2.join()  # blocks until t2 has finished

You can also use a Barrier. It works as it does for threads, letting you specify the number of processes you want to wait on; once that number is reached, the barrier frees them all. Here, client and server are assumed to be spawned as Processes.

from multiprocessing import Barrier

# start_server, accept_connection, make_connection and the two
# process_*_connection helpers are placeholders for real networking code.

b = Barrier(2, timeout=5)

def server():
    start_server()
    b.wait()  # wait until the client is ready too
    while True:
        connection = accept_connection()
        process_server_connection(connection)

def client():
    b.wait()  # wait until the server is ready too
    while True:
        connection = make_connection()
        process_client_connection(connection)

And if you want more functionality, such as sharing data and more flow control, you can use a Manager.
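As a rough sketch of the Manager approach, reusing the same kind of numbered workers as in the question:

from multiprocessing import Manager, Process

def worker(shared, num):
    shared[num] = num * num  # each process writes into the shared dict

if __name__ == '__main__':
    with Manager() as manager:
        shared = manager.dict()  # proxy object shared across processes
        jobs = [Process(target=worker, args=(shared, i)) for i in range(5)]
        for job in jobs:
            job.start()
        for job in jobs:
            job.join()
        print(dict(shared))  # e.g. {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}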

answered Sep 29 '22 by Rbtnk