 

How to create global lock/semaphore with multiprocessing.pool in Python?


I want to limit access to a resource in child processes, for example to cap the number of concurrent HTTP downloads or the amount of disk I/O. How can I achieve this by extending the basic code below?

Please share some basic code examples.

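    import multiprocessing

    # job_queue, do_job and callback are defined elsewhere
    pool = multiprocessing.Pool(multiprocessing.cpu_count())
    while job_queue.is_jobs_for_processing():
        for job in job_queue.pull_jobs_for_processing():
            # pass the job itself to the worker function
            pool.apply_async(do_job, args=(job,), callback=callback)
    pool.close()
    pool.join()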
asked Feb 22 '15 23:02 by Chameleon


People also ask

How lock in multiprocessing Python?

Python provides a mutual exclusion lock for use with processes via the multiprocessing.Lock class. A process acquires the lock before entering a critical section and releases it afterwards. Only one process can hold the lock at any time.

What is semaphore in Python multiprocessing?

A semaphore manages an internal counter which is decremented by each acquire() call and incremented by each release() call. The counter can never go below zero; when acquire() finds that it is zero, it blocks, waiting until some other thread or process calls release(). multiprocessing.Semaphore provides this behaviour for processes.
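The counter behaviour described above can be observed in a single process. This is only a minimal sketch: it uses non-blocking acquire() calls so that the third attempt returns False instead of waiting, and the limit of two slots is an arbitrary choice for illustration.

```python
from multiprocessing import Semaphore

# Two slots: at most two holders at any one time (arbitrary limit).
sem = Semaphore(2)

first = sem.acquire(block=False)   # counter 2 -> 1
second = sem.acquire(block=False)  # counter 1 -> 0
third = sem.acquire(block=False)   # counter is 0, so this fails instead of blocking

sem.release()                      # counter 0 -> 1
fourth = sem.acquire(block=False)  # a slot is free again

print(first, second, third, fourth)  # True True False True
```

With the default block=True, the third call would wait until another holder called release().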

What is pool in multiprocessing Python?

The Pool class in multiprocessing manages a fixed set of worker processes and distributes submitted tasks among them. Tasks wait in a queue until a worker is free, so a small number of processes can work through a large number of jobs. The Process class, by contrast, starts one new process per target function.

How does Python multiprocess work?

The multiprocessing Python module contains two classes capable of handling tasks. The Process class runs a single target function in a new process, and the Pool class distributes sets of tasks across a group of worker processes. The operating system decides which CPU core each process runs on.


1 Answer

Use the initializer and initargs arguments when creating a pool so as to define a global in all the child processes.

For instance:

    from multiprocessing import Pool, Lock
    from time import sleep

    def do_job(i):
        "The greater i is, the shorter the function waits before returning."
        with lock:
            sleep(1-(i/10.))
            return i

    def init_child(lock_):
        global lock
        lock = lock_

    def main():
        lock = Lock()
        poolsize = 4
        with Pool(poolsize, initializer=init_child, initargs=(lock,)) as pool:
            results = pool.imap_unordered(do_job, range(poolsize))
            print(list(results))

    if __name__ == "__main__":
        main()

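This code will normally print the numbers 0-3 in ascending order (the order in which the jobs were submitted), because the lock forces the jobs to run one at a time. Remove the with lock: line (and dedent the two lines under it) to see the numbers printed in descending order: without the lock all four jobs sleep concurrently, and the job with the largest i finishes first.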
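The question also asks about a semaphore, which allows up to N processes at once rather than just one. The same initializer pattern should carry over; the following is a minimal sketch, not a tested production recipe. The limit MAX_CONCURRENT, the pool size and the sleep that stands in for a download are all assumptions made for illustration.

```python
from multiprocessing import Pool, Semaphore
from time import sleep

MAX_CONCURRENT = 2  # assumed limit, e.g. at most two simultaneous downloads


def init_child(semaphore_):
    """Runs once in every worker; stores the shared semaphore in a global."""
    global semaphore
    semaphore = semaphore_


def do_job(i):
    """Hypothetical job: the sleep stands in for an HTTP download or disk I/O."""
    with semaphore:  # waits while MAX_CONCURRENT workers already hold a slot
        sleep(0.1)
        return i * i


def main():
    semaphore = Semaphore(MAX_CONCURRENT)
    with Pool(4, initializer=init_child, initargs=(semaphore,)) as pool:
        results = pool.imap_unordered(do_job, range(8))
        print(sorted(results))


if __name__ == "__main__":
    main()
```

Only the part of the job that touches the limited resource needs to sit inside the with semaphore: block; the rest of the job can run on all four workers at once.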

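This solution works on both Windows and Unix. However, because processes can be created by forking on Unix systems, there you only need to declare the lock as a global variable at module scope. The forked child gets a copy of the parent's memory, which includes the lock object, and that copy still refers to the same underlying lock. Thus the initializer isn't strictly needed, but it can help document how the code is intended to work. Note that this relies on the fork start method: with spawn (the only method on Windows, and the default on macOS since Python 3.8) each child re-imports the module and creates its own, unrelated lock. When multiprocessing is able to create processes by forking, the following also works.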

    from multiprocessing import Pool, Lock
    from time import sleep

    lock = Lock()

    def do_job(i):
        "The greater i is, the shorter the function waits before returning."
        with lock:
            sleep(1-(i/10.))
            return i

    def main():
        poolsize = 4
        with Pool(poolsize) as pool:
            results = pool.imap_unordered(do_job, range(poolsize))
            print(list(results))

    if __name__ == "__main__":
        main()
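If it is unclear which start method a platform uses, it can be queried, and a specific one can be requested through a context object. The sketch below is one way to do that under the assumption that the initializer approach is used; run_with is a hypothetical helper name, and the pool size and sleep time are arbitrary. The lock is created from the same context as the pool so that the two are compatible.

```python
import multiprocessing as mp
from time import sleep


def init_child(lock_):
    """Runs once in every worker; stores the shared lock in a global."""
    global lock
    lock = lock_


def do_job(i):
    with lock:
        sleep(0.05)
        return i


def run_with(method):
    """Hypothetical helper: run four jobs under the given start method."""
    ctx = mp.get_context(method)
    shared_lock = ctx.Lock()  # created from the same context as the pool
    with ctx.Pool(2, initializer=init_child, initargs=(shared_lock,)) as pool:
        return sorted(pool.imap_unordered(do_job, range(4)))


if __name__ == "__main__":
    print("default start method:", mp.get_start_method())
    print("spawn:", run_with("spawn"))
```

Running the initializer version under "spawn" on a Unix machine is a quick way to check that code will also behave on Windows.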
answered Sep 20 '22 10:09 by Dunes