Using multiprocessing.Process with a maximum number of simultaneous processes

I have the Python code:

from multiprocessing import Process

def f(name):
    print 'hello', name

if __name__ == '__main__':
    for i in range(0, MAX_PROCESSES):
        p = Process(target=f, args=(i,))
        p.start()

which runs well. However, MAX_PROCESSES is variable and can be any value between 1 and 512. Since I'm only running this code on a machine with 8 cores, I need to find out if it is possible to limit the number of processes allowed to run at the same time. I've looked into multiprocessing.Queue, but it doesn't look like what I need - or perhaps I'm interpreting the docs incorrectly.

Is there a way to limit the number of simultaneous multiprocessing.Process instances running?

asked Jan 02 '14 by Brett

People also ask

Does multiprocessing require multiple cores?

The "multi" in multiprocessing refers to the multiple cores in a computer's central processing unit (CPU). Computers originally had only one CPU core, the unit that performs all calculations. Multiprocessing does not strictly require multiple cores, but it only yields true parallelism when several cores are available to run processes at the same time.

Does multiprocessing speed up execution?

Multiprocessing can accelerate execution time by utilizing more of your hardware or by creating a better concurrency pattern for the problem at hand.

How does multiprocessing pool work?

It works like a map-reduce architecture: it maps the inputs across the worker processes and collects the output from all of them. It waits for all the tasks to finish and then returns the output in the form of a list.
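A minimal sketch of that map-reduce pattern (in Python 3 syntax; the function and pool size here are arbitrary):

```python
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    # map the inputs across 4 worker processes and collect the results
    with Pool(processes=4) as pool:
        results = pool.map(square, range(5))
    print(results)  # [0, 1, 4, 9, 16]
```

`Pool.map` blocks until every worker has finished, so `results` is the complete output list in input order.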

What is multiprocessing in Python?

multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads.


1 Answer

It might be most sensible to use multiprocessing.Pool, which creates a pool of worker processes (by default, one per core available on your system) and then feeds tasks to them as workers become free.

The example from the standard docs (http://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers) shows that you can also set the number of worker processes manually:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)              # start 4 worker processes
    result = pool.apply_async(f, [10])    # evaluate "f(10)" asynchronously
    print result.get(timeout=1)           # prints "100" unless your computer is *very* slow
    print pool.map(f, range(10))          # prints "[0, 1, 4,..., 81]"

It's also handy to know about the multiprocessing.cpu_count() function, which reports the number of cores on a given system, if you need that in your code.
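For example (Python 3 syntax; the reported count varies by machine, so the 8 below is just an illustrative cap):

```python
import multiprocessing

# Number of cores the OS reports; handy for sizing a Pool.
n_cores = multiprocessing.cpu_count()
print(n_cores)

# e.g. never start more workers than cores, even if you have 512 tasks
pool_size = min(n_cores, 8)
```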

Edit: Here's some draft code that seems to work for your specific case:

import multiprocessing

def f(name):
    print 'hello', name

if __name__ == '__main__':
    # use all available cores; otherwise pass the number you want as an argument
    pool = multiprocessing.Pool()
    for i in xrange(0, 512):
        pool.apply_async(f, args=(i,))
    pool.close()
    pool.join()
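For reference, a Python 3 version of the same approach (print is a function and xrange is gone) might look like:

```python
import multiprocessing

def f(name):
    print('hello', name)

if __name__ == '__main__':
    # Pool() with no argument uses all available cores;
    # pass processes=N to cap the number of simultaneous workers.
    pool = multiprocessing.Pool()
    for i in range(512):
        pool.apply_async(f, args=(i,))
    pool.close()  # no further tasks will be submitted
    pool.join()   # block until every queued task has finished
```

The pool never runs more than its configured number of worker processes at once, regardless of how many tasks are submitted, which is exactly the limit the question asks for.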
answered Oct 11 '22 by treddy