I am using Python's multiprocessing.Pool class to distribute tasks among processes.
The simple case works as expected:
from multiprocessing import Pool

def evaluate(data):
    do_something(data)

pool = Pool(processes=N)
for task in tasks:
    pool.apply_async(evaluate, (task,))
N processes are spawned, and they continually work through the tasks that I pass into apply_async. Now, I have another case where I have many complex objects, each of which needs to do computationally heavy work. I initially let each object create its own multiprocessing.Pool on demand whenever it had work to do, but I eventually ran into an OSError for having too many open files, even though I had assumed that the pools would be garbage collected after use.
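(In hindsight, a Pool keeps its worker pipes and file descriptors open until it is shut down explicitly, so the per-object pools probably needed explicit cleanup. A minimal sketch of what I believe would have avoided the OSError, with heavy_computation standing in for the real work:)

from multiprocessing import Pool

def heavy_computation(data):        # placeholder for the real work
    return data * data

class ComplexObject:
    def work(self, items):
        # Using the pool as a context manager calls terminate() on exit,
        # so the worker pipes and file descriptors are released promptly
        # instead of waiting for garbage collection.
        with Pool(processes=4) as local_pool:
            results = [local_pool.apply_async(heavy_computation, (i,)) for i in items]
            return [r.get() for r in results]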
At any rate, I decided it would be preferable anyway for each of these complex objects to share the same Pool for computations:
from multiprocessing import Pool

def evaluate(data):
    do_something(data)

pool = Pool(processes=N)

class ComplexClass:
    def work(self):
        for task in tasks:
            self.pool.apply_async(evaluate, (task,))
objects = [ComplexClass() for i in range(50)]

for complex in objects:
    complex.pool = pool

while True:
    for complex in objects:
        complex.work()
Now, when I run this on one of my computers (OS X, Python=3.4), it works just as expected. N processes are spawned, and each complex object distributes its tasks among them. However, when I ran it on another machine (a Google Cloud instance running Ubuntu, Python=3.5), it spawns an enormous number of processes (>> N) and the entire program grinds to a halt due to contention.
If I check the pool for more information:
import random
random_object = random.sample(objects, 1)
print (random_object.pool.processes)
>>> N
Everything looks correct. But it's clearly not. Any ideas what may be going on?
UPDATE
I added some additional logging. I set the pool size to 1 for simplicity. Within the pool, as a task is being completed, I print the current_process() from the multiprocessing module, as well as the pid of the task using os.getpid(). It results in something like this:
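The logging inside the task is roughly this (do_something stands in for the real work):

import os
from multiprocessing import current_process

def evaluate(data):
    # Log which worker process runs this task, and its PID.
    print('{}, PID: {}'.format(current_process(), os.getpid()))
    do_something(data)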
<ForkProcess(ForkPoolWorker-1, started daemon)>, PID: 5122
<ForkProcess(ForkPoolWorker-1, started daemon)>, PID: 5122
<ForkProcess(ForkPoolWorker-1, started daemon)>, PID: 5122
<ForkProcess(ForkPoolWorker-1, started daemon)>, PID: 5122
...
Again, looking at actual activity using htop, I'm seeing many processes (one per object sharing the multiprocessing pool) all consuming CPU cycles as this is happening, resulting in so much OS contention that progress is very slow. 5122 appears to be the parent process.
If you implement an infinite loop, then it will run like an infinite loop. Your example (which does not work at all for other reasons) ...
while True:
    for complex in objects:
        complex.work()
Even though your code above shows only snippets, you cannot expect the same results on Windows / macOS on the one hand and on Linux on the other: the former spawn processes by default, the latter forks them. If you use global variables that carry state, you will run into trouble when you develop on one platform and run on the other.
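If you want matching behaviour on both machines, you can also select the start method explicitly (available since Python 3.4); a minimal sketch:

import multiprocessing as mp

def evaluate(data):
    return data * data          # stand-in for the real computation

if __name__ == '__main__':
    # Force one start method everywhere so Linux behaves like Windows / macOS.
    mp.set_start_method('spawn')
    with mp.Pool(processes=4) as pool:
        print(pool.map(evaluate, range(10)))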
Make sure not to use stateful global variables in your processes. Pass them explicitly, or get rid of them in some other way.
Write your program with, at minimum, an if __name__ == '__main__': block. Especially when you use multiprocessing you need this. Instantiate your Pool inside that block.
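Put together, a minimal sketch of that structure, reusing the names from the question (the pool lives under the __main__ guard and is passed to each object explicitly instead of being read from a global):

from multiprocessing import Pool

def evaluate(data):
    return data * data              # stand-in for the real computation

class ComplexClass:
    def __init__(self, pool):
        self.pool = pool            # shared pool, passed in explicitly

    def work(self, tasks):
        return [self.pool.apply_async(evaluate, (t,)) for t in tasks]

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        objects = [ComplexClass(pool) for _ in range(50)]
        handles = [h for obj in objects for h in obj.work(range(10))]
        print(sum(h.get() for h in handles))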