Persistent Processes Post Python Pool

Tags: python, pool

I have a Python program that takes around 10 minutes to execute. So I use Pool from multiprocessing to speed things up:

from multiprocessing import Pool
p = Pool(processes=6)  # I have an 8-thread processor
results = p.map(function, argument_list)  # distributes work over 6 processes!

It runs much quicker, just from that. God bless Python! And so I thought that would be it.

However I've noticed that each time I do this, the processes and their considerably sized state remain, even when p has gone out of scope; effectively, I've created a memory leak. The processes show up in my System Monitor application as Python processes, which use no CPU at this point, but considerable memory to maintain their state.

Pool has functions close, terminate, and join, and I'd assume one of these will kill the processes. Does anyone know which is the best way to tell my pool p that I am finished with it?

Thanks a lot for your help!

asked Jan 16 '12 19:01 by user


1 Answer

From the Python docs, it looks like you need to do:

p.close()
p.join()

after the map() call. close() stops the pool from accepting new tasks, so the workers exit once the submitted work is finished; join() then waits for the worker processes to exit.
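For illustration, here is a minimal sketch of how that might look wrapped in a helper. The names run_in_pool and square are hypothetical stand-ins (square plays the role of the question's function), and the try/finally is my own addition so the pool is cleaned up even if map() raises:

```python
from multiprocessing import Pool


def square(x):
    # Hypothetical stand-in for the question's `function`.
    return x * x


def run_in_pool(func, argument_list, processes=6):
    # Hypothetical helper: run func over argument_list, then shut the pool down.
    p = Pool(processes=processes)
    try:
        # map() blocks until every result is available.
        results = p.map(func, argument_list)
    finally:
        p.close()  # no more tasks will be submitted; workers exit when done
        p.join()   # wait for the worker processes to exit
    return results


if __name__ == "__main__":
    print(run_in_pool(square, range(10)))
```

On Python 3.3 and later a Pool can also be used in a with statement, but as far as I can tell from the docs, leaving the block calls terminate() rather than close() and join(), so it is only a safe substitute when all results have already been collected (as they are after a blocking map()).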

answered Nov 15 '22 10:11 by dwelch91