How can I abort a task in a multiprocessing.Pool after a timeout?

I am trying to use Python's multiprocessing package like this:

featureClass = [[1000, k, 1] for k in drange(start, end, step)]  # list of arguments
for f in featureClass:
    pool.apply_async(worker, args=f, callback=collectMyResult)
pool.close()
pool.join()

I don't want to wait for tasks in the pool that take more than 60 seconds to return a result. Is that possible?

asked Apr 07 '15 14:04

farhawa

1 Answer

Here's a way you can do this without needing to change your worker function. There are two steps required:

1. Use the maxtasksperchild option you can pass to multiprocessing.Pool to ensure the worker processes in the pool are restarted after every task execution.
2. Wrap your existing worker function in another function, which will call worker in a daemon thread, and then wait for a result from that thread for timeout seconds. Using a daemon thread is important because processes won't wait for daemon threads to finish before exiting.

If the timeout expires, you exit (or abort, it's up to you) the wrapper function, which ends the task and, because you've set maxtasksperchild=1, causes the Pool to terminate the worker process and start a new one. This means the background thread doing your real work is also aborted, because it's a daemon thread and the process it lives in has shut down.

import multiprocessing
from multiprocessing.dummy import Pool as ThreadPool
from functools import partial

def worker(x, y, z):
    pass # Do whatever here

def collectMyResult(result):
    print("Got result {}".format(result))

def abortable_worker(func, *args, **kwargs):
    timeout = kwargs.get('timeout', None)
    p = ThreadPool(1)
    res = p.apply_async(func, args=args)
    try:
        out = res.get(timeout)  # Wait timeout seconds for func to complete.
        return out
    except multiprocessing.TimeoutError:
        print("Aborting due to timeout")
        raise

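if __name__ == "__main__":
    start, end, step = 0, 10, 1  # Placeholder values; substitute your own range.
    # maxtasksperchild=1 replaces each worker process after a single task.
    pool = multiprocessing.Pool(maxtasksperchild=1)
    featureClass = [[1000, k, 1] for k in range(start, end, step)]  # list of arguments
    # Bind the real worker and its timeout (in seconds) once, outside the loop.
    abortable_func = partial(abortable_worker, worker, timeout=3)
    for f in featureClass:
        pool.apply_async(abortable_func, args=f, callback=collectMyResult)
    pool.close()
    pool.join()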

Any function that times out will raise multiprocessing.TimeoutError. Note that this means your callback won't execute when a timeout occurs. If this isn't acceptable, just change the except block of abortable_worker to return something instead of re-raising.
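
As a rough sketch of that variation, the wrapper below returns a marker value on timeout so the callback still runs. The TIMED_OUT constant and the slow_task helper are hypothetical names introduced only for this illustration; pick any picklable marker that cannot be confused with a real result of your worker.

```python
import multiprocessing
import time
from multiprocessing.dummy import Pool as ThreadPool

# Hypothetical marker value (an assumption, not part of multiprocessing).
# It must be picklable so it can travel back to the parent process.
TIMED_OUT = "__TIMED_OUT__"

def slow_task(seconds):
    # Hypothetical stand-in for a worker that takes a while.
    time.sleep(seconds)
    return seconds

def abortable_worker(func, *args, **kwargs):
    timeout = kwargs.get('timeout', None)
    p = ThreadPool(1)
    res = p.apply_async(func, args=args)
    try:
        return res.get(timeout)  # Wait timeout seconds for func to complete.
    except multiprocessing.TimeoutError:
        # Return the marker instead of re-raising, so the Pool treats the
        # task as successful and still invokes the callback.
        return TIMED_OUT

def collectMyResult(result):
    if result == TIMED_OUT:
        print("Task timed out")
    else:
        print("Got result {}".format(result))
```

The callback then has to tell the marker apart from real results, as collectMyResult does here.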

Also keep in mind that restarting worker processes after every task execution will have a negative impact on the performance of the Pool, due to the increased overhead. You should measure that for your use case and see if the trade-off is worth it to have the ability to abort the work. If it's a problem, you may need to try another approach, like cooperatively interrupting worker if it has run too long, rather than trying to kill it from the outside. There are many questions on Stack Overflow that cover this topic.
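
One possible shape for the cooperative approach, assuming the work can be split into small steps, is for the worker to check a deadline between steps and give up on its own. The names cooperative_worker and TaskTimedOut are hypothetical, and squaring each item merely stands in for one unit of real work.

```python
import time

class TaskTimedOut(Exception):
    """Hypothetical exception raised when the time budget is used up."""

def cooperative_worker(items, timeout):
    # Compute the deadline once, using a clock that never goes backwards.
    deadline = time.monotonic() + timeout
    results = []
    for item in items:
        # Check the deadline between steps; a single long step cannot be
        # interrupted this way, so keep each step short.
        if time.monotonic() > deadline:
            raise TaskTimedOut("gave up after {} seconds".format(timeout))
        results.append(item * item)  # Placeholder for one unit of real work.
    return results
```

Because the worker stops itself, the Pool does not need maxtasksperchild=1 for this to work, but the timeout is only as precise as the longest single step.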

answered Sep 18 '22 14:09

dano

Tags: python, multiprocessing, python-multiprocessing
