python multiprocessing pool retries

Is there a way to re-send a piece of data for processing, if the original computation failed, using a simple pool?

import random
from multiprocessing import Pool

def f(x):
    if random.getrandbits(1):
        raise ValueError("Retry this computation")
    return x*x

p = Pool(5)
# If one of these f(x) calls fails, retry it with another (or same) process
p.map(f, [1,2,3])
asked Jul 18 '12 by atp


1 Answer

You can use a Queue to feed failures back into the Pool, via a loop in the initiating Process:

import multiprocessing as mp
import random

def f(x):
    if random.getrandbits(1):
        # on failure / exception catch
        f.q.put(x)
        return None
    return x*x

def f_init(q):
    f.q = q

def main(pending):
    total_items = len(pending)
    successful = []
    failure_tracker = []

    q = mp.Queue()
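    # The initializer f_init runs once in each worker, attaching the shared
    # queue to f; processes=None lets Pool default to cpu_count() workers.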
    p = mp.Pool(None, f_init, [q])
    results = p.imap(f, pending)
    retry_results = []
    while len(successful) < total_items:
        successful.extend([r for r in results if r is not None])
        successful.extend([r for r in retry_results if r is not None])
        failed_items = []
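        # Drain inputs that workers reported as failed; if none have arrived
        # yet, the outer loop simply polls again until every item succeeds.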
        while not q.empty():
            failed_items.append(q.get())
        if failed_items:
            failure_tracker.append(failed_items)
            retry_results = p.imap(f, failed_items)
    p.close()
    p.join()

    print "Results: %s" % successful
    print "Failures: %s" % failure_tracker

if __name__ == '__main__':
    main(range(1, 10))

The output is like this:

Results: [1, 4, 36, 49, 25, 81, 16, 64, 9]
Failures: [[3, 4, 5, 8, 9], [3, 8, 4], [8, 3], []]

A Pool can't be shared between multiple processes, hence this Queue-based approach. If you try to pass a pool as a parameter to the pool's own processes, you will get this error:

NotImplementedError: pool objects cannot be passed between processes or pickled

You could alternatively attempt a few immediate retries within your function f, to avoid the synchronisation overhead. It really is a matter of how long your function should wait before retrying, and of how likely a retry is to succeed if attempted immediately.
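For instance, a minimal sketch of such in-function retries (the retries and delay parameters are illustrative assumptions, not part of the original answer):

import random
import time

def f_with_retries(x, retries=3, delay=0.1):
    # Attempt the computation up to `retries` times before giving up.
    # retries and delay are illustrative parameters.
    for _ in range(retries):
        try:
            if random.getrandbits(1):
                raise ValueError("Retry this computation")
            return x*x
        except ValueError:
            time.sleep(delay)  # brief pause before the next attempt
    return None  # signal permanent failure after exhausting all retries

Because the extra parameters have defaults, f_with_retries can be passed straight to p.map. This keeps all retries inside the worker process, so no queue or extra loop in the parent is needed; the trade-off is that a persistently failing input occupies its worker for up to retries * delay seconds.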


Old Answer: For the sake of completeness, here is my old answer, which isn't as efficient as resubmitting directly into the pool, but might still be relevant depending on the use case, because it provides a natural way to deal with (and limit) n levels of retries:

You can use a Queue to aggregate failures and resubmit at the end of each run, over multiple runs:

import multiprocessing as mp
import random


def f(x):
    if random.getrandbits(1):
        # on failure / exception catch
        f.q.put(x)
        return None
    return x*x

def f_init(q):
    f.q = q

def main(pending):
    run_number = 1
    while pending:
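        # Each pass of this loop is one retry round: rerun everything
        # that failed in the previous round with a fresh pool.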
        jobs = pending
        pending = []

        q = mp.Queue()
        p = mp.Pool(None, f_init, [q])
        results = p.imap(f, jobs)
        p.close()
        p.join()
        failed_items = []
        while not q.empty():
            failed_items.append(q.get())
        successful = [r for r in results if r is not None]
        print("(%d) Succeeded: %s" % (run_number, successful))
        print("(%d) Failed:    %s" % (run_number, failed_items))
        print()
        pending = failed_items
        run_number += 1

if __name__ == '__main__':
    main(range(1, 10))

with output like this:

(1) Succeeded: [9, 16, 36, 81]
(1) Failed:    [2, 1, 5, 7, 8]

(2) Succeeded: [64]
(2) Failed:    [2, 1, 5, 7]

(3) Succeeded: [1, 25]
(3) Failed:    [2, 7]

(4) Succeeded: [49]
(4) Failed:    [2]

(5) Succeeded: [4]
(5) Failed:    []
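If you want to cap how many retry rounds are attempted, here is a minimal sketch of a bounded variant of the loop above (the max_runs parameter and the give-up message are illustrative additions, not part of the original answer):

import multiprocessing as mp
import random

def f(x):
    if random.getrandbits(1):
        f.q.put(x)  # record the failed input for the next round
        return None
    return x*x

def f_init(q):
    f.q = q

def main(pending, max_runs=3):
    run_number = 1
    while pending and run_number <= max_runs:  # bound the retry rounds
        q = mp.Queue()
        p = mp.Pool(None, f_init, [q])
        results = list(p.imap(f, pending))
        p.close()
        p.join()

        failed_items = []
        while not q.empty():
            failed_items.append(q.get())
        successful = [r for r in results if r is not None]
        print("(%d) Succeeded: %s" % (run_number, successful))
        pending = failed_items
        run_number += 1
    if pending:
        print("Giving up on: %s" % list(pending))

if __name__ == '__main__':
    main(range(1, 10))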
answered Sep 28 '22 by Preet Kukreti