
Custom exceptions are not raised properly when used in Multiprocessing Pool

Question

I am observing behavior in Python 3.3.4 that I would like help understanding: Why are my exceptions properly raised when a function is executed normally, but not when the function is executed in a pool of workers?

Code

import multiprocessing

class AllModuleExceptions(Exception):
    """Base class for library exceptions"""
    pass

class ModuleException_1(AllModuleExceptions):
    def __init__(self, message1):
        super(ModuleException_1, self).__init__()
        self.e_string = "Message: {}".format(message1)
        return

class ModuleException_2(AllModuleExceptions):
    def __init__(self, message2):
        super(ModuleException_2, self).__init__()
        self.e_string = "Message: {}".format(message2)
        return

def func_that_raises_exception(arg1, arg2):
    result = arg1 + arg2
    raise ModuleException_1("Something bad happened")

def func(arg1, arg2):

    try:
        result = func_that_raises_exception(arg1, arg2)

    except ModuleException_1:
        raise ModuleException_2("We need to halt main") from None

    return result

pool = multiprocessing.Pool(2)
results = pool.starmap(func, [(1,2), (3,4)])

pool.close()
pool.join()

print(results)

This code produces this error:

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/user/peteoss/encap/Python-3.4.2/lib/python3.4/threading.py", line 921, in _bootstrap_inner
    self.run()
  File "/user/peteoss/encap/Python-3.4.2/lib/python3.4/threading.py", line 869, in run
    self._target(*self._args, **self._kwargs)
  File "/user/peteoss/encap/Python-3.4.2/lib/python3.4/multiprocessing/pool.py", line 420, in _handle_results
    task = get()
  File "/user/peteoss/encap/Python-3.4.2/lib/python3.4/multiprocessing/connection.py", line 251, in recv
    return ForkingPickler.loads(buf.getbuffer())
TypeError: __init__() missing 1 required positional argument: 'message2'

Conversely, if I simply call the function, it seems to handle the exception properly:

print(func(1, 2))

Produces:

Traceback (most recent call last):
  File "exceptions.py", line 40, in <module>
    print(func(1, 2))
  File "exceptions.py", line 30, in func
    raise ModuleException_2("We need to halt main") from None
__main__.ModuleException_2

Why does ModuleException_2 behave differently when it is run in a process pool?

skrrgwasme asked Jan 16 '15 22:01



1 Answer

The issue is that your exception classes take required arguments in their __init__ methods, but you don't pass those arguments along when you call the superclass __init__ method. This causes a new exception when your exception instances are unpickled by the multiprocessing code.

This has been a long-standing issue with Python exceptions, and you can read quite a bit of the history of the issue in this bug report (in which a part of the underlying issue with pickling exceptions was fixed, but not the part you're hitting).

To summarize the issue: Python's base Exception class puts all the arguments its __init__ method receives into an attribute named args. Those arguments are put into the pickle data, and when the stream is unpickled they're passed to the __init__ method of the newly created object. If the number of arguments received by Exception.__init__ is not the same as the child class expects, you'll get an error at unpickling time.
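
The mechanism described above can be reproduced without a pool by using pickle directly. This is a minimal sketch: BrokenError is a hypothetical stand-in for the exception classes in the question, not code from the original post, and the exact wording of the TypeError message may vary between Python versions.

```python
import pickle


class BrokenError(Exception):
    """Hypothetical exception that drops its argument, as in the question."""

    def __init__(self, message):
        super().__init__()  # nothing is forwarded, so self.args becomes ()
        self.e_string = "Message: {}".format(message)


err = BrokenError("Something bad happened")
print(err.args)  # the tuple that will be stored in the pickle data

payload = pickle.dumps(err)  # pickling itself succeeds

try:
    pickle.loads(payload)  # unpickling calls BrokenError(*err.args)
    failure = None
except TypeError as exc:
    failure = exc

print(type(failure).__name__, failure)
```

Pickling works, but unpickling fails because the class is called with the empty args tuple, which is the same TypeError the pool's result-handling thread reports.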

A workaround for the issue is to pass all the arguments your custom exception classes require in their __init__ methods to the superclass __init__:

class ModuleException_2(AllModuleExceptions):
    def __init__(self, message2):
        super(ModuleException_2, self).__init__(message2) # the change is here!
        self.e_string = "Message: {}".format(message2)
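
One way to check this workaround without starting a pool is a pickle round trip, which is roughly what multiprocessing does when it sends an exception back to the parent process. This is only a sketch: FixedError is a hypothetical name used for illustration, with the same shape as the corrected class above.

```python
import pickle


class FixedError(Exception):
    """Hypothetical exception that forwards its argument to Exception."""

    def __init__(self, message):
        super().__init__(message)  # args becomes (message,)
        self.e_string = "Message: {}".format(message)


original = FixedError("We need to halt main")

# Round-trip through pickle; unpickling calls FixedError(*original.args).
restored = pickle.loads(pickle.dumps(original))

print(restored.args)
print(restored.e_string)
```

Because args now holds exactly the arguments that __init__ requires, the reconstructed instance carries the same message and the same e_string attribute as the original.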

Another possible fix would be to not call the superclass __init__ method at all (this is what the fix in the bug linked above allows), but since that's usually poor behavior for a subclass, I can't really recommend it.
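
For comparison, here is a sketch of that second option. NoSuperInitError is a hypothetical class, and the behavior shown assumes a CPython 3 version that includes the fix mentioned above, where the constructor arguments are recorded in args even when the superclass __init__ is never called.

```python
import pickle


class NoSuperInitError(Exception):
    """Hypothetical exception that never calls the superclass __init__."""

    def __init__(self, message):
        # Deliberately no super().__init__() call here.
        self.e_string = "Message: {}".format(message)


err = NoSuperInitError("We need to halt main")
print(err.args)  # args is still populated when the instance is created

# The round trip works because args matches what __init__ expects.
restored = pickle.loads(pickle.dumps(err))
print(restored.e_string)
```

This survives pickling, but it relies on skipping the base class initializer, which is the part that makes it hard to recommend.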

Blckknght answered Sep 26 '22 04:09