
Python: TypeError: Pickling an AuthenticationString object is disallowed for security reasons

I'm creating objects of a class that subclasses multiprocessing.Process and adding them to a Manager.dict(), so that each object can delete its own entry from the dictionary when its work completes.

I tried the following code:

from multiprocessing import Manager, Process

class My_class(Process):
    def __init__(self):
        super(My_class, self).__init__()
        print "Object", self, "created."

    def run(self):
        print "Object", self, "process started."


manager=Manager()
object_dict=manager.dict()

for x in range(2):
    object_dict[x]=My_class()
    object_dict[x].start()

But I got an error:

TypeError: Pickling an AuthenticationString object is disallowed for security reasons

Out of curiosity, I removed the multiprocessing part and tried this:

from multiprocessing import Manager
class My_class():
    def __init__(self):
        print "Object", self, "created."

manager=Manager()
object_dict=manager.dict()

for x in range(2):
    object_dict[x]=My_class()

This gives no errors and prints the addresses of the two objects.

What does that error mean, and how do I make it go away?

RatDon asked Mar 12 '15 10:03


2 Answers

Here is a shorter way to replicate the effect you are seeing:

from multiprocessing import Process
import pickle

p = Process()
pickle.dumps(p._config['authkey'])

TypeError: Pickling an AuthenticationString object is disallowed for security reasons

What is actually happening here is the following: p._config['authkey'] is the secret key that the Process object is assigned on creation. Although this key is nothing more than a sequence of bytes, Python uses a special subclass of bytes to represent it: AuthenticationString. This subclass differs from ordinary bytes in only one respect: it refuses to be pickled.
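To make the "only one aspect" point concrete, here is a small sketch that compares the key with a plain bytes copy of itself. The helper name pickles_ok is made up for this illustration; everything else comes from the standard library.

```python
import pickle
from multiprocessing import current_process
from multiprocessing.process import AuthenticationString

key = current_process().authkey   # secret key of the current process
plain = bytes(key)                # ordinary bytes copy with the same content


def pickles_ok(obj):
    """Hypothetical helper: True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
    except TypeError:
        return False
    return True


print(isinstance(key, AuthenticationString))  # True
print(isinstance(key, bytes))                 # True: it is a bytes subclass
print(plain == key)                           # True: same content
print(pickles_ok(plain))                      # True: plain bytes pickle fine
print(pickles_ok(key))                        # False: the subclass refuses
```

Converting the key with bytes() is the same trick that any pickling workaround has to rely on, so keep in mind that it removes the one protection the subclass provides.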

The rationale behind this choice is the following: the authkey is used to authenticate inter-process communication between parent and child processes (e.g. between the workers and the main process), and exposing it anywhere outside the initial process family could pose a security risk (because you could, in principle, impersonate a "parent process" to the worker and force it to execute arbitrary code). As pickling is the most common form of data transfer in Python, prohibiting it is a simple way of preventing an unintended exposure of the authkey.

As you cannot pickle an AuthenticationString, you also cannot pickle instances of the Process class or any of its subclasses (because all of them hold an authentication key in their _config field).
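A quick way to check this claim is to try pickling a fresh Process object directly. This is only a sketch; pickle_error is a hypothetical helper that returns the error message instead of raising.

```python
import pickle
from multiprocessing import Process


def pickle_error(obj):
    """Hypothetical helper: return the TypeError message, or None on success."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


p = Process()                   # never started, so no child process is created
print(pickle_error(p))          # the whole object fails to pickle ...
print(pickle_error(p._config))  # ... because its _config dict holds the authkey
```

Both calls should report the same AuthenticationString message as in the question.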

Now let us look at how all this relates to your code. You create a Manager object and attempt to set values in its dict. The Manager actually runs in a separate process, and whenever you assign data to manager.dict(), Python needs to transfer that data to the Manager's own process. For that transfer to happen, the data is pickled. But, as we know from the previous paragraphs, you cannot pickle Process objects and hence cannot keep them in a shared dict at all.
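One visible consequence of this transfer is that the Manager only ever holds a copy of what you assign, and reading a key gives you another copy back. A minimal sketch of that behaviour, using a plain list as the stored value (the function name demo is made up here):

```python
from multiprocessing import Manager


def demo():
    with Manager() as manager:
        shared = manager.dict()
        shared["items"] = [1, 2]       # pickled and sent to the manager process
        local_copy = shared["items"]   # a fresh copy sent back to this process
        local_copy.append(3)           # changes only the local copy
        unchanged = shared["items"]
        shared["items"] = local_copy   # reassigning sends the new value over
        updated = shared["items"]
        return unchanged, updated


if __name__ == "__main__":
    print(demo())  # ([1, 2], [1, 2, 3])
```

So even for objects that do pickle, the entry in the shared dict is not the object itself.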

In short, you are free to use manager.dict() to share any objects except those that cannot be pickled, such as Process objects.
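If the goal is only for each worker to remove its own entry when it finishes, one possible pattern is to keep the Process objects in an ordinary local dict and put only picklable values into the shared one. This is a sketch under that assumption, not the asker's original design; the names Worker, registry and main are invented for the example.

```python
from multiprocessing import Manager, Process


class Worker(Process):
    """Hypothetical worker that removes its own registry entry when done."""

    def __init__(self, key, registry):
        super().__init__()
        self.key = key
        self.registry = registry  # a dict proxy, which can be handed to a child

    def run(self):
        # ... the real work would go here ...
        self.registry.pop(self.key, None)


def main():
    with Manager() as manager:
        registry = manager.dict()  # shared: holds only picklable values
        workers = {}               # local: holds the Process objects
        for key in range(2):
            registry[key] = "running"
            workers[key] = Worker(key, registry)
            workers[key].start()
        for worker in workers.values():
            worker.join()
        return registry.copy()     # plain dict snapshot of what is left


if __name__ == "__main__":
    print(main())  # {} once both workers have removed their entries
```

The parent still has the real Process objects for join() and is_alive(), while the shared dict only tracks which keys are still active.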

KT. answered Oct 20 '22 06:10


Note: the solution below is written for Python 3 (hence print() as a function). The same issue exists in Python 3 as well.

In your specific example, we can work around the problem by pickling the AuthenticationString inside the _config dict (part of the Process object) as a plain bytes buffer, and then restoring it on unpickling as if nothing had happened. Define the __getstate__ and __setstate__ methods, which are called during pickling and unpickling, inside My_class as follows:

from multiprocessing import Manager, Process
from multiprocessing.process import AuthenticationString

class My_class(Process):
    def __init__(self):
        super(My_class, self).__init__()
        print("Object", self, "created.")

    def run(self):
        print("Object", self, "process started.")

    def __getstate__(self):
        """called when pickling - this hack allows subprocesses to 
           be spawned without the AuthenticationString raising an error"""
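        state = self.__dict__.copy()
        # Copy _config as well: the copy above is shallow, so editing it in
        # place would also change the live object's authkey.
        conf = state['_config'] = dict(state['_config'])
        if 'authkey' in conf:
            # Store the key as plain bytes, which pickle accepts.
            conf['authkey'] = bytes(conf['authkey'])
        return state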

    def __setstate__(self, state):
        """for unpickling"""
        state['_config']['authkey'] = AuthenticationString(state['_config']['authkey'])
        self.__dict__.update(state)

if __name__ == '__main__': # had to add this
    manager=Manager()
    object_dict=manager.dict()
    for x in range(2):
        object_dict[x]=My_class()
        object_dict[x].start()

I get the following output from running the code:

Object <My_class(My_class-2, initial)> created.
Object <My_class(My_class-3, initial)> created.
Object <My_class(My_class-2, started)> process started.
Object <My_class(My_class-3, started)> process started.

This appears to be the intended outcome, and if you put a time.sleep() call in run() to keep them alive a bit longer, you can see the two subprocesses running.

Alternatively, it doesn't seem to upset anything if you simply delete that _config authkey and then you don't even need to define a custom __setstate__ method.
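For reference, the "just delete it" variant could look roughly like the sketch below. It builds the pickled state from a copy of _config so that the live object keeps its key. I have not checked this under every start method, and a restored copy has no authkey at all, so anything that later reads it (for example manager proxies inside the child) may fail.

```python
import pickle
from multiprocessing import Process


class My_class(Process):
    def run(self):
        print("Object", self, "process started.")

    def __getstate__(self):
        """Drop the authkey from the pickled state only."""
        state = self.__dict__.copy()
        config = dict(state['_config'])  # copy, so self._config is untouched
        config.pop('authkey', None)
        state['_config'] = config
        return state


if __name__ == '__main__':
    original = My_class()
    restored = pickle.loads(pickle.dumps(original))
    print('authkey' in original._config)  # True
    print('authkey' in restored._config)  # False
```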

Also, note that I had to add the if __name__ == '__main__' guard. Without it, Python complained about not having finished its bootstrapping before launching subprocesses.

Finally, I just have to shrug my shoulders at this whole "security" thing. It pops up in all sorts of places (with the same type of work-around required) and doesn't provide any real security.

Dmytro Bugayev answered Oct 20 '22 06:10