
Multiprocessing objects with namedtuple - Pickling Error

I am having trouble using namedtuples in objects that I want to pass to multiprocessing. I am getting a pickling error. I tried a couple of things from other Stack Overflow posts, but without success. Here is the structure of my code:

package_main, test_module

 import myprogram.package_of_classes.data_object_module
 import ....obj_calculate

 class test(object):
       if __name__ == '__main__':
             my_obj=create_dataobj('myobject',['f1','f2'])
             input = multiprocessing.Queue()
             output = multiprocessing.Queue()
             input.put(my_obj)
             j=Process(target=obj_calculate, args=(input,output))
             j.start()

package_of_classes, data_object_module

 import collections
 import ....load_flat_file

 def get_ntuple_format(obj):
     nt_fields=''
     for fld in obj.fields:
         nt_fields=nt_fields+fld+', '
     nt_fields=nt_fields[0:-2]
     ntuple=collections.namedtuple('ntuple_format',nt_fields)
     return ntuple

 class Data_obj:
    def __init__(self, name,fields):
        self.name=name
        self.fields=fields
        self.ntuple_form=get_ntuple_format(self)  

    def calculate(self):
        self.file_read('C:/files','division.txt')

    def file_read(self,data_directory,filename):
        output=load_flat_file(data_directory,filename,self.ntuple_form)
        self.data=output

utils_package,utils_module

def create_dataobj(name,fields):
    # Build and return the object directly; assigning into locals() adds nothing.
    return Data_obj(name,fields)

def obj_calculate(input,output):   
    obj=input.get()
    obj.calculate()
    output.put(obj)

loads_module

import csv
import os

def load_flat_file(data_directory,filename,ntuple_form):
     csv.register_dialect('csvrd', delimiter='\t', quoting=csv.QUOTE_NONE)
     ListofTuples=[]
     with open(os.path.join(data_directory,filename), 'rb') as f:
          reader = csv.reader(f,'csvrd')
          for line in reader:
               if line:
                   ListofTuples.append(ntuple_form._make(line))
     return ListofTuples

And the error I am getting is:

PicklingError: Can't pickle <class '__main__.ntuple_format'>: it's not the same object as __main__.ntuple_format

P.S. As I extracted this sample code from a large project, please ignore minor inconsistencies.

asked Mar 10 '14 at 15:03 by Enes




2 Answers

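You cannot pickle a class (in this case, a named tuple) that you create dynamically (via get_ntuple_format). pickle stores a class by reference, as a module name plus a class name, and looks that name up again when loading. For a class to be picklable, it therefore has to be defined at the top level of an importable module, under the same name it was created with.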
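A minimal sketch of the failure, assuming Python 3 (the question's code is Python 2, where the message wording differs). The names make_format and row_type are hypothetical and only mirror the shape of get_ntuple_format:

```python
import collections
import pickle


def make_format(fields):
    # The class is named 'ntuple_format', but no module-level name
    # 'ntuple_format' is ever bound, so pickle cannot look it up again.
    return collections.namedtuple('ntuple_format', fields)


row_type = make_format(['f1', 'f2'])
row = row_type('a', 'b')

try:
    pickle.dumps(row)
    pickled_ok = True
except (pickle.PicklingError, AttributeError) as exc:
    # Pickling the instance requires pickling a reference to its class,
    # and that lookup by module and name fails.
    pickled_ok = False
    error_text = str(exc)

print(pickled_ok)
```

multiprocessing.Queue pickles whatever is put on it, which is why the error surfaces when the object is passed to the worker process.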

If you only have a few kinds of tuples you need to support, consider defining them all in advance, at the top level of a module, and then picking the right one dynamically. If you need a fully dynamic container format, consider just using a dict instead.
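Both suggestions could be sketched roughly as follows. This is an illustration under assumptions, not the asker's actual data model: DivisionRow, RegionRow, ROW_TYPES and rows_as_dicts are hypothetical names, and the field lists are made up.

```python
import collections
import pickle

# Defined at module top level and bound to the same name as the typename,
# so pickle can find each class again by "module.name".
DivisionRow = collections.namedtuple('DivisionRow', ['f1', 'f2'])
RegionRow = collections.namedtuple('RegionRow', ['f1', 'f2', 'f3'])

# Hypothetical registry: maps a field layout to a predefined class.
ROW_TYPES = {
    ('f1', 'f2'): DivisionRow,
    ('f1', 'f2', 'f3'): RegionRow,
}


def get_ntuple_format(fields):
    """Pick a predefined row class instead of creating one at runtime."""
    return ROW_TYPES[tuple(fields)]


def rows_as_dicts(fields, lines):
    """Fully dynamic alternative: plain dicts need no class lookup."""
    return [dict(zip(fields, line)) for line in lines]


if __name__ == '__main__':
    # Round trip of a namedtuple instance whose class is importable by name.
    row = get_ntuple_format(['f1', 'f2'])._make(['a', 'b'])
    assert pickle.loads(pickle.dumps(row)) == row
```

The registry keeps the "pick the right one dynamically" step in one place; unknown field layouts raise KeyError, which is where the dict-based fallback could be used instead.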

answered Sep 30 '22 at 15:09 by Vasiliy Faronov


I'd argue you can pickle a namedtuple, as well as a class defined in __main__.

>>> import dill as pickle
>>> import collections
>>> 
>>> thing = collections.namedtuple('thing', ['a','b'])
>>> pickle.loads(pickle.dumps(thing))
<class '__main__.thing'>

Here's the same thing, used in a class method.

>>> class Foo(object):
...   def bar(self, a, b):
...     thing = collections.namedtuple('thing', ['a','b'])     
...     thing.a = a 
...     thing.b = b
...     return thing 
... 
>>> f = Foo()
>>> q = f.bar(1,2)
>>> q.a
1
>>> q.b
2
>>> q._fields
('a', 'b')
>>> 
>>> pickle.loads(pickle.dumps(Foo.bar))
<unbound method Foo.bar>
>>> pickle.loads(pickle.dumps(f.bar))
<bound method Foo.bar of <__main__.Foo object at 0x10dbf5450>>

You just have to use dill instead of pickle.

Get dill here: https://github.com/uqfoundation

answered Sep 30 '22 at 16:09 by Mike McKerns