I don't understand exactly how the __reduce__ method works with the pickle module in Python.
Suppose that I have the following class:
import pickle

class Foo(object):
    def __init__(self, file_name='file.txt'):
        self.file_name = file_name
        self.f = open(self.file_name, 'w')
It can't be pickled because the pickle module doesn't know how to encode a file handle:
foo = Foo()
print(pickle.dumps(foo))
Output:
TypeError: can't pickle file objects
But if I add a __reduce__ method, the object is encoded successfully:
import pickle

class Foo(object):
    def __init__(self, file_name='file.txt'):
        self.file_name = file_name
        self.f = open(self.file_name, 'w')

    def __reduce__(self):
        return (self.__class__, (self.file_name,))

foo = Foo()
print(pickle.dumps(foo))
Output:
c__main__
Foo
p0
(S'file.txt'
p1
tp2
Rp3
.
Am I right that the __reduce__ method simply returns "instructions" that tell the unpickler how to re-create the original object when the pickle.dumps call would otherwise fail?
It is unclear to me from the documentation.
Whenever an object is pickled, the __reduce__ method defined by it gets called. This method returns either a string, which may represent the name of a Python global, or a tuple describing how to reconstruct this object when unpickling.
As mentioned earlier, the dump() and load() functions of the pickle module pickle and unpickle Python data.
To use pickle, start by importing it. To pickle this dictionary, you first need to specify the name of the file you will write it to, which is dogs in this case. Note that the file does not have an extension. To open the file for writing, use the open() function; because pickle produces binary data, open the file in binary mode ('wb').
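The steps described above could look roughly like the sketch below. The dictionary contents and the variable names are assumptions made for illustration (the original dictionary is not shown), and the file is written to a temporary directory so that nothing is left in the working directory.

```python
import os
import pickle
import tempfile

# Hypothetical contents; the dictionary from the original text is not shown.
dogs_dict = {"Ozzy": 3, "Filou": 8}

# A file named "dogs" with no extension, placed in a temporary directory.
filename = os.path.join(tempfile.mkdtemp(), "dogs")

# pickle writes bytes, so the file must be opened in binary mode.
with open(filename, "wb") as outfile:
    pickle.dump(dogs_dict, outfile)

# Read it back, again in binary mode.
with open(filename, "rb") as infile:
    loaded = pickle.load(infile)

print(loaded == dogs_dict)  # True
```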
You can use the loads() function to unpickle an object that was pickled to a string with dumps(), instead of being stored on disk via dump(). In the following example, the car_list object that was pickled to a string is unpickled via loads().
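A minimal sketch of that round trip follows. The contents of car_list are an assumption, since the original example is not shown. In Python 3 the "string" returned by dumps() is a bytes object.

```python
import pickle

# Hypothetical data; the original car_list contents are not shown.
car_list = ["Honda", "Toyota", "Ford"]

pickled = pickle.dumps(car_list)   # serialize to an in-memory bytes object
restored = pickle.loads(pickled)   # rebuild an equal, but distinct, list

print(isinstance(pickled, bytes))  # True
print(restored == car_list)        # True
print(restored is car_list)        # False
```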
Let’s look at some examples of using the pickle module in Python. Since a file is just a sequence of bytes, we can store a Python object in a file by serializing it to bytes with the pickle module. This is called pickling. Let us look at how we could do this through an example.
Developers should also know that the pickle format is versioned: newer Python releases have added newer pickle protocols. This means that an object pickled with one version of Python might not be unpicklable with an earlier version that does not support the protocol used.
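One way to reduce that risk is to choose the protocol explicitly when pickling. The sketch below assumes that protocol 2 is old enough for the Python versions you care about; the data is made up for illustration.

```python
import pickle

data = {"answer": 42}  # hypothetical payload

# Protocol 2 is understood by every Python 3 release.
portable = pickle.dumps(data, protocol=2)

# The newest protocol supported by the running interpreter; older
# interpreters may not be able to read this.
newest = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# From protocol 2 onward, a pickle starts with the PROTO opcode (0x80)
# followed by the protocol number.
print(portable[:2] == b"\x80\x02")  # True
print(pickle.loads(portable) == pickle.loads(newest) == data)  # True
```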
The pickle module, by contrast, is useful when dealing with instances of classes, both built-in and user-defined. It makes it easy to store objects in “.dat” files without delving into the details: the programmer doesn’t have to know how many bytes to allocate for each object or search for an object byte by byte.
Pickle won't know how to handle the object and will raise an error. You can tell the pickle module how to handle these types of objects directly within the class. Let's see an example of an object which has a single property, an open file handle:
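The example that originally followed is not available, so here is one possible sketch. It assumes the __getstate__/__setstate__ hooks, which are an alternative to __reduce__, and the class name LogWriter is hypothetical. The file handle is dropped when pickling and reopened when unpickling.

```python
import os
import pickle
import tempfile

class LogWriter:
    """Hypothetical class whose only awkward attribute is an open file."""

    def __init__(self, file_name):
        self.file_name = file_name
        self.f = open(self.file_name, "a")

    def __getstate__(self):
        # Copy the instance dict and leave out the unpicklable handle.
        state = self.__dict__.copy()
        del state["f"]
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then reopen the file.
        self.__dict__.update(state)
        self.f = open(self.file_name, "a")

path = os.path.join(tempfile.mkdtemp(), "log.txt")
writer = LogWriter(path)
clone = pickle.loads(pickle.dumps(writer))

print(clone.file_name == writer.file_name)  # True
print(clone.f is writer.f)                  # False

writer.f.close()
clone.f.close()
```

Note that __init__ is not called during unpickling with this approach; only __setstate__ runs.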
You're right. The __reduce__ method should return hints on how to reconstruct (unpickle) the object in case it cannot be pickled automatically. The return value may contain a callable and the arguments with which it will be called to create an initial version of the object, the object's state, etc.
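To make this concrete, here is a sketch based on the Foo class from the question. The init_calls counter and the temporary file path are additions for demonstration only. It shows that unpickling simply calls Foo(file_name) again, so __init__ runs a second time and opens a new handle.

```python
import os
import pickle
import tempfile

class Foo:
    init_calls = 0  # counter added only for this demonstration

    def __init__(self, file_name="file.txt"):
        Foo.init_calls += 1
        self.file_name = file_name
        self.f = open(self.file_name, "w")

    def __reduce__(self):
        # (callable, args): the unpickler will call Foo(self.file_name).
        return (self.__class__, (self.file_name,))

path = os.path.join(tempfile.mkdtemp(), "file.txt")
foo = Foo(path)
copy = pickle.loads(pickle.dumps(foo))

print(foo.__reduce__() == (Foo, (path,)))  # True
print(Foo.init_calls)                      # 2
print(copy.f is foo.f)                     # False

foo.f.close()
copy.f.close()
```

Because the constructor opens the file with mode 'w', re-creating the object this way truncates the file again, which may or may not be what you want.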
From the documentation:
If a string is returned, the string should be interpreted as the name of a global variable. It should be the object’s local name relative to its module; the pickle module searches the module namespace to determine the object’s module. This behaviour is typically useful for singletons.
When a tuple is returned, it must be between two and five items long. Optional items can either be omitted, or None can be provided as their value. The semantics of each item are in order:
- A callable object that will be called to create the initial version of the object.
- A tuple of arguments for the callable object. An empty tuple must be given if the callable does not accept any argument.
- Optionally, the object’s state, which will be passed to the object’s __setstate__() method as previously described. If the object has no such method then, the value must be a dictionary and it will be added to the object’s __dict__ attribute.
- Optionally, an iterator (and not a sequence) yielding successive items. These items will be appended to the object either using obj.append(item) or, in batch, using obj.extend(list_of_items). This is primarily used for list subclasses, but may be used by other classes as long as they have append() and extend() methods with the appropriate signature. (Whether append() or extend() is used depends on which pickle protocol version is used as well as the number of items to append, so both must be supported.)
- Optionally, an iterator (not a sequence) yielding successive key-value pairs. These items will be stored to the object using obj[key] = value. This is primarily used for dictionary subclasses, but may be used by other classes as long as they implement __setitem__().
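The optional tuple items can be sketched as follows. TaggedList and TaggedDict are hypothetical classes invented for this illustration; the first uses the state and list-item slots, and the second passes None for the unused slots and supplies only the key-value iterator.

```python
import pickle

class TaggedList(list):
    """Hypothetical list subclass carrying one extra attribute."""

    def __init__(self, tag=None):
        super().__init__()
        self.tag = tag

    def __reduce__(self):
        return (
            self.__class__,     # 1. callable that creates the object
            (),                 # 2. arguments for the callable
            {"tag": self.tag},  # 3. state, merged into __dict__
            iter(self),         # 4. iterator of items to append/extend
        )

class TaggedDict(dict):
    """Hypothetical dict subclass using only the fifth item."""

    def __reduce__(self):
        # Unused optional slots are filled with None.
        return (self.__class__, (), None, None, iter(self.items()))

fruits = TaggedList("fruit")
fruits.extend(["apple", "pear"])
fruits_copy = pickle.loads(pickle.dumps(fruits))
print(list(fruits_copy), fruits_copy.tag)  # ['apple', 'pear'] fruit

prices = TaggedDict(apple=1, pear=2)
prices_copy = pickle.loads(pickle.dumps(prices))
print(dict(prices_copy))  # {'apple': 1, 'pear': 2}
```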