I have custom list and dictionary classes that no longer work while unpickling in Python 3.7.
import pickle

class A(dict):
    pass

class MyList(list):
    def __init__(self, iterable=None, option=A):
        self.option = option
        if iterable:
            for x in iterable:
                self.append(x)

    def append(self, obj):
        if isinstance(obj, dict):
            obj = self.option(obj)
        super(MyList, self).append(obj)

    def extend(self, iterable):
        for item in iterable:
            self.append(item)

if __name__ == '__main__':
    pickle_file = 'test_pickle'
    my_list = MyList([{'a': 1}])
    pickle.dump(my_list, open(pickle_file, 'wb'))
    loaded = pickle.load(open(pickle_file, 'rb'))
    print(isinstance(loaded[0], A))
Works fine on Python 2.6 through 3.6:
"C:\Program Files\Python36\python.exe" issue.py
True
But self.option is no longer set properly in 3.7:
"C:\Program Files\Python37\python.exe" issue.py
Traceback (most recent call last):
  File "issue.py", line 28, in <module>
    loaded = pickle.load(open(pickle_file, 'rb'))
  File "issue.py", line 21, in extend
    self.append(item)
  File "issue.py", line 16, in append
    obj = self.option(obj)
AttributeError: 'MyList' object has no attribute 'option'
If I remove the extend function, it works as expected.
I have tried adding __setstate__ as well, but it is not called before extend, so option is still undefined at that point.
I do have to inherit directly from dict and list, and I do need to override both the append and extend functions in my code. Is there a way to set option beforehand, or is there another fix? Is this change in behavior documented, along with the rationale for it?
Thank you for your time
Unpickling list objects switched from using list.append() to list.extend(), because that can be way faster for some list subclasses.
However, with that change, the way that the unpickling code tested for list objects also changed, from
if (PyList_Check(list)) {
to
if (PyList_CheckExact(list)) {
It is that change that affects your code. The test above guards a fast path: if we have a list object, use PyList_SetSlice() to load the data, rather than the slower path of explicitly calling either the .extend() or .append() method on the new instance. The old version (Python 3.6 and older) accepted list and its subclasses; the new version only accepts list itself, not subclasses!
So for Python 3.6 and older, your custom MyList.append() method is not called when unpickling, purely because you subclassed list. In Python 3.7, your custom MyList.extend() method is called when unpickling. This is very much intentional; subclasses should be allowed to provide a custom .extend() method that gets to be called when unpickling.
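To see the ordering for yourself, here is a minimal sketch that records whether instance state is available at the moment extend() runs during unpickling. The Probe class and its tag attribute are made up for illustration, and the behaviour shown assumes CPython 3.7 or newer; on older versions extend() is simply not called for list subclasses.

```python
import pickle


class Probe(list):
    """Hypothetical list subclass, used only to observe unpickling order."""

    # One entry per extend() call: was the instance __dict__ populated yet?
    state_seen = []

    def extend(self, iterable):
        Probe.state_seen.append('tag' in self.__dict__)
        super().extend(iterable)


original = Probe([1, 2, 3])
original.tag = 'demo'        # instance state, stored in __dict__

Probe.state_seen.clear()     # only record what happens while unpickling
restored = pickle.loads(pickle.dumps(original))

# The items are loaded first; the instance __dict__ is restored afterwards.
print(Probe.state_seen)      # expected on Python 3.7+: [False]
print(restored, restored.tag)
```

The state does arrive eventually (restored.tag is set once loading finishes), it is just not there yet while the items are being added.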
And the work-around is simple. Your data is already wrapped when unpickling, so you don't need to re-apply that wrapper. When you do not have self.option set, simply skip applying it:
def append(self, obj):
    if isinstance(obj, dict):
        try:
            obj = self.option(obj)
        except AttributeError:
            # something's wrong, are we unpickling on Python 3.7 or newer?
            if 'option' in self.__dict__:
                # no, we are not, because 'option' has been set, this must
                # be an error in the option() call, so re-raise
                raise
            # yes, we are, just ignore this, obj is already wrapped
    super(MyList, self).append(obj)
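Dropped into the class from the question, the round trip can be checked as follows. This is only a sketch: it pickles to an in-memory bytes object with pickle.dumps() rather than to a file, purely to stay self-contained, and it uses a two-item list as sample data.

```python
import pickle


class A(dict):
    pass


class MyList(list):
    def __init__(self, iterable=None, option=A):
        self.option = option
        if iterable:
            for x in iterable:
                self.append(x)

    def append(self, obj):
        if isinstance(obj, dict):
            try:
                obj = self.option(obj)
            except AttributeError:
                if 'option' in self.__dict__:
                    # option is set, so the error came from the option() call
                    raise
                # option not restored yet: we are unpickling, and obj is
                # already wrapped, so leave it alone
        super(MyList, self).append(obj)

    def extend(self, iterable):
        for item in iterable:
            self.append(item)


my_list = MyList([{'a': 1}, {'b': 2}])
loaded = pickle.loads(pickle.dumps(my_list))

print(all(isinstance(item, A) for item in loaded))  # True
print(loaded.option is A)                           # True
```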
This all does mean you can't rely on any instance attributes having been restored yet. If that's a big problem (you still need to consult instance state while unpickling), then you'll have to provide a different __reduce_ex__ method, one that doesn't return the data as an iterator in index 3 of the resulting tuple. list().__reduce_ex__() for protocol versions 2, 3 and 4 returns, in effect, (copyreg.__newobj__, (type(self),), self.__dict__, iter(self), None).
A custom version would have to use (type(self), (tuple(self), self.option), None, None, None), for example. That does come with some additional overhead (that tuple(self) there will take additional memory when pickling and unpickling).
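As a rough sketch of that second approach, using the classes from the question with the original, unguarded append(): the reduce tuple tells pickle to rebuild the object by calling MyList(items, option), so __init__ sets self.option before anything is appended. Treat it as an illustration under those assumptions, not a drop-in implementation.

```python
import pickle


class A(dict):
    pass


class MyList(list):
    def __init__(self, iterable=None, option=A):
        self.option = option
        if iterable:
            for x in iterable:
                self.append(x)

    def append(self, obj):
        if isinstance(obj, dict):
            obj = self.option(obj)
        super(MyList, self).append(obj)

    def extend(self, iterable):
        for item in iterable:
            self.append(item)

    def __reduce_ex__(self, protocol):
        # Rebuild via MyList(items, option). No list-items iterator is
        # returned (index 3 is None), so the unpickler never calls extend()
        # on an instance whose attributes are not set yet.
        # Only 'option' is carried over here; any other instance attributes
        # would have to be passed along as well.
        return (type(self), (tuple(self), self.option), None, None, None)


my_list = MyList([{'a': 1}])
loaded = pickle.loads(pickle.dumps(my_list))

print(isinstance(loaded[0], A))  # True
print(loaded.option is A)        # True
```

Because the items travel as constructor arguments, each one passes through append() again on load and is re-wrapped by option, which is harmless for a dict subclass like A but is extra work on top of the tuple copy.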