I chose to use pickle (plus base64 and TCP sockets) to exchange data between my Python 3 code and legacy Python 2 code, but I am having trouble with datetime objects.
The PY3 object unpickles fine on PY2, but the reverse raises a TypeError when calling the datetime constructor, then a UnicodeEncodeError in the load_reduce function.
A short test program and the log, including dis output of both the PY2 and PY3 pickles, are available in this gist.
I am using pickle.dumps(reply, protocol=2) in PY2, then pickle._loads(pickled, fix_imports=True, encoding='latin1') in PY3 (I also tried encoding=None and encoding='utf-8', without success).
Decoding with the native cPickle loads fails too; I am only using the pure-Python _loads for debugging.
Is this a datetime bug? Maybe the datetime.__getstate__/__setstate__ implementations are not compatible between the two versions?
Any remarks on the code are welcome.
PY-3.4.0 pickle:
0: \x80 PROTO 2
2: c GLOBAL 'datetime datetime'
21: q BINPUT 0
23: c GLOBAL '_codecs encode'
39: q BINPUT 1
41: X BINUNICODE u'\x07\xde\x07\x11\x0f\x06\x11\x05\n\x90'
58: q BINPUT 2
60: X BINUNICODE u'latin1'
71: q BINPUT 3
73: \x86 TUPLE2
74: q BINPUT 4
76: R REDUCE
77: q BINPUT 5
79: \x85 TUPLE1
80: q BINPUT 6
82: R REDUCE
83: q BINPUT 7
85: . STOP
PY-2.7.6 pickle:
0: \\x80 PROTO 2
2: c GLOBAL 'datetime datetime'
21: q BINPUT 0
23: U SHORT_BINSTRING '\\x07\xc3\x9e\\x07\\x11\\x0f\\x06\\x11\\x05\\n\\x90'
35: q BINPUT 1
37: \\x85 TUPLE1
38: q BINPUT 2
40: R REDUCE
41: q BINPUT 3
43: ] EMPTY_LIST
44: q BINPUT 4
46: N NONE
47: \\x87 TUPLE3
48: q BINPUT 5
50: . STOP
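The difference between the two dumps is how the 10-byte datetime state is stored: Python 3 under protocol 2 wraps it in a call to _codecs.encode on a unicode string, while Python 2 writes it as a plain SHORT_BINSTRING. A minimal sketch to reproduce the Python 3 side on a current interpreter, assuming the same timestamp as in the dumps (exact offsets and memo opcodes may differ between Python versions):

```python
import pickle
import pickletools
from datetime import datetime

# Same timestamp as in the dis output above.
dt = datetime(2014, 7, 17, 15, 6, 17, 330384)
py3_pickle = pickle.dumps(dt, protocol=2)

# Collect (opcode name, argument) pairs instead of printing raw offsets.
ops = [(op.name, arg) for op, arg, _pos in pickletools.genops(py3_pickle)]
for name, arg in ops:
    print(name, repr(arg))
```

The listing should show two GLOBAL opcodes (the datetime class and _codecs.encode) and the 'latin1' BINUNICODE argument, which is why Python 2 can rebuild the byte string without any special handling.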
PY-3.4.0 pickle.load_reduce (with my debugging prints added):
    def load_reduce(self):
        stack = self.stack
        args = stack.pop()
        func = stack[-1]
        try:
            value = func(*args)
        except:
            print(sys.exc_info())
            print(func, args)
            raise
        stack[-1] = value
    dispatch[REDUCE[0]] = load_reduce
PY-3.4.0 datetime pickle support:
    # Pickle support.
    def _getstate(self):
        yhi, ylo = divmod(self._year, 256)
        us2, us3 = divmod(self._microsecond, 256)
        us1, us2 = divmod(us2, 256)
        basestate = bytes([yhi, ylo, self._month, self._day,
                           self._hour, self._minute, self._second,
                           us1, us2, us3])
        if self._tzinfo is None:
            return (basestate,)
        else:
            return (basestate, self._tzinfo)

    def __setstate(self, string, tzinfo):
        (yhi, ylo, self._month, self._day, self._hour,
         self._minute, self._second, us1, us2, us3) = string
        self._year = yhi * 256 + ylo
        self._microsecond = (((us1 << 8) | us2) << 8) | us3
        if tzinfo is None or isinstance(tzinfo, _tzinfo_class):
            self._tzinfo = tzinfo
        else:
            raise TypeError("bad tzinfo state arg %r" % tzinfo)

    def __reduce__(self):
        return (self.__class__, self._getstate())
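To make the 10-byte state layout concrete, here is a small sketch that mirrors the packing above outside of the datetime class. pack_state and unpack_state are hypothetical helper names, not part of the datetime module, and tzinfo is ignored; the sample bytes are the ones shown in the dis output.

```python
from datetime import datetime


def pack_state(dt):
    """Pack a naive datetime into the 10-byte layout used by _getstate."""
    yhi, ylo = divmod(dt.year, 256)
    us2, us3 = divmod(dt.microsecond, 256)
    us1, us2 = divmod(us2, 256)
    return bytes([yhi, ylo, dt.month, dt.day,
                  dt.hour, dt.minute, dt.second,
                  us1, us2, us3])


def unpack_state(state):
    """Rebuild a naive datetime from the 10-byte state."""
    yhi, ylo, month, day, hour, minute, second, us1, us2, us3 = state
    year = yhi * 256 + ylo
    microsecond = (((us1 << 8) | us2) << 8) | us3
    return datetime(year, month, day, hour, minute, second, microsecond)


# State bytes taken from the pickle dumps in the question.
state = b"\x07\xde\x07\x11\x0f\x06\x11\x05\n\x90"
decoded = unpack_state(state)
print(decoded)
```

Unpacking only works when the state arrives as bytes (iterating yields integers). If the loader hands over a str instead, the values are one-character strings and the arithmetic fails, which matches the symptom described in the question.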
The workaround is to pass encoding='bytes' when loading in PY3, like this:
    pickled_bytes = bytes(pickled_str, encoding='latin1')  # only needed if your input is a str (not my case)
    data = pickle.loads(pickled_bytes, encoding='bytes')
(Thanks to Tim Peters for the suggestion)
The issue asking why this is required is still open at http://bugs.python.org/issue22005.
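For reference, a self-contained sketch of the workaround. The pickle bytes are rebuilt by hand from the PY-2.7.6 dis output above and reduced to the datetime alone (the surrounding tuple and memo opcodes are dropped), so this approximates the real reply and is not the exact payload.

```python
import pickle
from datetime import datetime

# The 10-byte datetime state from the dumps in the question.
state = bytes([0x07, 0xDE, 0x07, 0x11, 0x0F, 0x06, 0x11, 0x05, 0x0A, 0x90])

# Hand-built protocol 2 pickle, shaped like what Python 2 emits for a datetime.
py2_pickle = b"".join([
    b"\x80\x02",                    # PROTO 2
    b"cdatetime\ndatetime\n",       # GLOBAL 'datetime datetime'
    b"U", bytes([len(state)]), state,  # SHORT_BINSTRING with the raw state
    b"\x85",                        # TUPLE1
    b"R",                           # REDUCE: datetime(state)
    b".",                           # STOP
])

# encoding='bytes' keeps Python 2 str values as bytes instead of decoding them.
result = pickle.loads(py2_pickle, encoding="bytes")
print(result)
```

One side effect to keep in mind: with encoding='bytes', every Python 2 str in the payload comes back as bytes, including dict keys, so other fields may need an explicit decode afterwards.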