I have data from Python 2.7 that I pickled like this:
#!/usr/bin/env python2
# coding=utf-8
import datetime
import pickle

data = {1: datetime.date(2014, 3, 18),
        'string-key': u'ünicode-string'}
with open('file.pickle', 'wb') as f:
    pickle.dump(data, f)
The only way I found to load this in Python 3.4 is:
data = pickle.load(open('file.pickle', "rb"), encoding='bytes')
Now my unicode strings are fine, but the dict keys are bytes.
print(repr(data))
gives:
{1: datetime.date(2014, 3, 18), b'string-key': 'ünicode-string'}
Does anybody have an idea how to avoid either rewriting my code to use data[b'string-key'] or converting all existing files?
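The bytes keys follow from how Python 2 stored the two kinds of string: the plain 'string-key' was a byte string, while u'ünicode-string' was text, and encoding='bytes' leaves the former undecoded. Here is a minimal sketch of that behaviour. The PY2_PICKLE literal is a hand-written protocol 0 stream that only approximates what Python 2.7 would have produced; it is an assumption for illustration, not the contents of the original file.

```python
import pickle

# Hand-written protocol 0 pickle approximating Python 2's output for
# {'string-key': u'ünicode-string'} (illustrative assumption).
# S'...' is the opcode for a Python 2 str (bytes); V... is a unicode object.
PY2_PICKLE = b"(dp0\nS'string-key'\np1\nV\xfcnicode-string\np2\ns."

# encoding='bytes' keeps Python 2 str objects as bytes and decodes only
# the objects that were unicode in Python 2.
loaded = pickle.loads(PY2_PICKLE, encoding='bytes')
print(loaded)
```

The key comes back as bytes because it was never text in Python 2, whereas the value was a unicode object and so loads as a Python 3 str.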
Since Python 3.0, the language's str type contains Unicode characters, meaning any string created using "unicode rocks!", 'unicode rocks!', or the triple-quoted string syntax is stored as Unicode.
In Python 2, the str type was used for two different kinds of values – text and bytes, whereas in Python 3, these are separate and incompatible types. Text contains human-readable messages, represented as a sequence of Unicode codepoints. Usually, it does not contain unprintable control characters such as \0 .
In Python 2, the unicode object lets you work with characters. It has all the same methods as the str object. “Encoding” is converting from a unicode object to bytes; “decoding” is converting from bytes to a unicode object.
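As a small illustration of encoding and decoding, here is a sketch using Python 3 names, where the text type is str rather than unicode. The variable names are only for illustration.

```python
text = 'ünicode-string'          # str: a sequence of Unicode code points
raw = text.encode('utf-8')       # encoding: str -> bytes
restored = raw.decode('utf-8')   # decoding: bytes -> str

# The UTF-8 form is one byte longer because 'ü' takes two bytes.
print(len(text), len(raw))
print(restored == text)
```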
This is not a real answer but only a workaround. This converts pickled data to version 3 in Python 3.4 (doesn't work in 3.3):
#!/usr/bin/env python3
import glob
import pickle


def bytes_to_unicode(ob):
    """Recursively decode bytes found in lists, tuples and dicts as UTF-8."""
    t = type(ob)
    if t in (list, tuple):
        # Decode bytes items first, then recurse into nested containers.
        l = [str(i, 'utf-8') if type(i) is bytes else i for i in ob]
        l = [bytes_to_unicode(i) if type(i) in (list, tuple, dict) else i for i in l]
        ro = tuple(l) if t is tuple else l
    elif t is dict:
        # Replace bytes keys with their decoded str equivalents (in place).
        byte_keys = [i for i in ob if type(i) is bytes]
        for bk in byte_keys:
            v = ob[bk]
            del ob[bk]
            ob[str(bk, 'utf-8')] = v
        # Decode bytes values and recurse into nested containers.
        for k in ob:
            if type(ob[k]) is bytes:
                ob[k] = str(ob[k], 'utf-8')
            elif type(ob[k]) in (list, tuple, dict):
                ob[k] = bytes_to_unicode(ob[k])
        ro = ob
    else:
        ro = ob
        print("unprocessed object: {0} {1}".format(t, ob))
    return ro


for fn in glob.glob('*.pickle'):
    with open(fn, "rb") as f:
        data = pickle.load(f, encoding='bytes')
    ndata = bytes_to_unicode(data)
    # Write the converted data next to the original, e.g. file.pickle3.
    with open(fn + '3', "wb") as f:
        pickle.dump(ndata, f)
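If mutating the loaded dict in place is a concern, the same idea can be sketched as a function that builds new containers instead. The name decode_bytes and its encoding parameter are hypothetical, and it assumes every bytes object in the data really is UTF-8 text; genuinely binary payloads would be mangled or raise UnicodeDecodeError.

```python
import datetime


def decode_bytes(obj, encoding='utf-8'):
    """Return a copy of obj with every bytes object decoded to str."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        # Dict subclasses come back as plain dicts in this sketch.
        return {decode_bytes(k, encoding): decode_bytes(v, encoding)
                for k, v in obj.items()}
    if type(obj) in (list, tuple, set):
        # Rebuild the same container type from the converted items.
        return type(obj)(decode_bytes(i, encoding) for i in obj)
    # Anything else (ints, dates, str, ...) is returned unchanged.
    return obj


sample = {1: datetime.date(2014, 3, 18),
          b'string-key': 'ünicode-string',
          b'nested': [b'a', (b'b', 2)]}
converted = decode_bytes(sample)
print(converted)
```

Unlike the in-place version, this leaves the loaded object untouched, at the cost of copying every container.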
The Python docs say:
The pickle serialization format is guaranteed to be backwards compatible across Python releases.
I didn't find a way to pickle.load Python 2.7 pickled data in Python 3.3, not even data that contained only ints and dates.
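For comparison, the current pickle documentation notes that encoding='latin1' is required for unpickling datetime, date and time instances pickled by Python 2. A sketch of that approach is below. As far as I know it relies on behaviour of newer Python 3 releases and may not work on the 3.3 and 3.4 interpreters discussed here. The PY2_PICKLE literal is again a hand-written protocol 0 stream approximating what Python 2.7 would write for the question's dict; it is an assumption, not the original file.

```python
import datetime
import pickle

# Hand-written protocol 0 stream approximating Python 2's pickle of
# {1: datetime.date(2014, 3, 18), 'string-key': u'ünicode-string'}
# (illustrative assumption). The date state is the four bytes 07 de 03 12.
PY2_PICKLE = (b"(dp0\nI1\ncdatetime\ndate\np1\n"
              b"(S'\\x07\\xde\\x03\\x12'\np2\ntp3\nRp4\ns"
              b"S'string-key'\np5\nV\xfcnicode-string\np6\ns.")

# latin1 maps every byte to a code point, so Python 2 str keys become
# Python 3 str and the date's binary state survives the decoding step.
data = pickle.loads(PY2_PICKLE, encoding='latin1')
print(data)
```

With this encoding the key is available as data['string-key'], so neither the calling code nor the existing files would need to change, provided the interpreter supports it.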