I have a dictionary:
mydict={'öö':1,'ää':2}
I have written it to a pickle file:
a=codecs.open(r'mydict.pkl', 'wb', 'utf-8')
pickle.dump(mydict, a)
If I try to load it:
m=codecs.open(r'mydict.pkl', 'rb', 'utf-8')
mydict = pickle.load(m)
I get an error:
KeyError: u"S'\\xe4\\xe4'\np1\nI2\nsS'\\xf6\\xf6'\np2\nI1\ns."
Any ideas how to solve this? Help is greatly appreciated.
pickle is a binary format, so applying codec translations before writing will break it. Try writing to a plain file and loading it back:
>>> mydict={'öö':1,'ää':2}
>>> mydict
{'\xc3\xb6\xc3\xb6': 1, '\xc3\xa4\xc3\xa4': 2}
>>> pickle.dump(mydict, open('/tmp/test.pkl', 'wb'))
>>> pickle.load(open('/tmp/test.pkl', 'rb'))
{'\xc3\xb6\xc3\xb6': 1, '\xc3\xa4\xc3\xa4': 2}
But most probably you want to use Unicode in the first place:
>>> mydict={u'öö':1,u'ää':2}
I believe the problem is the use of codecs.open. Pickles are binary, not text, and codecs is for transparent conversion from some text encoding to unicode. You should just use open instead.
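A minimal sketch of the plain open approach, using unicode keys as suggested above. The helper names save_pickle and load_pickle and the temporary file location are made up for illustration; they are not part of the original question.

```python
import os
import pickle
import tempfile


def save_pickle(obj, path):
    # Binary mode: the pickle bytes are written untouched, no text codec involved.
    with open(path, 'wb') as f:
        pickle.dump(obj, f)


def load_pickle(path):
    # Read back in binary mode for the same reason.
    with open(path, 'rb') as f:
        return pickle.load(f)


# Unicode keys, written with escapes so the source file encoding does not matter.
mydict = {u'\xf6\xf6': 1, u'\xe4\xe4': 2}

# Hypothetical location chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), 'mydict.pkl')
save_pickle(mydict, path)
restored = load_pickle(path)
print(restored == mydict)
```

The with blocks also close the file handles, which the one-line open(...) calls in the interactive session leave to the interpreter.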
Old issue, but I had the same problem and did not think extra disk I/O was a good solution. I suggest base64-encoding the pickled data and decoding it again before loading.
import base64
import pickle
# Pickle to a byte string, then base64-encode it so the result is plain ASCII.
serialized_str = base64.b64encode(pickle.dumps(mydict))
my_obj_back = pickle.loads(base64.b64decode(serialized_str))
cPickle can be used the same way for faster results when processing batches.