Does Python Pickle have an illegal character/sequence I can use as a separator?

Tags: python, pickle

I want to make (and decode) a single string composed of several python pickles.

Is there a character or sequence that is safe to use as a separator in this string?

I should be able to make the string like so:

s = pickle.dumps(o1) + PICKLE_SEPARATOR + pickle.dumps(o2) + PICKLE_SEPARATOR + pickle.dumps(o3) ...

I should be able to take this string and reconstruct the objects like so:

[pickle.loads(s) for s in input.split(PICKLE_SEPARATOR)]

What should PICKLE_SEPARATOR be?


For the curious, I want to send pickled objects to redis using APPEND. (though perhaps I'll just use RPUSH)

Lifto asked Oct 19 '10 22:10


2 Answers

It's fine to just concatenate the pickles together. Every pickle ends with a STOP opcode, so Python knows where each one ends and stops reading there:

>>> import cStringIO as stringio
>>> import cPickle as pickle
>>> o1 = {}
>>> o2 = []
>>> o3 = ()
>>> p = pickle.dumps(o1)+pickle.dumps(o2)+pickle.dumps(o3)
>>> s = stringio.StringIO(p)
>>> pickle.load(s)
{}
>>> pickle.load(s)
[]
>>> pickle.load(s)
()
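
The session above uses the Python 2 modules cStringIO and cPickle. A rough Python 3 equivalent, as a minimal sketch, swaps in io.BytesIO and the standard pickle module and keeps loading until the stream is exhausted. The helper name load_all and the sample objects are only illustrative, not part of the original answer:

```python
import io
import pickle


def load_all(data):
    """Yield each object from a bytes string of back-to-back pickles."""
    stream = io.BytesIO(data)
    while True:
        try:
            # pickle.load reads exactly one pickle and leaves the stream
            # positioned at the start of the next one.
            yield pickle.load(stream)
        except EOFError:
            # Raised when no bytes are left to read.
            return


objects = [{"a": 1}, [1, 2, 3], ("x", "y")]
blob = b"".join(pickle.dumps(o) for o in objects)  # no separator needed
restored = list(load_all(blob))
```

Because no separator is involved, this also fits the APPEND use case from the question: each new pickle can simply be appended to the existing value.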
John La Rooy answered Oct 19 '22 21:10


EDIT: First consider gnibbler's answer, which is obviously much simpler. The only reason to prefer the one below is if you want to be able to split a sequence of pickles without parsing them.

A fairly safe bet is to use a brand new UUID that you never reuse anywhere else. Evaluate uuid.uuid4().bytes once and store the result in your code as the separator. E.g.:

>>> import uuid
>>> uuid.uuid4().bytes
'\xae\x9fW\xff\x19cG\x0c\xb1\xe1\x1aV%P\xb7\xa8'

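Then copy-paste the resulting string literal into your code as the separator (or even just use the one above, if you want). Pickle does not reserve any byte sequence as illegal, so this is a probabilistic guarantee rather than a strict one, but a random 16-byte sequence is overwhelmingly unlikely to occur by chance in anything you ever want to store, as long as you never pickle the separator itself.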
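
As a sketch of how such a separator might be used in Python 3, the example below reuses the 16 bytes printed above as a bytes literal. The helper names join_pickles and split_pickles are hypothetical, and empty chunks are skipped on the assumption that a leading or trailing separator may be present (for example after repeated appends):

```python
import pickle

# Assumption: the 16 bytes from the uuid4() call shown above, written as a
# Python 3 bytes literal. Any other fixed, never-reused UUID would do.
PICKLE_SEPARATOR = b"\xae\x9fW\xff\x19cG\x0c\xb1\xe1\x1aV%P\xb7\xa8"


def join_pickles(objs):
    """Pickle each object and join the results with the separator."""
    return PICKLE_SEPARATOR.join(pickle.dumps(o) for o in objs)


def split_pickles(blob):
    """Split on the separator and unpickle every non-empty chunk."""
    return [pickle.loads(chunk)
            for chunk in blob.split(PICKLE_SEPARATOR)
            if chunk]


packed = join_pickles([{"a": 1}, [1, 2, 3], ("x", "y")])
unpacked = split_pickles(packed)
```

Splitting this way lets you count or slice the individual pickles with plain bytes operations, without unpickling anything until you need to.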

Marcelo Cantos answered Oct 19 '22 20:10