Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can Pickle handle multiple object references

Tags:

python

pickle

If I have objects a and b and both reference object obj, what happens when I Pickle and then restore the objects? Will the pickled data 'know' that a and b both referenced the same object and restore everything accordingly, or will the two get two different — and initially equal — objects?

like image 664
Paul Manta Avatar asked Sep 15 '11 16:09

Paul Manta


People also ask

Is pickling used for object serialization?

Pickle in Python is primarily used in serializing and deserializing a Python object structure. In other words, it's the process of converting a Python object into a byte stream to store it in a file/database, maintain program state across sessions, or transport data over the network.

Can all Python objects be pickled?

Python pickle module is used for serializing and de-serializing a Python object structure. Any object in Python can be pickled so that it can be saved on disk.

Is pickling same as serialization?

Pickling (and unpickling) is alternatively known as “serialization”, “marshalling,” 1 or “flattening”; however, to avoid confusion, the terms used here are “pickling” and “unpickling”. The pickle module is not secure. Only unpickle data you trust.

What are the main reasons for using JSON over pickle for serializing objects and data?

JSON is a lightweight format and is much faster than Pickling. There is always a security risk with Pickle. Unpickling data from unknown sources should be avoided as it may contain malicious or erroneous data. There are no loopholes in security using JSON, and it is free from security threats.


2 Answers

Yes, shared objects will only get serialized once (the pickle protocol can even handle circular references).

From the documentation:

The pickle module keeps track of the objects it has already serialized, so that later references to the same object won’t be serialized again. marshal doesn’t do this.

This has implications both for recursive objects and object sharing. Recursive objects are objects that contain references to themselves. These are not handled by marshal, and in fact, attempting to marshal recursive objects will crash your Python interpreter. Object sharing happens when there are multiple references to the same object in different places in the object hierarchy being serialized. pickle stores such objects only once, and ensures that all other references point to the master copy. Shared objects remain shared, which can be very important for mutable objects.

like image 70
NPE Avatar answered Oct 30 '22 01:10

NPE


As @aix points out, pickle understands multiple references to the same object, but only within a single pickling. That is, pickle always pickles a single object. If that object has references within it, those references will be properly shared in the unpickled object.

But if you call pickle twice, to pickle two objects, shared references between the objects will not be preserved properly. The object will now exist twice.

like image 38
Ned Batchelder Avatar answered Oct 30 '22 03:10

Ned Batchelder