
What is the difference between pickle and shelve?

I am learning about object serialization for the first time. I tried reading and googling for the differences between the pickle and shelve modules, but I am not sure I understand them. When should I use which one? Pickle can turn every Python object into a stream of bytes, which can be persisted to a file. Then why do we need the shelve module? Isn't pickle faster?

asked Nov 05 '10 03:11 by zubinmehta


People also ask

Why use shelve?

The shelve module can be used as a simple persistent storage option for Python objects when a relational database is overkill. The shelf is accessed by keys, just as with a dictionary. The values are pickled and written to a database created and managed by anydbm (renamed dbm in Python 3).

Why use shelve in Python?

The shelve module in Python's standard library is a simple yet effective tool for persistent data storage when a relational database is not required. The shelf object defined in this module is a dictionary-like object that is persistently stored in a disk file.

What is shelf in Python?

A “shelf” is a persistent, dictionary-like object. The difference with “dbm” databases is that the values (not the keys!) in a shelf can be essentially arbitrary Python objects — anything that the pickle module can handle.


2 Answers

pickle is for serializing some object (or objects) as a single bytestream in a file.

shelve builds on top of pickle and implements a serialization dictionary where objects are pickled, but associated with a key (some string), so you can load your shelved data file and access your pickled objects via keys. This could be more convenient were you to be serializing many objects.

Here is an example of each. (Both should work in recent versions of Python 2.7 and Python 3.x.)

pickle Example

    import pickle

    integers = [1, 2, 3, 4, 5]

    with open('pickle-example.p', 'wb') as pfile:
        pickle.dump(integers, pfile)

This will dump the integers list to a binary file called pickle-example.p.

Now try reading the pickled file back.

    import pickle

    with open('pickle-example.p', 'rb') as pfile:
        integers = pickle.load(pfile)
        print(integers)

The above should output [1, 2, 3, 4, 5].
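To make the "many objects" point concrete, here is a minimal sketch (Python 3; the file name and example data are made up for illustration) of storing several objects with pickle alone. Each `pickle.dump` call writes one more pickled object to the file, and they have to be read back with `pickle.load` in the same order, since nothing in the file names them.

```python
import os
import pickle
import tempfile

# Hypothetical example data; any picklable objects would do.
integers = [1, 2, 3, 4, 5]
settings = {'verbose': True, 'retries': 3}

# Use a temporary directory so the sketch does not clutter the working dir.
path = os.path.join(tempfile.mkdtemp(), 'many-objects.p')

# Each dump() call writes one more pickled object to the same file.
with open(path, 'wb') as pfile:
    pickle.dump(integers, pfile)
    pickle.dump(settings, pfile)

# load() returns the objects in the order they were written;
# there are no keys, so the reader has to know that order.
with open(path, 'rb') as pfile:
    first = pickle.load(pfile)
    second = pickle.load(pfile)

print(first)
print(second)
```

This works, but the bookkeeping (which object comes first, how many there are) is left to you, which is the gap shelve fills.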

shelve Example

    import shelve

    integers = [1, 2, 3, 4, 5]

    # If you're using Python 2.7, import contextlib and use
    # the line:
    # with contextlib.closing(shelve.open('shelf-example', 'c')) as shelf:
    with shelve.open('shelf-example', 'c') as shelf:
        shelf['ints'] = integers

Notice how you add objects to the shelf via dictionary-like access.

Read the object back in with code like the following:

    import shelve

    # If you're using Python 2.7, import contextlib and use
    # the line:
    # with contextlib.closing(shelve.open('shelf-example', 'r')) as shelf:
    with shelve.open('shelf-example', 'r') as shelf:
        for key in shelf.keys():
            print(repr(key), repr(shelf[key]))

In Python 3, the output will be 'ints' [1, 2, 3, 4, 5].
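As a rough mental model of "shelve builds on top of pickle", the sketch below does by hand approximately what a shelf does for you: it pickles each value and stores the resulting bytes under a string key in a dbm database. This is an illustration under assumptions (Python 3, the `dbm.dumb` backend, a made-up file name), not shelve's actual implementation.

```python
import dbm.dumb
import os
import pickle
import tempfile

# Hypothetical database location inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'by-hand-shelf')

# Write: pickle each value ourselves and store the bytes under a key.
db = dbm.dumb.open(path, 'c')
try:
    db['ints'] = pickle.dumps([1, 2, 3, 4, 5])
    db['name'] = pickle.dumps('example')
finally:
    db.close()

# Read: fetch only the entry we want and unpickle it.
db = dbm.dumb.open(path, 'r')
try:
    ints = pickle.loads(db['ints'])
finally:
    db.close()

print(ints)
```

With shelve, the explicit `pickle.dumps` and `pickle.loads` calls disappear and you simply write `shelf['ints'] = integers`.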

answered Oct 13 '22 20:10 by 逆さま


According to the pickle documentation:

Serialization is a more primitive notion than persistence; although pickle reads and writes file objects, it does not handle the issue of naming persistent objects, nor the (even more complicated) issue of concurrent access to persistent objects. The pickle module can transform a complex object into a byte stream and it can transform the byte stream into an object with the same internal structure. Perhaps the most obvious thing to do with these byte streams is to write them onto a file, but it is also conceivable to send them across a network or store them in a database. The shelve module provides a simple interface to pickle and unpickle objects on DBM-style database files.
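The byte-stream point in that passage can be seen directly with `pickle.dumps` and `pickle.loads`, which work on in-memory bytes rather than files. A small sketch (Python 3; the data is invented for illustration) checking that the restored object keeps the same internal structure, including a shared reference:

```python
import pickle

shared = [1, 2, 3]
original = {'a': shared, 'b': shared}  # both keys point at the same list

# dumps() returns bytes that could go to a file, a socket or a database.
payload = pickle.dumps(original)
restored = pickle.loads(payload)

print(type(payload).__name__)          # the payload is a bytes object
print(restored == original)            # same contents
print(restored['a'] is restored['b'])  # shared reference is preserved
```

Nothing here names or locates the object once it is serialized; that bookkeeping is what shelve adds on top.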

answered Oct 13 '22 20:10 by as - if