
How to pickle a Python dictionary into MySQL?

I looked through several SO questions on how to pickle a Python object and store it in a database. The information I collected is:

  • import pickle or import cPickle. Import the latter if performance is an issue (cPickle exists only on Python 2; on Python 3, pickle already uses the C implementation when it is available).
  • Assume d is a Python dictionary (or any other Python object): pickled = pickle.dumps(d).
  • Store pickled in a MySQL BLOB column, using whatever module you like to communicate with the database.
  • Get it out again and use pickle.loads(pickled) to restore the Python dictionary.

I just want to make sure I understood this right. Did I miss something critical? Are there side effects? Is it really that easy?

Background info: The only thing I want to do is store Google geocoder responses, which are nested Python dictionaries in my case. I am only using a small part of the response object, and I don't know whether I will ever need more of it later on. That's why I thought of storing the whole response, to save me from repeating some million queries.

Aufwind asked Aug 19 '11 05:08


2 Answers

It's really that easy... so long as you don't need your DB to know anything about the dictionary. If you need any sort of structured data access to the contents of the dictionary, then you're going to have to get more involved.
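The steps from the question can be sketched roughly as follows. This is only an illustration under stated assumptions: sqlite3 stands in for MySQL so the example runs without a server, and the table name geocode_cache, its columns, and the sample response are made up for the example. With a MySQL driver the flow should be the same, with a BLOB column and that driver's placeholder style (commonly %s instead of ?).

```python
import pickle
import sqlite3

# Hypothetical nested response, standing in for a geocoder result.
response = {
    "status": "OK",
    "results": [{"geometry": {"location": {"lat": 52.52, "lng": 13.405}}}],
}

# Assumption: sqlite3 replaces MySQL here so the sketch needs no server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE geocode_cache (address TEXT PRIMARY KEY, response BLOB)"
)

# Serialize the dictionary to bytes and store them via a parameterized query.
pickled = pickle.dumps(response, protocol=pickle.HIGHEST_PROTOCOL)
conn.execute(
    "INSERT INTO geocode_cache (address, response) VALUES (?, ?)",
    ("Berlin", pickled),
)

# Read the bytes back and rebuild the dictionary.
row = conn.execute(
    "SELECT response FROM geocode_cache WHERE address = ?", ("Berlin",)
).fetchone()
restored = pickle.loads(row[0])
print(restored == response)
```

Passing the pickled bytes as a query parameter, rather than formatting them into the SQL string, avoids quoting problems with binary data. Since pickle.loads can execute arbitrary code, it should only be used on blobs your own code wrote.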

Another gotcha might be what you intend to put in the dict. Python's pickle serialization is quite intelligent and can handle most cases without any need for adding custom support. However, when it doesn't work, it can be very difficult to understand what's gone wrong. So if you can, restrict the contents of the dict to Python's built-in types. If you start adding instances of custom classes, keep them to simple custom classes that don't do any funny stuff with attribute storage or access. And beware of adding instances of classes or types from add-ons. In general, if you start running into hard-to-understand problems with the pickling or unpickling, look at the non-built-in types in the dict.
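One way to narrow down such a problem is to try pickling each value on its own and see which keys fail. The helper below is a hypothetical sketch, not part of the pickle module; it only checks top-level values, and the sample record is made up.

```python
import pickle


def unpicklable_keys(d):
    """Return the top-level keys of d whose values fail to pickle."""
    bad = []
    for key, value in d.items():
        try:
            pickle.dumps(value)
        except Exception:  # pickle raises several different exception types
            bad.append(key)
    return bad


record = {
    "name": "Berlin",
    "coords": (52.52, 13.405),
    # A lambda has no importable name, so pickle cannot store a reference to it.
    "callback": lambda x: x,
}
print(unpicklable_keys(record))
```

For deeply nested data the same check could be applied recursively to the value that failed.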

Ross Patterson answered Sep 20 '22 03:09



If speed is really important: I just ran a test comparing loading a large Python dictionary (35 MB) from a pickle against SELECTing from a MySQL table with all keys and values stored in rows:

Pickle Method:

import time, pickle
t1 = time.perf_counter()  # time.clock() was removed in Python 3.8
with open('story_data.pickle', 'rb') as f:  # file is closed automatically
    s = pickle.load(f)
print(time.perf_counter() - t1)

MySQL Method:

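import time
import database as db  # the answerer's own helper module, not a published library
t1 = time.perf_counter()  # time.clock() was removed in Python 3.8
data, msg = db.mysql("""SELECT id, story FROM story_data;""")
# Build {id: list of comma-separated fields} from the returned rows
data_dict = dict((int(x), y.split(',')) for x, y in data)
print(time.perf_counter() - t1)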

Output (seconds):

pickle method: 32.0785171704
mysql method: 3.25916336479

If a ten-fold speed improvement is enough, the structure of the database probably doesn't matter. Note that I am splitting all the comma-separated data into lists as the values for 36,000 keys, and it still takes only 3 seconds. So I've switched away from using pickles for large data sets: the rest of the 400-line program I was using took about 3 seconds, while the pickle loading took 32 seconds.

Also note:

cPickle (Python 2) works just like pickle and is over 50% faster.

Don't try to pickle a class full of dictionaries and save it in MySQL: it doesn't reconstitute itself correctly, or at least it didn't for me.
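For the cPickle note, a common pattern is to fall back to the plain pickle module when cPickle is missing, as it is on Python 3. The sketch below also times a load on synthetic data, so the measurement can be repeated on your own machine. The data shape, its size, and the choice of pickle.HIGHEST_PROTOCOL are assumptions for illustration and are not taken from the test above.

```python
import time

try:
    import cPickle as pickle  # Python 2: C implementation
except ImportError:
    import pickle  # Python 3: uses the C accelerator when it is available

# Synthetic data (assumption): string keys mapped to short lists.
data = {str(i): ["a", "b", "c"] for i in range(10000)}

# Serialize once with the newest binary protocol the interpreter supports.
blob = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)

# Time only the load step, which is what the comparison above measured.
start = time.time()
restored = pickle.loads(blob)
elapsed = time.time() - start

print("loaded %d keys in %.4f s" % (len(restored), elapsed))
```

Timings will vary with the data, the protocol, and the Python version, so treat any single number as a rough indication only.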

Marc Maxmeister answered Sep 21 '22 03:09
