
How to cPickle dump and load separate dictionaries to the same file?

I have a process which runs and creates three dictionaries: 2 rather small, and 1 large.

I know I can store one dictionary like:

import cPickle as pickle
with open(filename, 'wb') as fp:
  pickle.dump(self.fitResults, fp)

What I'd like to do is store all 3 dictionaries in the same file, with the ability to load in the three dictionaries separately at another time. Something like

with open(filename, 'rb') as fp:
  dict1, dict2, dict3 = pickle.load(fp)

Or even better just load the first two dictionaries, and make it optional whether to load the third (large) one. Is this possible or should I go about this in a completely different way?

JBWhitmore asked Jul 25 '12 01:07


3 Answers

Sure, you just dump each one separately and then load them separately:

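import cPickle as pickle

# Each dump() call appends one self-contained pickle to the file.
with open(filename, 'wb') as fp:
    pickle.dump(dict1, fp)
    pickle.dump(dict2, fp)
    pickle.dump(dict3, fp)

# Each load() call reads back the next pickle, in the order it was dumped.
with open(filename, 'rb') as fp:
    d1 = pickle.load(fp)
    d2 = pickle.load(fp)
    d3 = pickle.load(fp)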

Make sure to dump the big one last so you can load the small ones without loading the big one first. I imagine you could even get clever and store the file positions where each dump starts in a header of sorts, then seek to the right position before loading (but that's starting to get a little more complicated).
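
One way the file-position idea could look, as a rough sketch rather than a tested recipe. It is written for Python 3, where the standard pickle module takes the place of cPickle. The header layout (a 32-bit count followed by one 64-bit offset per object) and the names dump_indexed and load_indexed are assumptions made up for this illustration:

```python
import pickle
import struct

# Hypothetical header layout: an unsigned 32-bit object count, then one
# unsigned 64-bit file offset per object, all little-endian.


def dump_indexed(path, objects):
    """Write a header of file offsets, then each object as its own pickle."""
    count = len(objects)
    header_size = struct.calcsize("<I") + count * struct.calcsize("<Q")
    offsets = []
    with open(path, "wb") as fp:
        fp.write(b"\0" * header_size)  # reserve room for the header
        for obj in objects:
            offsets.append(fp.tell())  # remember where this pickle starts
            pickle.dump(obj, fp, protocol=pickle.HIGHEST_PROTOCOL)
        fp.seek(0)  # go back and fill in the real header
        fp.write(struct.pack("<I", count))
        fp.write(struct.pack("<%dQ" % count, *offsets))


def load_indexed(path, index):
    """Load only the object at position `index`, skipping all the others."""
    with open(path, "rb") as fp:
        (count,) = struct.unpack("<I", fp.read(4))
        offsets = struct.unpack("<%dQ" % count, fp.read(8 * count))
        fp.seek(offsets[index])
        return pickle.load(fp)
```

With this layout, load_indexed(filename, 2) would read the third dictionary without unpickling the first two, so the dump order no longer matters.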

mgilson answered Oct 17 '22 03:10


I recommend the oft-forgotten shelve module, which effectively provides you with a persistent dictionary backed by a Berkeley DB file or dbm file (as selected by anydbm). The db should provide performance improvements (for your big dictionary).

Example usage:

>>> import shelve
>>> shelf = shelve.open('my_shelf')
>>> shelf
{}

>>> # add your dictionaries (or any pickleable objects)
>>> shelf['dict1'] = dict(a=10, b=20, c=30, l=[10, 20, 30])
>>> shelf['dict2'] = dict(a=100, b=200, c=300, l=[100, 200, 300])
>>> shelf['dict3'] = dict(a=1000, b=2000, c=3000, l=[1000, 2000, 3000])

>>> shelf
{'dict1': {'a': 10, 'c': 30, 'b': 20, 'l': [10, 20, 30]}, 'dict3': {'a': 1000, 'c': 3000, 'b': 2000, 'l': [1000, 2000, 3000]}, 'dict2': {'a': 100, 'c': 300, 'b': 200, 'l': [100, 200, 300]}}
>>> shelf.close()

>>> # then, later
>>> shelf = shelve.open('my_shelf')
>>> shelf
{'dict1': {'a': 10, 'c': 30, 'b': 20, 'l': [10, 20, 30]}, 'dict3': {'a': 1000, 'c': 3000, 'b': 2000, 'l': [1000, 2000, 3000]}, 'dict2': {'a': 100, 'c': 300, 'b': 200, 'l': [100, 200, 300]}}
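
Since a shelf pickles each value under its own key, reading one key should not require unpickling the others. A minimal sketch of loading only the small dictionaries, assuming Python 3.4 or later (where a shelf can be used in a with statement; on Python 2 you would call close() yourself). The helper names save_dicts and load_dicts are invented for this example:

```python
import shelve


def save_dicts(path, **named_dicts):
    """Store each dictionary under its own key in a shelf."""
    with shelve.open(path) as shelf:
        for name, value in named_dicts.items():
            shelf[name] = value


def load_dicts(path, names):
    """Unpickle only the requested entries; the rest stay on disk."""
    with shelve.open(path, flag="r") as shelf:  # "r" opens read-only
        return {name: shelf[name] for name in names}
```

Calling load_dicts('my_shelf', ['dict1', 'dict2']) would then return the two small dictionaries and leave dict3 untouched.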

mhawke answered Oct 17 '22 04:10


As mentioned here, you can pickle several objects into the same file, and load them all (in the same order):

import cPickle

# open() in a with block closes the file even if a dump fails
with open(filename, 'wb') as f:
    for obj in [dict1, dict2, dict3]:
        cPickle.dump(obj, f, protocol=cPickle.HIGHEST_PROTOCOL)

Then:

loaded_objects = []
with open(filename, 'rb') as f:
    # each load() returns the next object, in the order it was dumped
    for i in range(3):
        loaded_objects.append(cPickle.load(f))

Save your dictionaries in a specific order so that, when loading, you have the option to read only the ones you need.

For example, if you store the dictionaries in the order smallDict1, smallDict2, largeDict1, you can load only the small ones by setting an appropriate range while loading (here, for i in range(2) ...).
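
If the number of stored objects is not known in advance, one option is to read lazily and stop whenever you have enough. A small sketch, assuming Python 3 (pickle instead of cPickle); iter_pickles and load_first are hypothetical helper names, not part of any library:

```python
import itertools
import pickle


def iter_pickles(path):
    """Yield objects one at a time, in dump order, until end of file."""
    with open(path, "rb") as fp:
        while True:
            try:
                yield pickle.load(fp)
            except EOFError:  # raised when no pickle is left to read
                return


def load_first(path, n):
    """Return the first n objects without unpickling the remaining ones."""
    gen = iter_pickles(path)
    try:
        return list(itertools.islice(gen, n))
    finally:
        gen.close()  # closes the file even though we stopped early
```

Here load_first(filename, 2) would return the two small dictionaries and never touch the large one, provided it was dumped last.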

Tejas Shah answered Oct 17 '22 04:10