I have a dict containing several pandas DataFrames (identified by keys). Any suggestion on how to effectively serialize it (and cleanly load it back)? Here is the structure (a pprint display output). Each dict['method_x_']['meas_x_'] is a pandas DataFrame. The goal is to save the DataFrames for later plotting with some specific plotting options.
{'method1':
    {'meas1':
          config1   config2
     0   0.193647  0.204673
     1   0.251833  0.284560
     2   0.227573  0.220327,
     'meas2':
          config1   config2
     0   0.172787  0.147287
     1   0.061560  0.094000
     2   0.045133  0.034760},
 'method2':
    {'meas1':
          config1   config2
     0   0.193647  0.204673
     1   0.251833  0.284560
     2   0.227573  0.220327,
     'meas2':
          config1   config2
     0   0.172787  0.147287
     1   0.061560  0.094000
     2   0.045133  0.034760}}
Use pickle.dump(s) and pickle.load(s). It actually works. Pandas DataFrames also have their own methods df.to_pickle("filename") and pd.read_pickle("filename") that you can use to serialize a single DataFrame...
In my particular use case, I tried a simple pickle.dump(all_df, open("all_df.p", "wb")).
While it loaded properly with all_df = pickle.load(open("all_df.p", "rb")),
after I restarted my Jupyter environment I would get an UnpicklingError: invalid load key, '\xef' (often a sign the file was corrupted or re-encoded along the way, since \xef is the first byte of a UTF-8 BOM).
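A minimal sketch of the pickle round trip, using a small hypothetical dict of DataFrames in place of the question's data; opening the files with `with` (rather than a bare `open(...)` inside `pickle.dump`) makes sure the handle is flushed and closed, which avoids one common source of truncated pickle files:

```python
import pickle
import pandas as pd

# Hypothetical nested dict of DataFrames, mirroring the question's structure
all_df = {
    'method1': {
        'meas1': pd.DataFrame({'config1': [0.19, 0.25], 'config2': [0.20, 0.28]}),
        'meas2': pd.DataFrame({'config1': [0.17, 0.06], 'config2': [0.15, 0.09]}),
    },
}

# Always use binary mode ("wb"/"rb") for pickle files; a context manager
# guarantees the file is flushed and closed before the process exits.
with open('all_df.p', 'wb') as f:
    pickle.dump(all_df, f)

with open('all_df.p', 'rb') as f:
    restored = pickle.load(f)
```

After this, `restored['method1']['meas1']` is a DataFrame equal to the original.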
One of the methods described here states that we can use HDF5 (PyTables) to do the job. From the pandas docs:
HDFStore is a dict-like object which reads and writes pandas
But it seems to be picky about the tables
version that you use. I got mine to work after a pip install --upgrade tables
and a runtime restart.
If you need an overall idea of how to use it:

# consider all_df as a flat dict of DataFrames
with pd.HDFStore('df_store.h5') as df_store:
    for i in all_df.keys():
        df_store[i] = all_df[i]
You should now have a df_store.h5
file that you can convert back using the reverse process:
new_all_df = dict()
with pd.HDFStore('df_store.h5') as df_store:
    for i in df_store.keys():
        new_all_df[i] = df_store[i]
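Note that the snippets above assume a flat dict of DataFrames. For the nested dict in the question, one option (a sketch, with hypothetical variable names) is to flatten the structure into hierarchical HDFStore keys like 'method1/meas1'. Also be aware that HDFStore.keys() returns keys with a leading '/', so you need to strip it when rebuilding the dict:

```python
import pandas as pd

# Hypothetical nested dict mirroring the question's structure
all_df = {
    'method1': {'meas1': pd.DataFrame({'config1': [0.1], 'config2': [0.2]})},
    'method2': {'meas1': pd.DataFrame({'config1': [0.3], 'config2': [0.4]})},
}

# Store each DataFrame under a hierarchical key: 'method/meas'
with pd.HDFStore('df_store_nested.h5') as df_store:
    for method, measures in all_df.items():
        for meas, df in measures.items():
            df_store[f'{method}/{meas}'] = df

# Rebuild the nested dict; keys() yields e.g. '/method1/meas1'
new_all_df = {}
with pd.HDFStore('df_store_nested.h5') as df_store:
    for key in df_store.keys():
        method, meas = key.strip('/').split('/')
        new_all_df.setdefault(method, {})[meas] = df_store[key]
```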