I have been using scipy.io to save my structured data (lists and dictionaries filled with ndarrays of different shapes). Since the v7.3 .mat format is going to replace the old v7 format some day, I am thinking about switching to HDF5 to store my data, more specifically h5py for Python. However, I noticed that I cannot save my dictionaries as easily as:
import scipy.io as sio
data = {'data': 'Complicated structure data'}
sio.savemat('fileName.mat', data)
Instead, I have to call h5py's create_group one by one to replicate the structure of the Python dictionary. For very large structures, this is infeasible. Is there an easy way to automatically convert Python dictionaries to HDF5 groups?
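For reference, doing this by hand amounts to a recursive walk over the dictionary, creating a group per sub-dict and a dataset per array, something like the sketch below (save_dict_to_hdf5 is a hypothetical helper name, not an h5py API), which is exactly the boilerplate I would like to avoid writing and maintaining:
import h5py
import numpy as np

def save_dict_to_hdf5(group, data):
    # Hypothetical helper: mirror a nested dict as HDF5 groups/datasets.
    for key, value in data.items():
        if isinstance(value, dict):
            save_dict_to_hdf5(group.create_group(key), value)
        else:
            group.create_dataset(key, data=np.asarray(value))

with h5py.File('fileName.h5', 'w') as f:
    save_dict_to_hdf5(f, {'stats': {'mean': np.zeros((3, 3))}})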
Thank you!
-Shawn
The h5py package is a Pythonic interface to the HDF5 binary data format. It lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays.
Within one HDF5 file, you can organize data the same way you might organize files and folders on your computer. However, in an HDF5 file, what we call "directories" or "folders" on our computers are called groups, and what we call files are called datasets.
Creating HDF5 files
The first step to creating an HDF5 file is to initialise it. The syntax is very similar to opening an ordinary text file in Python: the first argument gives the filename and location, the second the mode. We're writing the file, so we pass 'w' for write access.
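For instance, here is a minimal sketch tying file, groups, and datasets together (the file name example.h5 and the group/dataset names are arbitrary):
import h5py
import numpy as np

# 'w' creates a new file (overwriting any existing one).
with h5py.File('example.h5', 'w') as f:
    grp = f.create_group('measurements')             # a group acts like a folder
    grp.create_dataset('temps', data=np.arange(10))  # a dataset acts like a file

# Reopen read-only and slice the dataset like a NumPy array.
with h5py.File('example.h5', 'r') as f:
    print(f['measurements/temps'][2:5])  # [2 3 4]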
I needed to do this kind of thing all the time, and decided it would be neat to make an HDF5 version of pickle: https://github.com/telegraphic/hickle
The motivation was storing Python dictionaries of NumPy arrays, which sounds like what you're after:
import hickle as hkl
import numpy as np

# A nested dictionary of NumPy arrays, written to disk in one call
data = {
    'dataset1': np.zeros((100, 100)),
    'dataset2': np.random.random((100, 100))
}
hkl.dump(data, 'output_filename.hkl')
You should be able to install it via PyPI (pip install hickle), or download it from GitHub.
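Reading the data back is symmetric; hkl.load returns the original structure (a quick sketch reusing the filename from above):
import hickle as hkl

data = hkl.load('output_filename.hkl')  # nested dict of NumPy arrays
print(data['dataset1'].shape)           # (100, 100)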
Cheers,
Danny