Does h5py read the whole file into the memory?
If so, what if I have a very very big file?
If not, will it be quite slow if I read from the hard disk every time I want a single value? How can I make it faster?
The Hierarchical Data Format version 5 (HDF5) is an open source file format that supports large, complex, heterogeneous data. HDF5 uses a "file directory"-like structure that allows you to organize data within the file in many different ways, much as you would organize files on your computer.
File-size bloat is often due to the chunk layout: the smaller your chunks, the more overhead they add to the HDF5 file. Try to find a balance between chunk sizes small enough to serve your access pattern and the size overhead each chunk introduces in the file.
The h5py package is a Pythonic interface to the HDF5 binary data format. HDF5 lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays.
Reading HDF5 files: to open and read data, we use the same File constructor, this time in read mode, "r". To see what data is in the file, we can call the keys() method on the file object. We can then grab each dataset we created above using the get method, specifying its name. This returns an HDF5 dataset object.
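The workflow above can be sketched like this (the file name "example.h5" and the dataset name "temperature" are made up for illustration):

```python
import h5py
import numpy as np

# Create a small example file first, so there is something to read.
with h5py.File("example.h5", "w") as f:
    f.create_dataset("temperature", data=np.arange(10.0))

# Open the same file in read mode and inspect its contents.
with h5py.File("example.h5", "r") as f:
    print(list(f.keys()))        # dataset names at the root of the file
    dset = f.get("temperature")  # equivalent to f["temperature"]
    print(dset.shape, dset.dtype)
```

Note that `dset` here is an HDF5 dataset object, not a NumPy array — no data has been read from disk yet.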
Does h5py read the whole file into the memory?
No, it does not. In particular, slicing (`dataset[50:100]`) allows you to load fractions of a dataset into memory. For details, see the h5py docs.
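A minimal sketch of that slicing behavior (file and dataset names are invented for the example):

```python
import h5py
import numpy as np

# Write a moderately large dataset to disk.
with h5py.File("big.h5", "w") as f:
    f.create_dataset("data", data=np.arange(1_000_000))

# Only the requested slice is read from disk, not the whole dataset.
with h5py.File("big.h5", "r") as f:
    chunk = f["data"][50:100]
    print(type(chunk))  # numpy.ndarray with just 50 elements
```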
If not, will it be quite slow if I read from the hard disk every time I want a single value?
In general, HDF5 is very fast. But reading from memory is obviously faster than reading from disk. It's your decision how much of a dataset is read into memory (`dataset[:]` loads the whole dataset).
How can I make it faster?
If you care about performance, you should read the h5py documentation's sections on chunking and compression. There's also a book that explains these things in detail (disclaimer: I'm not the author).
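As a taste of those options, a dataset can be created with an explicit chunk shape and a compression filter; the chunk shape is a tuning knob you match to your access pattern (the names and sizes below are arbitrary):

```python
import h5py
import numpy as np

with h5py.File("chunked.h5", "w") as f:
    # Chunked, gzip-compressed dataset: data is stored and read
    # in 100x100 blocks rather than as one contiguous blob.
    dset = f.create_dataset(
        "matrix",
        shape=(1000, 1000),
        dtype="f8",
        chunks=(100, 100),
        compression="gzip",
    )
    dset[:] = np.random.rand(1000, 1000)

with h5py.File("chunked.h5", "r") as f:
    print(f["matrix"].chunks)       # (100, 100)
    print(f["matrix"].compression)  # gzip
```

Reading one 100x100 block then touches a single chunk on disk; reading a full row touches ten. Choosing chunks aligned with how you slice is what makes repeated partial reads fast.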