
Can h5py load a file from a byte array in memory?

Tags: hdf5, h5py

My Python code receives a byte array that contains the complete contents of an HDF5 file.

I'd like to read this byte array into an in-memory h5py file object without first writing it to disk. This page says that I can open a memory-mapped file, but it would be a new, empty file. I want to go from byte array to in-memory HDF5 file, use it, and discard it, without writing to disk at any point.

Is it possible to do this with h5py? (Or with HDF5 using C, if that is the only way?)

mahonya asked May 20 '13 16:05




1 Answer

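You can wrap the bytes in an io.BytesIO object and pass that file-like object to h5py.File (support for Python file-like objects was added in h5py 2.9):

import io
import h5py

f = io.BytesIO(YOUR_H5PY_STREAM)  # YOUR_H5PY_STREAM: the raw bytes of the HDF5 file
h = h5py.File(f, 'r')             # opened read-only; nothing is written to disk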
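
To check the round trip end to end, here is a minimal sketch that assumes h5py 2.9 or later and NumPy are installed. The helper names make_hdf5_bytes and read_values and the dataset name "values" are made up for illustration; in the question's scenario the bytes would come from wherever the code receives them, not from a helper like this.

```python
import io

import h5py
import numpy as np


def make_hdf5_bytes() -> bytes:
    """Hypothetical helper: build a small HDF5 file in memory and return its bytes."""
    buf = io.BytesIO()
    with h5py.File(buf, "w") as f:
        # "values" is an arbitrary dataset name chosen for this example.
        f.create_dataset("values", data=np.arange(5))
    return buf.getvalue()


def read_values(raw: bytes) -> np.ndarray:
    """Hypothetical helper: open HDF5 bytes via BytesIO and copy one dataset out."""
    with h5py.File(io.BytesIO(raw), "r") as f:
        # [()] reads the whole dataset into a NumPy array, so the data
        # remains usable after the file object is closed.
        return f["values"][()]


raw = make_hdf5_bytes()    # stands in for the byte array received by the code
values = read_values(raw)  # no file on disk is created at any point
```

Copying the data out before the with block exits matters: dataset objects become invalid once the h5py.File is closed, while the NumPy array returned by [()] does not.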
Ümit answered Oct 26 '22 18:10