
How to dereference HDF5 references in Python?

Tags:

python

hdf5

Sometimes I get the following arrays from my HDF5 file:

val1 = {ndarray} [<HDF5 object reference> <HDF5 object reference> <HDF5 object reference>]

If I try to dereference an element of it with the HDF5 file object

f[val1[0]]

I get an error

Argument 'ref' has incorrect type (expected h5py.h5r.Reference, got numpy.object_)

Suzan Cioc asked Feb 18 '16 19:02


1 Answer

I've come across this question while trying to answer what turned out to be basically the same question in another form. A dataset containing references to other objects is a bit of an awkward situation in HDF5, but you can actually read them in a pretty straightforward way. The idea is to get the name of the referenced object, and then just read that object directly from the file.

Given a single HDF5 reference, ref, and a file, file, you can return the name of the referenced dataset by doing:

>>> name = h5py.h5r.get_name(ref, file.id)

Then just read the actual dataset itself, as usual:

>>> data = file[name][()] # ndarray with the data in it (the older .value attribute was removed in h5py 3.0)

So to read all the referenced datasets, just map this process across the whole dataset of references.
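
As a rough sketch of that mapping step, the snippet below builds a small throwaway file and then dereferences every entry of a reference dataset. The dataset names ("a", "b", "refs") and the helper functions build_demo_file and read_referenced are made up for illustration and are not part of h5py. It also assumes h5py 2.10 or newer, since it uses h5py.ref_dtype.

```python
import os
import tempfile

import h5py
import numpy as np


def build_demo_file(path):
    """Write two small datasets plus a dataset of object references to them.

    Hypothetical helper: the names "a", "b" and "refs" are illustrative only.
    """
    with h5py.File(path, "w") as f:
        a = f.create_dataset("a", data=np.arange(3))
        b = f.create_dataset("b", data=np.ones((2, 2)))
        refs = f.create_dataset("refs", (2,), dtype=h5py.ref_dtype)
        refs[0] = a.ref
        refs[1] = b.ref


def read_referenced(path, refs_name="refs"):
    """Dereference every entry of the reference dataset and return the arrays."""
    with h5py.File(path, "r") as f:
        ref_array = f[refs_name][()]  # object ndarray holding the references
        results = []
        for ref in ref_array.flat:
            # Look up the path of the referenced object, then read it as usual.
            name = h5py.h5r.get_name(ref, f.id)
            results.append(f[name][()])
        return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "demo.h5")
        build_demo_file(demo_path)
        for arr in read_referenced(demo_path):
            print(arr.shape, arr.dtype)
```

Iterating with .flat means the loop hands one reference at a time to get_name, whatever the shape of the reference array. h5py also accepts a reference directly as a key, so f[ref][()] should give the same data without the intermediate name lookup.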

bnaecker answered Sep 18 '22 15:09