
How to read HDF5 attributes (metadata) with Python and h5py

I have an HDF5 file with multiple folders (groups) inside. Each folder has attributes attached (some call attributes "metadata"). I know how to access the keys inside a folder, but I don't know how to read the attributes with Python's h5py package. Here are the attributes as shown in HDFView:

Folder1(800,4)
   Group size = 9
   Number of attributes = 1
        measRelTime_seconds = 201.73

I need to pull this measRelTime_seconds value. I already have a loop to read the files:

f = h5py.File(file, 'r')
for k, key in enumerate(f.keys()):  # loop over folders
    # need to obtain measRelTime_seconds here, I guess

Thanks

asked Feb 10 '21 23:02 by mrq



2 Answers

Attributes are accessed through a dictionary-like interface, much like the members of groups. Use object.attrs.keys() to iterate over the attribute names, and object.attrs[name] to read a value. The object can be a file, group or dataset.

Here is a simple example that creates 2 attributes on 3 different objects, then reads and prints them.

import h5py
import numpy as np

arr = np.random.randn(1000)

with h5py.File('groups.hdf5', 'w') as f:
    g = f.create_group('Base_Group')
    d = g.create_dataset('default', data=arr)

    # attributes on the file (root group)
    f.attrs['User'] = 'Me'
    f.attrs['OS'] = 'Windows'

    # attributes on the group
    g.attrs['Date'] = 'today'
    g.attrs['Time'] = 'now'

    # attributes on the dataset
    d.attrs['attr1'] = 1.0
    d.attrs['attr2'] = 22.2

    # read the attributes back through the object references
    for k in f.attrs.keys():
        print(f"{k} => {f.attrs[k]}")
    for k in g.attrs.keys():
        print(f"{k} => {g.attrs[k]}")
    for k in d.attrs.keys():
        print(f"{k} => {d.attrs[k]}")

    print('*****')

    # read the same attributes by looking the objects up by name
    for k in f.attrs.keys():
        print(f"{k} => {f.attrs[k]}")
    for k in f['Base_Group'].attrs.keys():
        print(f"{k} => {f['Base_Group'].attrs[k]}")
    for k in f['Base_Group']['default'].attrs.keys():
        print(f"{k} => {f['Base_Group']['default'].attrs[k]}")
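
Since attrs behaves like a dictionary, the name/value pairs can also be pulled in one go with items(). Here is a minimal sketch of that idea. The helper name read_attrs and the temporary file path are made up for illustration, and the file layout simply mirrors the example in this answer.

```python
import os
import tempfile

import h5py
import numpy as np


def read_attrs(obj):
    """Return all attributes of a file, group, or dataset as a plain dict.

    Hypothetical helper, not part of h5py.
    """
    # obj.attrs is dictionary-like, so items() yields (name, value) pairs.
    return dict(obj.attrs.items())


# Scratch file in a temporary directory (assumed path, for illustration only).
path = os.path.join(tempfile.mkdtemp(), "groups.hdf5")

# Write one group and one dataset, each with a single attribute.
with h5py.File(path, "w") as f:
    g = f.create_group("Base_Group")
    d = g.create_dataset("default", data=np.arange(10))
    g.attrs["Date"] = "today"
    d.attrs["attr1"] = 1.0

# Reopen read-only and copy the attributes out before the file closes.
with h5py.File(path, "r") as f:
    group_attrs = read_attrs(f["Base_Group"])
    dataset_attrs = read_attrs(f["Base_Group/default"])

print(group_attrs)
print(dataset_attrs)
```

The values are copied into ordinary Python/NumPy objects, so the dicts stay usable after the file is closed.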
answered Sep 16 '22 16:09 by kcw78


OK, I found my answer. You can list the names of the attributes with

f['Folder'].attrs.keys()

and the value can be returned with

f['Folder'].attrs['<name of the attribute>']
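
Put together with the loop from the question, this could look roughly like the sketch below. It first builds a small stand-in file so that it runs on its own. The file path and the group "Folder2" are assumptions for illustration; only "Folder1" and measRelTime_seconds come from the question.

```python
import os
import tempfile

import h5py

# Scratch file standing in for the real measurement file (assumed path).
path = os.path.join(tempfile.mkdtemp(), "example.hdf5")

# "Folder1" carries the attribute; "Folder2" (hypothetical) does not.
with h5py.File(path, "w") as f:
    f.create_group("Folder1").attrs["measRelTime_seconds"] = 201.73
    f.create_group("Folder2")

rel_times = {}
with h5py.File(path, "r") as f:
    for key in f.keys():  # loop over the top-level folders
        # attrs.get() returns None instead of raising KeyError
        # when a folder has no such attribute.
        value = f[key].attrs.get("measRelTime_seconds")
        if value is not None:
            rel_times[key] = float(value)

print(rel_times)
```

Using attrs.get() rather than attrs[...] keeps the loop from failing on folders that lack the attribute; plain indexing is fine when every folder is known to have it.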
answered Sep 19 '22 16:09 by mrq