
Can't access returned h5py object instance

I have a strange issue. I have two functions: one reads an HDF5 file created with h5py, and the other creates a new HDF5 file that concatenates the content returned by the first.

import h5py

def read_file(filename):
    with h5py.File(filename+".hdf5",'r') as hf:

        group1 = hf.get('group1')
        group2 = hf.get('group2')
        dataset1 = hf.get('dataset1')
        dataset2 = hf.get('dataset2')
        print(group1.attrs['w']) # Works here

        return dataset1, dataset2, group1, group2

And the function that creates the file:

def create_chunk(start_index, end_index):

    for i in range(start_index, end_index):
        if i == start_index:
            mergedhf = h5py.File("output.hdf5",'w')
            mergedhf.create_dataset("dataset1",dtype='float64')
            mergedhf.create_dataset("dataset2",dtype='float64')

            g1 = mergedhf.create_group('group1')
            g2 = mergedhf.create_group('group2')

    rd1,rd2,rg1,rg2 = read_file(filename)

    print(rg1.attrs['w']) # Gives me a <Closed HDF5 group> message

    g1.attrs['w'] = "content"
    g1.attrs['x'] = "content"
    g2.attrs['y'] = "content"
    g2.attrs['z'] = "content"
    print(g1.attrs['w']) # Works here
    return mergedhf.get('dataset1'), mergedhf.get('dataset2'), g1, g2

def calling_function():
    wd1, wd2, wg1, wg2 = create_chunk(start_index, end_index)
    print(wg1.attrs['w']) # Works here as well

The problem: I can access the datasets and groups of the newly created file (wd1, wd2, wg1 and wg2), including their attribute data, but I can't do the same for the objects that read_file read and returned.

Can anyone help me fetch the values of the datasets and groups after their references have been returned to the calling function?

Biplob Biswas asked Apr 15 '16 16:04





1 Answer

The problem is in read_file, this line:

with h5py.File(filename+".hdf5",'r') as hf:

This closes hf at the end of the with block, i.e. when read_file returns. When this happens, the datasets and groups also get closed and you can no longer access them.
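The behaviour described above can be reproduced with a small sketch. The temporary file, the `read_group` helper, and the attribute value are made up for illustration; the validity check relies on h5py objects evaluating to False once they are no longer accessible.

```python
import os
import tempfile

import h5py

# Build a small throwaway file so the example is self-contained (hypothetical data).
path = os.path.join(tempfile.mkdtemp(), "demo.hdf5")
with h5py.File(path, "w") as hf:
    hf.create_group("group1").attrs["w"] = "content"


def read_group(path):
    # Hypothetical helper mirroring read_file: returns a group from inside a with block.
    with h5py.File(path, "r") as hf:
        group1 = hf["group1"]
        inside_ok = bool(group1)  # the file is still open at this point
        return group1, inside_ok


group1, inside_ok = read_group(path)
print(inside_ok)     # validity while the file was open
print(bool(group1))  # validity after the with block closed the file
print(repr(group1))  # the question reports a "<Closed HDF5 group>" message here
```

The returned Python object still exists, but the HDF5 handle behind it was released when the file was closed, so any attribute or data access on it fails.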

There are (at least) two ways to fix this. Firstly, you can open the file like you do in create_chunk:

hf = h5py.File(filename+".hdf5", 'r')

and keep the reference to hf around as long as you need it, before closing it:

hf.close()
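A minimal sketch of this first approach, assuming the caller takes responsibility for closing the file. The `open_file` helper, the temporary file, and its contents are hypothetical and only there to make the example runnable.

```python
import os
import tempfile

import h5py

# Throwaway input file for the example (hypothetical contents).
path = os.path.join(tempfile.mkdtemp(), "demo.hdf5")
with h5py.File(path, "w") as hf:
    hf.create_dataset("dataset1", data=[1.0, 2.0, 3.0])
    hf.create_group("group1").attrs["w"] = "content"


def open_file(path):
    # Return the file handle together with the objects so the caller
    # decides when the file gets closed.
    hf = h5py.File(path, "r")
    return hf, hf["dataset1"], hf["group1"]


hf, dataset1, group1 = open_file(path)
try:
    # The file is still open, so both objects are usable here.
    w = group1.attrs["w"]
    total = float(dataset1[:].sum())
finally:
    hf.close()  # close only after the last access

print(w, total)
```

Wrapping the accesses in try/finally is one way to make sure the file is closed even if something fails in between.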

The other way is to copy the data from the datasets in read_file and return those instead:

dataset1 = hf.get('dataset1')[:]
dataset2 = hf.get('dataset2')[:]

Note that you can't do this with the groups. The file needs to be open for as long as you need to do things with the groups.
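A sketch of the second approach. It also shows one possible workaround for groups, under the assumption that only their attributes are needed: the attributes can be copied into a plain dict while the file is open. The file contents and the `read_file_copy` name are made up for illustration.

```python
import os
import tempfile

import h5py

# Throwaway input file for the example (hypothetical contents).
path = os.path.join(tempfile.mkdtemp(), "demo.hdf5")
with h5py.File(path, "w") as hf:
    hf.create_dataset("dataset1", data=[1.0, 2.0, 3.0])
    hf.create_group("group1").attrs["w"] = "content"


def read_file_copy(path):
    with h5py.File(path, "r") as hf:
        # [:] reads the whole dataset into an in-memory NumPy array.
        dataset1 = hf["dataset1"][:]
        # Assumption: only the group's attributes are needed, so copy them
        # into an ordinary dict that does not depend on the open file.
        group1_attrs = dict(hf["group1"].attrs)
    return dataset1, group1_attrs


dataset1, group1_attrs = read_file_copy(path)
print(dataset1)
print(group1_attrs["w"])
```

Both return values are ordinary in-memory objects, so they stay usable after the file is closed. This loads the full dataset into memory, which may not be practical for very large files.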

Yossarian answered Sep 22 '22 13:09
