I have a very weird issue here. I have two functions: one that reads an HDF5 file created with h5py, and one that creates a new HDF5 file by concatenating the content returned by the first function.
    import h5py

    def read_file(filename):
        with h5py.File(filename + ".hdf5", 'r') as hf:
            group1 = hf.get('group1')
            group2 = hf.get('group2')
            dataset1 = hf.get('dataset1')
            dataset2 = hf.get('dataset2')
            print(group1.attrs['w'])  # Works here
            return dataset1, dataset2, group1, group2
And the function that creates the file:
    def create_chunk(start_index, end_index):
        for i in range(start_index, end_index):
            if i == start_index:
                # Create the output file, datasets and groups once, on the first iteration
                mergedhf = h5py.File("output.hdf5", 'w')
                mergedhf.create_dataset("dataset1", dtype='float64')
                mergedhf.create_dataset("dataset2", dtype='float64')
                g1 = mergedhf.create_group('group1')
                g2 = mergedhf.create_group('group2')
            rd1, rd2, rg1, rg2 = read_file(filename)
            print(rg1.attrs['w'])  # Gives me <Closed HDF5 group> message
            g1.attrs['w'] = "content"
            g1.attrs['x'] = "content"
            g2.attrs['y'] = "content"
            g2.attrs['z'] = "content"
        print(g1.attrs['w'])  # Works here
        return mergedhf.get('dataset1'), mergedhf.get('dataset2'), g1, g2
    def calling_function():
        wd1, wd2, wg1, wg2 = create_chunk(start_index, end_index)
        print(wg1.attrs['w'])  # Works here as well
Now the problem is: I can access the datasets and groups of the newly created file, represented by wd1, wd2, wg1 and wg2, including their attribute data, but I can't do the same for the objects that read_file read and returned. Can anyone help me fetch the values of the datasets and groups when I have returned references to them to the calling function?
The problem is in read_file, this line:

    with h5py.File(filename + ".hdf5", 'r') as hf:

This closes hf at the end of the with block, i.e. when read_file returns. When this happens, the datasets and groups also get closed and you can no longer access them.
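Here is a minimal sketch of that effect (the file name test.hdf5 and its contents are made up for illustration):

    import h5py

    # Create a throwaway file with one group carrying an attribute
    with h5py.File("test.hdf5", "w") as hf:
        hf.create_group("group1").attrs["w"] = "content"

    with h5py.File("test.hdf5", "r") as hf:
        g = hf.get("group1")
        print(g.attrs["w"])  # works: the file is still open

    print(g)  # prints <Closed HDF5 group>; the handle has outlived the file
    # g.attrs["w"] here would raise an error, since the underlying file is closed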
There are (at least) two ways to fix this. Firstly, you can open the file like you do in create_chunk:

    hf = h5py.File(filename + ".hdf5", 'r')

and keep the reference to hf around as long as you need it, before closing it:

    hf.close()
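A sketch of how that could look (returning the open hf handle alongside the data is my addition, not your original signature):

    def read_file(filename):
        hf = h5py.File(filename + ".hdf5", 'r')  # no 'with', so the file stays open
        dataset1 = hf.get('dataset1')
        dataset2 = hf.get('dataset2')
        group1 = hf.get('group1')
        group2 = hf.get('group2')
        return hf, dataset1, dataset2, group1, group2

    hf, rd1, rd2, rg1, rg2 = read_file(filename)
    print(rg1.attrs['w'])  # works: the file is still open
    hf.close()             # close only once you are done with all the handles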
The other way is to copy the data out of the datasets in read_file and return the copies instead:

    dataset1 = hf.get('dataset1')[:]
    dataset2 = hf.get('dataset2')[:]

Note that you can't do this with the groups. The file needs to stay open for as long as you need to do things with the groups.
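If what you need from the groups is just their attributes, you can copy those out too; here is a sketch of read_file along those lines (returning plain dicts in place of the groups is my workaround, not a way to keep the group objects alive):

    import h5py

    def read_file(filename):
        with h5py.File(filename + ".hdf5", 'r') as hf:
            dataset1 = hf.get('dataset1')[:]  # [:] copies into an in-memory numpy array
            dataset2 = hf.get('dataset2')[:]
            group1_attrs = dict(hf.get('group1').attrs)  # plain dicts survive the close
            group2_attrs = dict(hf.get('group2').attrs)
        return dataset1, dataset2, group1_attrs, group2_attrs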