
How to write hdf5 files without overwriting?

Sorry if this is a very basic question on h5py.

I was reading the documentation, but I didn't find a similar example.

I'm trying to create multiple HDF5 datasets with Python, but it turns out that after I close the file, the existing data is overwritten the next time I write to it.

Let's say I do the following:

import numpy as np
import h5py
f = h5py.File('test.hdf5', 'w')
f.create_dataset('data1', data = np.ones(10))
f.close()
f = h5py.File('test.hdf5', 'w')
f.create_dataset('data0', data = np.zeros(10))
f.close()
f = h5py.File('test.hdf5', 'r')
f["data1"][()]  # read the whole dataset; the older .value attribute was removed in h5py 3.0
f.close()

I get

KeyError: "Unable to open object (Object 'data1' doesn't exist)"

Appending works, but it requires opening the file in 'w' mode the first time and in 'a' mode afterwards, so I need two different statements.

import numpy as np
import h5py
f = h5py.File('test.hdf5', 'w')
f.create_dataset('data1', data = np.ones(10))
f.close()
f = h5py.File('test.hdf5', 'a')
f.create_dataset('data0', data = np.zeros(10))
f.close()
f = h5py.File('test.hdf5', 'r')
f["data1"][()]  # read the whole dataset; the older .value attribute was removed in h5py 3.0
f.close()

If I open the file in 'a' mode in both cases:

import numpy as np
import h5py
f = h5py.File('test.hdf5', 'a')
f.create_dataset('data1', data = np.ones(10))
f.close()
f = h5py.File('test.hdf5', 'a')
f.create_dataset('data0', data = np.zeros(10))
f.close()
f = h5py.File('test.hdf5', 'r')
print(f['data1'][()])  # read the whole dataset; the older .value attribute was removed in h5py 3.0
f.close()

I get

RuntimeError: Unable to create link (Name already exists)

According to the documentation, data should be stored contiguously, but I didn't find how to avoid overwriting data.

How can I store data in a previously closed HDF5 file using one single statement?

How are HDF5 files structured?

HDF5 files are organized in a hierarchical structure, with two primary structures: groups and datasets. HDF5 group: a grouping structure containing instances of zero or more groups or datasets, together with supporting metadata. HDF5 dataset: a multidimensional array of data elements, together with supporting metadata.
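To make the group/dataset hierarchy concrete, here is a minimal sketch using h5py. The file name, the group names ("measurements", "run_01"), the dataset name and the attribute are made up for illustration, and the file is written to a temporary directory.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file name, written to a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "structure_demo.hdf5")

with h5py.File(path, "w") as f:
    grp = f.create_group("measurements")   # a group works like a folder
    run = grp.create_group("run_01")       # groups can contain other groups
    run.create_dataset("signal", data=np.arange(6).reshape(2, 3))  # a 2x3 array
    run.attrs["operator"] = "example"      # supporting metadata on the group

with h5py.File(path, "r") as f:
    names = []
    f.visit(names.append)                  # collect every group/dataset path
    shape = f["measurements/run_01/signal"].shape

print(names)
print(shape)
```

Objects are addressed with slash-separated paths, much like files in a directory tree.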

Why is HDF5 file so large?

This is probably due to your chunk layout: the smaller the chunks, the more bloated your HDF5 file will be. Try to find a balance between a chunk size that suits your use case and the size overhead that the chunks introduce in the HDF5 file.
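One way to see the effect for your own data is to write the same array with two chunk shapes and compare the resulting file sizes. This is only a sketch: the array, the chunk shapes and the file names are arbitrary choices, and the sizes you get depend on your HDF5 version and data, so none are quoted here.

```python
import os
import tempfile

import h5py
import numpy as np

out_dir = tempfile.mkdtemp()       # throwaway directory for the demo files
data = np.random.rand(200, 200)    # arbitrary example data


def write_chunked(name, chunks):
    """Write `data` to its own file with the given chunk shape.

    Returns the chunk shape h5py reports and the file size in bytes.
    """
    path = os.path.join(out_dir, name)
    with h5py.File(path, "w") as f:
        dset = f.create_dataset("data", data=data, chunks=chunks)
        used = dset.chunks
    return used, os.path.getsize(path)


small_chunks, small_size = write_chunked("small_chunks.hdf5", (5, 5))
large_chunks, large_size = write_chunked("large_chunks.hdf5", (100, 100))

print("chunks", small_chunks, "->", small_size, "bytes")
print("chunks", large_chunks, "->", large_size, "bytes")
```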

Can HDF5 store strings?

Yes. HDF5 supports two string encodings: ASCII and UTF-8.
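As a rough illustration with h5py, variable-length UTF-8 strings can be stored by passing a string dtype. The dataset name "labels" and the sample strings are made up, and this assumes h5py 2.10 or later, where h5py.string_dtype is available. Depending on the h5py version, the values read back are either str or bytes, so the sketch handles both.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "strings_demo.hdf5")

# Variable-length UTF-8 string type.
utf8 = h5py.string_dtype(encoding="utf-8")

with h5py.File(path, "w") as f:
    values = np.array(["alpha", "β-decay"], dtype=object)
    f.create_dataset("labels", data=values, dtype=utf8)

with h5py.File(path, "r") as f:
    raw = f["labels"][()]
    # Decode only if this h5py version hands back bytes objects.
    labels = [x.decode("utf-8") if isinstance(x, bytes) else x for x in raw]

print(labels)
```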

If you want to create a unique file on each run, give the file a unique name, for example by adding a timestamp to it. A very simple way is to use the datetime module with its now and strftime methods to build the file name. Example -

import datetime
filename = "test_{}.hdf5".format(datetime.datetime.now().strftime("%Y_%m_%d_%H_%M_%S"))

Then you can use that filename to open the file.
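Putting the two steps together might look like the sketch below. The helper timestamped_name and the temporary output directory are my own additions for illustration. The file is opened in 'w-' mode, which creates it and fails if it already exists, so a name collision cannot silently overwrite an earlier run.

```python
import datetime
import os
import tempfile

import h5py
import numpy as np


def timestamped_name(prefix="test", now=None):
    """Hypothetical helper: build '<prefix>_<timestamp>.hdf5'."""
    now = now or datetime.datetime.now()
    return "{}_{}.hdf5".format(prefix, now.strftime("%Y_%m_%d_%H_%M_%S"))


out_dir = tempfile.mkdtemp()  # assumption: write into a throwaway directory
filename = os.path.join(out_dir, timestamped_name())

# 'w-' creates the file and raises an error if it already exists.
with h5py.File(filename, "w-") as f:
    f.create_dataset("data1", data=np.ones(10))

print(filename)
```

Note that the timestamp has one-second resolution, so two runs started within the same second would produce the same name.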

Demo -

>>> import datetime
>>> filename = "test_{}.hdf5".format(datetime.datetime.now().strftime("%Y_%m_%d_%H_%M_%S"))
>>> filename
'test_2015_08_09_13_33_43.hdf5'
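If you would rather keep a single file, 'a' mode (read/write if the file exists, create it otherwise) can be used for every write, as long as you don't try to create a dataset name that is already there, which is what triggers the "Name already exists" error in the question. A minimal sketch, where add_dataset is a hypothetical helper and skipping existing names is just one possible policy:

```python
import os
import tempfile

import h5py
import numpy as np


def add_dataset(path, name, data):
    """Hypothetical helper: add `name` to the file unless it already exists.

    Returns True if the dataset was created, False if it was left untouched.
    """
    with h5py.File(path, "a") as f:  # 'a' keeps whatever is already in the file
        if name in f:
            return False
        f.create_dataset(name, data=data)
        return True


path = os.path.join(tempfile.mkdtemp(), "test.hdf5")

add_dataset(path, "data1", np.ones(10))
add_dataset(path, "data0", np.zeros(10))
again = add_dataset(path, "data1", np.ones(10))  # existing name, so nothing is written

with h5py.File(path, "r") as f:
    stored = sorted(f.keys())
    data1 = f["data1"][()]  # read the whole dataset into a NumPy array

print(stored, again)
```

The same statement then works on the first run and on every later run.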

Tags: python, h5py

ilciavo

1 Answer

Anand S Kumar
