I run the same Python program concurrently as different processes, and these all want to write to the same hdf5 file, using the h5py Python package. However, only a single process may open a given hdf5 file in write mode, otherwise you will get the error:
OSError: Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
During handling of the above exception, another exception occurred:
OSError: Unable to create file (unable to open file: name = 'test.hdf5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)
I want to resolve this by checking whether the file is already opened in write mode, and if so, wait a bit and check again, until it is no longer opened in write mode. I have not found any such checking capability in h5py or hdf5. As of now, my solution is based on this:
from time import sleep
import h5py

# Function handling the intelligent hdf5 file opening
def open_hdf5(filename, *args, **kwargs):
    while True:
        try:
            hdf5_file = h5py.File(filename, *args, **kwargs)
            break  # Success!
        except OSError:
            sleep(5)  # Wait a bit
    return hdf5_file

# How to use the function
with open_hdf5(filename, mode='a') as hdf5_file:
    # Do stuff
    ...
I'm unsure whether I like this, as it doesn't seem very gentle. Is there a better way of doing this? Is there any chance that my erroneous attempts to open the file inside the try can somehow corrupt the write process that is going on in the other process?
Judging by some quick research, there is no platform-independent way of checking whether a file is already open in write mode. See "How to check whether a file is_open and the open_status in python": https://bytes.com/topic/python/answers/612924-how-check-whether-file-open-not
However, since you have defined a wrapper for opening your hdf5 file for reading and writing, you can always create a "file_name".lock file when one process has succeeded in opening the hdf5 file, and remove it again when that process closes the file.
Then all you have to do is use os.path.exists('"file_name".lock') to know whether you can open the file in write mode.
Essentially, this is not very different from what you already do. However, first, you can simply look in your filesystem to see whether one of your processes has the file open in write mode; second, the test is not the product of an exception, since os.path.exists returns a boolean.
Many applications use this kind of trick. When roaming through a CVS repo you often see .lock files lying around...
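One caveat with the lock-file idea: if a process first checks os.path.exists and only then creates the lock file, two processes can both pass the check before either one has created it. A minimal sketch of one way around that, assuming the lock file lives on a local filesystem, is to let the operating system do the check and the creation in a single step with os.open and the O_CREAT | O_EXCL flags. The helper name lock_file and its poll_interval and timeout parameters are made up for illustration; they are not part of h5py or hdf5.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def lock_file(lock_path, poll_interval=0.5, timeout=None):
    """Hold an exclusive lock represented by the file lock_path.

    Hypothetical helper for illustration only, not part of h5py.
    """
    start = time.monotonic()
    while True:
        try:
            # O_CREAT | O_EXCL makes the call fail if the file already
            # exists, so "check" and "create" happen as one atomic step.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            # Another process holds the lock: give up or wait and retry.
            if timeout is not None and time.monotonic() - start >= timeout:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_interval)
    try:
        os.close(fd)
        yield
    finally:
        # Release the lock so that waiting processes can proceed.
        os.remove(lock_path)
```

The h5py.File call would then go inside a with lock_file(filename + '.lock'): block, so only the process holding the lock ever tries to open the hdf5 file. Note that if a process is killed while holding the lock, the .lock file stays behind and has to be removed by hand, which is probably why you see stale ones lying around.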