
Opening already opened hdf5 file in write mode, using h5py

I run the same Python program concurrently as different processes, and these all want to write to the same hdf5 file, using the h5py Python package. However, only a single process may open a given hdf5 file in write mode, otherwise you will get the error

OSError: Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')

During handling of the above exception, another exception occurred:

OSError: Unable to create file (unable to open file: name = 'test.hdf5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)

I want to resolve this by checking whether the file is already opened in write mode, and if so, wait a bit and check again, until it is no longer opened in write mode. I have not found any such checking capability of h5py or hdf5. As of now, my solution is based on this:

from time import sleep
import h5py

# Function handling the intelligent hdf5 file opening
def open_hdf5(filename, *args, **kwargs):
    while True:
        try:
            hdf5_file = h5py.File(filename, *args, **kwargs)
            break  # Success!
        except OSError:
            sleep(5)  # Wait a bit
    return hdf5_file

# How to use the function
with open_hdf5(filename, mode='a') as hdf5_file:
    # Do stuff
    ...

I'm unsure whether I like this, as it doesn't seem very gentle. Is there a better way of doing this? Is there any chance that my failed attempts to open the file inside the try block could somehow corrupt the write that is going on in the other process?

jmd_dk asked Mar 22 '18 21:03



1 Answer

Judging by some quick research, there is no platform-independent way of checking whether a file is already open in write mode. See "How to check whether a file is_open and the open_status in python" and https://bytes.com/topic/python/answers/612924-how-check-whether-file-open-not

However, since you have already defined a wrapper for opening your hdf5 file, you can create a "file_name".lock file whenever a process succeeds in opening the hdf5 file, and delete it again when that process closes the file.

Then all you have to do is call os.path.exists(file_name + '.lock') to know whether you can open the file in write mode.

Essentially, this is not very different from what you already do. However, first, you can simply look in your filesystem to see whether one of your processes has the file open in write mode; second, the test does not rely on an exception, since os.path.exists returns a boolean.
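
A minimal sketch of the lock-file idea, under some assumptions: lock_file, poll_interval and timeout are hypothetical names of my own, not part of h5py or HDF5. Note that it swaps the os.path.exists check for os.open with O_CREAT | O_EXCL. With a separate "check, then create" there is a small window in which two processes can both see no lock file and both go on to create one; creating the file exclusively does the check and the creation in one step. This is meant for a local filesystem, and a process that crashes while holding the lock will leave a stale .lock file behind that has to be removed by hand.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def lock_file(filename, poll_interval=5.0, timeout=None):
    """Hold '<filename>.lock' for the duration of the with-block.

    Hypothetical helper for illustration; not an h5py or HDF5 API.
    """
    lock_path = filename + ".lock"
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL raises FileExistsError if the lock file is
            # already there, so "check" and "create" are a single operation.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            # Another process holds the lock: give up or wait and retry.
            if deadline is not None and time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_interval)
    try:
        os.close(fd)  # The file's existence is the lock; no content needed.
        yield lock_path
    finally:
        # Release the lock even if the body of the with-block raised.
        os.remove(lock_path)
```

With this in place, the write would be wrapped as "with lock_file(filename):" around the existing "with h5py.File(filename, mode='a') as hdf5_file:" block, so that h5py only ever tries to open the file once the lock is held.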

Many applications use this kind of trick. When browsing a CVS repo you often see .lock files lying around...

PilouPili answered Sep 18 '22 16:09
