[I apologize for the inept title; I could not come up with anything better. Suggestions for a better title are welcome.]
I want to implement an interface to HDF5 files that supports multiprocess-level concurrency through file-locking. The intended environment for this module is a Linux cluster accessing a shared disk over NFS. The goal is to enable the concurrent access (over NFS) to the same file by multiple parallel processes running on several different hosts.
I would like to be able to implement the locking functionality through a wrapper class for the h5py.File class. (h5py already offers support for thread-level concurrency, but the underlying HDF5 library is not thread-safe.)
It would be great if I could do something in the spirit of this:
    class LockedH5File(object):
        def __init__(self, path, ...):
            ...
            with h5py.File(path, 'r+') as h5handle:
                fcntl.flock(fcntl.LOCK_EX)
                yield h5handle
            # (method returns)
I realize that the above code is wrong, but I hope it conveys the main idea: namely, to have the expression LockedH5File('/path/to/file') deliver an open handle to the client code, which can then perform various arbitrary read/write operations on it. When this handle goes out of scope, its destructor closes the handle, thereby releasing the lock.
The goal that motivates this arrangement is two-fold:
1. decouple the creation of the handle (by the library code) from the operations that are subsequently requested on the handle (by the client code), and
2. ensure that the handle is closed and the lock released, no matter what happens during the execution of the intervening code (e.g. exceptions, unhandled signals, Python internal errors).
How can I achieve this effect in Python?
Thanks!
The with statement in Python is used for resource management and exception handling, and it makes such code cleaner and more readable. You will most often see it with file streams: it guarantees that the file is closed properly when the block exits, even if an exception is raised, so an open file is not left lingering where it could get in the way of other processes.
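As a small illustration of that guarantee, here is a sketch using an ordinary temporary file rather than HDF5. The helper name write_then_fail and the file name are made up for the example.

```python
import os
import tempfile


def write_then_fail(path):
    """Open path, write to it, then raise inside the with block."""
    try:
        with open(path, 'w') as f:
            f.write('partial')
            raise RuntimeError('simulated failure')
    except RuntimeError:
        # The with statement has already closed f by the time we get here.
        return f


path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
f = write_then_fail(path)
print(f.closed)  # True
```

The file is closed even though the block was left through an exception, which is the behaviour the question is asking for.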
Objects that can be used in with statements are called context managers, and they implement a simple interface. They must provide two methods: an __enter__ method, which takes no arguments and may return anything (the return value is assigned to the variable in the as portion), and an __exit__ method, which takes three arguments (filled in with the result of sys.exc_info(), or three None values if no exception occurred) and returns a true value to indicate that the exception was handled and should be suppressed. Your example will probably look like:
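    import fcntl
    import h5py

    class LockedH5File(object):
        def __init__(self, path):  # further parameters elided in the original
            self.path = path

        def __enter__(self):
            # flock() needs an open file; a sidecar lock file is assumed here
            self.lockfile = open(self.path + '.lock', 'w')
            fcntl.flock(self.lockfile, fcntl.LOCK_EX)  # blocks until acquired
            self.h5handle = h5py.File(self.path, 'r+')
            return self.h5handle

        def __exit__(self, exc_type, exc_value, exc_tb):
            self.h5handle.close()
            self.lockfile.close()  # closing the lock file releases the lock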
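To see what the with statement actually passes to these two methods, here is a minimal sketch that does no file handling at all. The class Traced is hypothetical and exists only to record the calls.

```python
class Traced(object):
    """Hypothetical context manager that records what the with statement does."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self  # bound to the name after "as", if there is one

    def __exit__(self, exc_type, exc_value, exc_tb):
        # exc_type is None when the block finished without an exception
        self.events.append('exit: %s' % (exc_type.__name__ if exc_type else 'clean'))
        return False  # falsy: any exception keeps propagating


tracer = Traced()
try:
    with tracer:
        raise ValueError('simulated failure in client code')
except ValueError:
    pass

print(tracer.events)  # ['enter', 'exit: ValueError']
```

Because __exit__ returns a false value, the ValueError still reaches the caller. The cleanup runs either way, which is usually what you want for a lock wrapper.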
To make this work, your class needs to implement the context manager protocol. Alternatively, write a generator function using the contextlib.contextmanager decorator.
Your class might roughly look like this (the details of h5py usage are probably horribly wrong):
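    import fcntl
    import h5py

    class LockedH5File(object):
        def __init__(self, path):  # further parameters elided in the original
            self.path = path
        def __enter__(self):
            # Lock first, then open; a sidecar lock file is assumed here.
            self.lockfile = open(self.path + '.lock', 'w')
            fcntl.flock(self.lockfile, fcntl.LOCK_EX)
            self.h5file = h5py.File(self.path, 'r+')
            return self.h5file
        def __exit__(self, exc_type, exc_val, exc_tb):
            self.h5file.close()
            self.lockfile.close()  # releases the lock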
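The generator alternative mentioned above could be sketched roughly as follows. This is only an illustration under some assumptions: the function name locked_file, the opener parameter, and the sidecar '.lock' file are all made up for the example and are not part of h5py. The default opener is the built-in open so the sketch runs without h5py installed.

```python
import contextlib
import fcntl


@contextlib.contextmanager
def locked_file(path, opener=open, mode='r+'):
    """Yield an open handle on path while holding an exclusive flock."""
    # Sidecar lock file: an assumption of this sketch, not an h5py feature.
    with open(path + '.lock', 'w') as lockfile:
        fcntl.flock(lockfile, fcntl.LOCK_EX)  # blocks until the lock is free
        handle = opener(path, mode)           # e.g. opener=h5py.File
        try:
            yield handle                      # client code runs here
        finally:
            handle.close()                    # runs even if the client raises
    # Leaving the outer with block closes lockfile, which releases the lock.
```

Client code would then write something like `with locked_file('/path/to/file', opener=h5py.File) as h5handle:` and work with the handle inside the block. Whether flock is honoured between hosts on an NFS mount depends on the kernel and the mount configuration, so that part should be verified on the target cluster.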