
Subclassing file objects (to extend open and close operations) in Python 3


Suppose I want to extend the built-in file abstraction with extra operations at open and close time. In Python 2.7 this works:

class ExtFile(file):
    def __init__(self, *args):
        file.__init__(self, *args)
        # extra stuff here

    def close(self):
        file.close(self)
        # extra stuff here

Now I'm looking at updating the program to Python 3, in which open is a factory function that might return an instance of any of several different classes from the io module, depending on how it's called. I could in principle subclass all of them, but that's tedious, and I'd have to reimplement the dispatching that open does. (In Python 3 the distinction between binary and text files matters rather more than it does in 2.x, and I need both.) These objects are going to be passed to library code that might do just about anything with them, so the idiom of making a "file-like" duck-typed class that wraps the return value of open and forwards the necessary methods would be very verbose.
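To make the dispatching concrete, here is a small sketch that records which io class open returns for a few modes. The helper name `opened_type` and the scratch file are assumptions made for illustration only; they are not part of the original program.

```python
import io
import os
import tempfile


def opened_type(path, mode, **kwargs):
    """Return the class of the object that open() produces for these arguments."""
    with open(path, mode, **kwargs) as f:
        return type(f)


# Create an empty scratch file so every mode below has something to open.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    text_type = opened_type(path, "r")                # io.TextIOWrapper
    binary_read_type = opened_type(path, "rb")        # io.BufferedReader
    binary_write_type = opened_type(path, "wb")       # io.BufferedWriter
    binary_rw_type = opened_type(path, "r+b")         # io.BufferedRandom
    raw_type = opened_type(path, "rb", buffering=0)   # io.FileIO
finally:
    os.remove(path)
```

Five calls, five different classes, which is why subclassing "the" file type no longer has an obvious target.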

Can anyone suggest a 3.x approach that involves as little additional boilerplate as possible beyond the 2.x code shown?

zwol asked Apr 18 '13 14:04


2 Answers

You could just use a context manager instead. For example this one:

class SpecialFileOpener:
    def __init__ (self, fileName, someOtherParameter):
        self.f = open(fileName)
        # do more stuff
        print(someOtherParameter)
    def __enter__ (self):
        return self.f
    def __exit__ (self, exc_type, exc_value, traceback):
        self.f.close()
        # do more stuff
        print('Everything is over.')

Then you can use it like this:

>>> with SpecialFileOpener('C:\\test.txt', 'Hello world!') as f:
...     print(f.read())
...
Hello world!
foo bar
Everything is over.

Using a with block is the preferred way to handle file objects (and other resources) anyway.
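The same idea can also be sketched as a generator-based context manager with contextlib.contextmanager, which forwards any open arguments so that text and binary modes both work. This is only one possible variant, not the answer's own code; the name `special_open` and the `events` list standing in for the "extra stuff" are hypothetical.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def special_open(path, events, *args, **kwargs):
    """Open a file, recording hypothetical extra work before open and after close."""
    events.append("before open")       # stand-in for extra work at open time
    f = open(path, *args, **kwargs)    # open() still picks the right io class
    try:
        yield f                        # the caller gets the real file object
    finally:
        f.close()
        events.append("after close")   # stand-in for extra work at close time


# Demonstration on a scratch file, once in text mode and once in binary mode.
fd, path = tempfile.mkstemp()
os.close(fd)
events = []
try:
    with special_open(path, events, "w") as f:
        f.write("foo bar")
    with special_open(path, events, "rb") as f:
        data = f.read()
finally:
    os.remove(path)
```

Because the real file object is yielded, library code that receives it sees an ordinary io instance and no method forwarding is needed.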

poke answered Nov 13 '22 12:11


tl;dr Use a context manager. See the bottom of this answer for important cautions about them.


Files got more complicated in Python 3. Some techniques that work on ordinary user-defined classes don't work with built-in classes. One way is to mix in the desired class before instantiating it, but this requires knowing in advance what the mix-in class should be:

class MyFileType(???):
    def __init__(...):
        # stuff here
    def close(self):
        # more stuff here

Because there are so many types, and more could possibly be added in the future (unlikely, but possible), and we don't know for sure which will be returned until after the call to open, this method doesn't work.
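For contrast, when the mode is fixed ahead of time the base class is known and the subclass can be written out directly. The following is a minimal sketch assuming unbuffered binary access only; `NoisyFileIO` and its `events` list are hypothetical names, and the class is constructed directly, so it bypasses the dispatching that open does.

```python
import io
import os
import tempfile


class NoisyFileIO(io.FileIO):
    """Raw binary file that records hypothetical extra work at open and close."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.events = ["opened"]          # stand-in for extra work at open time

    def close(self):
        if not self.closed:               # only record the first real close
            self.events.append("closed")  # stand-in for extra work at close time
        super().close()


# Demonstration on a scratch file; leaving the with block calls close().
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with NoisyFileIO(path, "w") as f:
        f.write(b"foo bar")
    events = f.events
finally:
    os.remove(path)
```

This covers exactly one of the classes open can return, which illustrates why the approach does not generalise to a drop-in replacement for open.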

Another method is to change our custom type's __bases__ to include the returned file's class, and then set the returned instance's __class__ attribute to our custom type:

class MyFileType:
    def close(self):
        # stuff here

some_file = open(path_to_file, '...') # ... = desired options
MyFileType.__bases__ = (some_file.__class__,) + MyFileType.__bases__

but this yields

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __bases__ assignment: '_io.TextIOWrapper' deallocator differs from 'object'

Yet another method that could work with pure user classes is to create the custom file type on the fly, directly from the returned instance's class, and then update the returned instance's class:

some_file = open(path_to_file, '...') # ... = desired options

class MyFile(some_file.__class__):
    def close(self):
        super().close()
        print("that's all, folks!")

some_file.__class__ = MyFile

but again:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __class__ assignment: only for heap types

So it looks like the best method that works at all in Python 3, and fortunately also works in Python 2 (useful if you want the same code base to run on both versions), is a custom context manager:

class Open(object):
    def __init__(self, *args, **kwds):
        # do custom stuff here
        self.args = args
        self.kwds = kwds
    def __enter__(self):
        # or do custom stuff here :)
        self.file_obj = open(*self.args, **self.kwds)
        # return actual file object so we don't have to worry
        # about proxying
        return self.file_obj
    def __exit__(self, *args):
        # and still more custom stuff here
        self.file_obj.close()
        # or here

and to use it:

with Open('some_file') as data:
    # custom stuff just happened
    for line in data:
        print(line)
# data is now closed, and more custom stuff
# just happened

An important point to keep in mind: any unhandled exception in __init__ or __enter__ will prevent __exit__ from running, so in those two locations you still need to use the try/except and/or try/finally idioms to make sure you don't leak resources.
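A short sketch of that caution, assuming a hypothetical `setup` callable that stands in for the custom work done after the file is opened. `FailingEnter` shows that __exit__ is skipped when __enter__ raises, and `GuardedOpen` shows one way to close the file before re-raising; both class names are invented for this illustration.

```python
import os
import tempfile


class FailingEnter:
    """Context manager whose __enter__ always fails."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("enter")
        raise RuntimeError("setup failed")

    def __exit__(self, *exc_info):
        self.log.append("exit")   # never reached when __enter__ raises
        return False


class GuardedOpen:
    """Open a file, run a setup step, and close the file if that step fails."""

    def __init__(self, path, setup, *args, **kwargs):
        self.path, self.setup = path, setup
        self.args, self.kwargs = args, kwargs

    def __enter__(self):
        self.file_obj = open(self.path, *self.args, **self.kwargs)
        try:
            self.setup(self.file_obj)   # hypothetical custom work
        except BaseException:
            self.file_obj.close()       # __exit__ will not run, so clean up here
            raise
        return self.file_obj

    def __exit__(self, *exc_info):
        self.file_obj.close()
        return False


log = []
try:
    with FailingEnter(log):
        log.append("body")
except RuntimeError:
    log.append("caught")

seen = []

def bad_setup(f):
    seen.append(f)                      # keep a reference so we can inspect it
    raise ValueError("custom setup failed")

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    try:
        with GuardedOpen(path, bad_setup):
            pass
    except ValueError:
        pass
finally:
    os.remove(path)
```

In the first demonstration neither the body nor __exit__ runs; in the second the file is already closed by the time the exception reaches the caller.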

Ethan Furman answered Nov 13 '22 13:11