Prevent TextIOWrapper from closing on GC in a Py2/Py3 compatible way

What I need to accomplish:

Given a binary file, decode it in a couple of different ways, each exposing a TextIOBase API. Ideally the resulting text files can be passed on without my having to track their lifespan explicitly.

Unfortunately, wrapping a BufferedReader in a TextIOWrapper causes the reader to be closed when the TextIOWrapper goes out of scope and is garbage collected.

Here is a simple demo of this:

In [1]: import io

In [2]: def mangle(x):
   ...:     io.TextIOWrapper(x) # Will get GCed causing __del__ to call close
   ...:     

In [3]: f = io.open('example', mode='rb')

In [4]: f.closed
Out[4]: False

In [5]: mangle(f)

In [6]: f.closed
Out[6]: True

I can fix this in Python 3 by overriding __del__. This is a reasonable solution for my use case: I have complete control over the decoding process and just need to expose a very uniform API at the end.

In [1]: import io

In [2]: class MyTextIOWrapper(io.TextIOWrapper):
   ...:     def __del__(self):
   ...:         print("I've been GC'ed")
   ...:         

In [3]: def mangle2(x):
   ...:     MyTextIOWrapper(x)
   ...:     

In [4]: f2 = io.open('example', mode='rb')

In [5]: f2.closed
Out[5]: False

In [6]: mangle2(f2)
I've been GC'ed

In [7]: f2.closed
Out[7]: False

However, this does not work in Python 2:

In [7]: class MyTextIOWrapper(io.TextIOWrapper):
   ...:     def __del__(self):
   ...:         print("I've been GC'ed")
   ...:         

In [8]: def mangle2(x):
   ...:     MyTextIOWrapper(x)
   ...:     

In [9]: f2 = io.open('example', mode='rb')

In [10]: f2.closed
Out[10]: False

In [11]: mangle2(f2)
I've been GC'ed

In [12]: f2.closed
Out[12]: True

I've spent some time staring at the Python source code, and it looks remarkably similar between 2.7 and 3.4. I don't understand why the __del__ inherited from IOBase cannot be overridden in Python 2 (it is not even visible in dir), yet still seems to get executed. Python 3 works exactly as expected.

Is there anything I can do?

ebolyen asked Jun 23 '15 04:06


2 Answers

Just detach your TextIOWrapper() object before letting it be garbage collected:

def mangle(x):
    wrapper = io.TextIOWrapper(x)
    wrapper.detach()

The TextIOWrapper() object only closes streams it is attached to. If you can't alter the code where the object goes out of scope, then simply keep a reference to the TextIOWrapper() object locally and detach at that point.
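
As a rough sketch of this approach (read_first_line is a hypothetical helper name, and io.BytesIO stands in for the binary file from the question), the same stream can be decoded twice as long as each wrapper is detached before it goes out of scope. The wrapper reads ahead, so the underlying position after detach() is not where the decoded text ended; the sketch rewinds with seek(0) between passes.

```python
import io


def read_first_line(binary_stream, encoding):
    """Decode one line from binary_stream without closing it (hypothetical helper)."""
    wrapper = io.TextIOWrapper(binary_stream, encoding=encoding)
    try:
        return wrapper.readline()
    finally:
        # Detach so the wrapper no longer owns (and cannot close) the stream.
        wrapper.detach()


# io.BytesIO stands in for io.open('example', mode='rb').
raw = io.BytesIO(u"caf\u00e9\nsecond line\n".encode("utf-8"))

as_utf8 = read_first_line(raw, "utf-8")
raw.seek(0)  # the wrapper read ahead, so rewind before decoding again
as_latin1 = read_first_line(raw, "latin-1")

print(raw.closed)
```

The try/finally keeps the detach next to the place where the wrapper is created, so a decoding error cannot leave an attached wrapper behind to close the stream later.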

If you must subclass TextIOWrapper(), then just call detach() in the __del__ hook:

class DetachingTextIOWrapper(io.TextIOWrapper):
    def __del__(self):
        self.detach()
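
One caveat: detach() raises ValueError if the wrapper was already detached or closed, and an exception inside __del__ is reported as a warning rather than propagated. A minimal guarded variant might look like the sketch below; the class name GuardedDetachingTextIOWrapper and the use of io.BytesIO are assumptions for illustration, not part of the original answer.

```python
import gc
import io


class GuardedDetachingTextIOWrapper(io.TextIOWrapper):
    """Hypothetical variant: detach on GC, tolerating an earlier detach/close."""

    def __del__(self):
        try:
            self.detach()
        except ValueError:
            # Raised when the wrapper was already detached or closed.
            pass


def mangle(x):
    # The wrapper is unreferenced as soon as this statement finishes.
    GuardedDetachingTextIOWrapper(x, encoding="utf-8")


buf = io.BytesIO(b"data")
mangle(buf)
gc.collect()  # not required on CPython; makes the collection explicit
print(buf.closed)
```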

Martijn Pieters answered Sep 28 '22 02:09


EDIT:

Just call detach first, thanks martijn-pieters!


It turns out that nothing can be done about the destructor calling close in Python 2.7: the call is hardcoded in the C code. Instead, we can override close so that it does not close the buffer while __del__ is running (__del__ is executed before _PyIOBase_finalize in the C code, which gives us a chance to change the behaviour of close). An explicit close still works as expected, but the GC no longer closes the buffer.

class SaneTextIOWrapper(io.TextIOWrapper):
    def __init__(self, *args, **kwargs):
        self._should_close_buffer = True
        super(SaneTextIOWrapper, self).__init__(*args, **kwargs)

    def __del__(self):
        # Accept the inevitability of close() being called by the destructor
        # because of this line in Python 2.7:
        # https://github.com/python/cpython/blob/2.7/Modules/_io/iobase.c#L221
        # Clear the flag first so that call leaves the buffer open.
        self._should_close_buffer = False
        # Actually close for Python 3, because this method is an override
        # there. We can't call super because Python 2 doesn't actually have
        # a `__del__` method for IOBase (hence this workaround). Calling
        # close() twice is harmless, so it won't matter that Python 2 ends
        # up calling it again.
        self.close()

    def close(self):
        # We can't stop Python 2.7 from calling close in the destructor,
        # so instead we prevent the buffer from being closed with a flag.

        # Based on:
        # https://github.com/python/cpython/blob/2.7/Lib/_pyio.py#L1586
        # https://github.com/python/cpython/blob/3.4/Lib/_pyio.py#L1615
        if self.buffer is not None and not self.closed:
            try:
                self.flush()
            finally:
                if self._should_close_buffer:
                    self.buffer.close()
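
To check the behaviour, the snippet below repeats the class in condensed form so that it runs on its own, then exercises both paths: garbage collection and an explicit close(). The io.BytesIO objects and the variable names are stand-ins chosen for illustration.

```python
import gc
import io


class SaneTextIOWrapper(io.TextIOWrapper):
    # Condensed copy of the class above, repeated so the snippet is runnable.
    def __init__(self, *args, **kwargs):
        self._should_close_buffer = True
        super(SaneTextIOWrapper, self).__init__(*args, **kwargs)

    def __del__(self):
        self._should_close_buffer = False
        self.close()

    def close(self):
        if self.buffer is not None and not self.closed:
            try:
                self.flush()
            finally:
                if self._should_close_buffer:
                    self.buffer.close()


# Path 1: the wrapper is garbage collected; the buffer should stay open.
collected = io.BytesIO(b"abc")
wrapper = SaneTextIOWrapper(collected, encoding="utf-8")
del wrapper
gc.collect()

# Path 2: close() is called explicitly; the buffer should be closed too.
explicit = io.BytesIO(b"abc")
SaneTextIOWrapper(explicit, encoding="utf-8").close()

print("collected closed: %s, explicit closed: %s"
      % (collected.closed, explicit.closed))
```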

My previous solution used _pyio.TextIOWrapper, which is slower than the above because it is written in Python rather than C.

It simply overrode __del__ with a no-op, which also works on both Python 2 and 3.
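
The earlier code is not shown here, so the following is only a guess at its shape: a subclass of _pyio.TextIOWrapper whose __del__ does nothing. The class name NoCloseOnGCTextIOWrapper and the io.BytesIO stand-in are assumptions.

```python
import gc
import io
import _pyio  # pure-Python implementation of the io module in CPython


class NoCloseOnGCTextIOWrapper(_pyio.TextIOWrapper):
    """Hypothetical name; the original class was not shown."""

    def __del__(self):
        # No-op: skip the inherited IOBase.__del__, which would call close().
        pass


raw = io.BytesIO(b"hello\n")
wrapper = NoCloseOnGCTextIOWrapper(raw, encoding="utf-8")
text = wrapper.read()
del wrapper
gc.collect()
print(raw.closed)
```

Because _pyio implements __del__ in Python, an ordinary override replaces it on either major version, at the cost of the slower pure-Python I/O path.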

ebolyen answered Sep 28 '22 02:09