What I need to accomplish:
Given a binary file, decode it in a couple different ways providing a TextIOBase API. Ideally these subsequent files can get passed on without my needing to keep track of their lifespan explicitly.
Unfortunately, wrapping a BufferedReader will result in that reader being closed when the TextIOWrapper goes out of scope.
Here is a simple demo of this:
In [1]: import io
In [2]: def mangle(x):
   ...:     io.TextIOWrapper(x)  # Will get GCed causing __del__ to call close
   ...:
In [3]: f = io.open('example', mode='rb')
In [4]: f.closed
Out[4]: False
In [5]: mangle(f)
In [6]: f.closed
Out[6]: True
I can fix this in Python 3 by overriding __del__ (this is a reasonable solution for my use case as I have complete control over the decoding process; I just need to expose a very uniform API at the end):
In [1]: import io
In [2]: class MyTextIOWrapper(io.TextIOWrapper):
   ...:     def __del__(self):
   ...:         print("I've been GC'ed")
   ...:
In [3]: def mangle2(x):
   ...:     MyTextIOWrapper(x)
   ...:
In [4]: f2 = io.open('example', mode='rb')
In [5]: f2.closed
Out[5]: False
In [6]: mangle2(f2)
I've been GC'ed
In [7]: f2.closed
Out[7]: False
However this does not work in Python 2:
In [7]: class MyTextIOWrapper(io.TextIOWrapper):
   ...:     def __del__(self):
   ...:         print("I've been GC'ed")
   ...:
In [8]: def mangle2(x):
   ...:     MyTextIOWrapper(x)
   ...:
In [9]: f2 = io.open('example', mode='rb')
In [10]: f2.closed
Out[10]: False
In [11]: mangle2(f2)
I've been GC'ed
In [12]: f2.closed
Out[12]: True
I've spent a bit of time staring at the Python source code and it looks remarkably similar between 2.7 and 3.4, so I don't understand why the __del__ inherited from IOBase is not overridable in Python 2 (or even visible in dir), but still seems to get executed. Python 3 works exactly as expected.
Is there anything I can do?
Just detach your TextIOWrapper() object before letting it be garbage collected:
import io

def mangle(x):
    wrapper = io.TextIOWrapper(x)
    # Detach so that collecting the wrapper does not close x
    wrapper.detach()
The TextIOWrapper() object only closes streams it is attached to. If you can't alter the code where the object goes out of scope, then simply keep a reference to the TextIOWrapper() object locally and detach at that point.
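As a rough illustration of keeping the reference locally and detaching once you are done, here is a minimal sketch. The helper name read_first_line and the use of io.BytesIO as a stand-in for the binary file are assumptions made for the example, not part of the original code.

```python
import io


def read_first_line(binary_stream, encoding="utf-8"):
    """Decode one line from binary_stream without taking ownership of it.

    Hypothetical helper, for illustration only.
    """
    wrapper = io.TextIOWrapper(binary_stream, encoding=encoding)
    try:
        return wrapper.readline()
    finally:
        # detach() separates the wrapper from the buffer, so collecting
        # the wrapper afterwards no longer closes binary_stream
        wrapper.detach()


# io.BytesIO stands in for a file opened with mode='rb'
raw = io.BytesIO(b"first\nsecond\n")
line = read_first_line(raw)
still_open = not raw.closed
```

One caveat: TextIOWrapper reads ahead in chunks, so after detach() the position of the underlying stream may be past the text that was actually returned. Seek the binary stream back if you intend to decode it again.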
If you must subclass TextIOWrapper(), then just call detach() in the __del__ hook:
class DetachingTextIOWrapper(io.TextIOWrapper):
    def __del__(self):
        self.detach()
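If the wrapper might already have been detached by hand before it is collected, the detach() call in __del__ raises ValueError. A slightly more defensive variant could look like the sketch below. The class name GuardedDetachingTextIOWrapper is made up for this example, and the behaviour shown was reasoned through for CPython 3 only.

```python
import gc
import io


class GuardedDetachingTextIOWrapper(io.TextIOWrapper):
    """Hypothetical variant: detach on collection, tolerating an earlier detach."""

    def __del__(self):
        try:
            self.detach()
        except ValueError:
            # Raised when the buffer was already detached or is unusable
            pass


# io.BytesIO stands in for a file opened with mode='rb'
raw = io.BytesIO(b"caf\xc3\xa9\n")
wrapper = GuardedDetachingTextIOWrapper(raw, encoding="utf-8")
text = wrapper.read()

del wrapper   # triggers __del__, which detaches instead of closing
gc.collect()  # not needed on CPython, harmless elsewhere
still_open = not raw.closed
```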
EDIT:
Just call detach first, thanks martijn-pieters!
It turns out there is basically nothing that can be done about the destructor calling close in Python 2.7. This is hardcoded into the C code. Instead we can modify close such that it won't close the buffer when __del__ is happening (__del__ will be executed before _PyIOBase_finalize in the C code, giving us a chance to change the behaviour of close). This lets close work as expected without letting the GC close the buffer.
import io

class SaneTextIOWrapper(io.TextIOWrapper):
    def __init__(self, *args, **kwargs):
        self._should_close_buffer = True
        super(SaneTextIOWrapper, self).__init__(*args, **kwargs)

    def __del__(self):
        # Accept the inevitability of the buffer being closed by the destructor
        # because of this line in Python 2.7:
        # https://github.com/python/cpython/blob/2.7/Modules/_io/iobase.c#L221
        self._should_close_buffer = False
        self.close()  # Actually close for Python 3 because it is an override.
                      # We can't call super because Python 2 doesn't actually
                      # have a `__del__` method for IOBase (hence this
                      # workaround). Close is idempotent so it won't matter
                      # that Python 2 will end up calling this twice

    def close(self):
        # We can't stop Python 2.7 from calling close in the destructor
        # so instead we can prevent the buffer from being closed with a flag.
        # Based on:
        # https://github.com/python/cpython/blob/2.7/Lib/_pyio.py#L1586
        # https://github.com/python/cpython/blob/3.4/Lib/_pyio.py#L1615
        if self.buffer is not None and not self.closed:
            try:
                self.flush()
            finally:
                if self._should_close_buffer:
                    self.buffer.close()
My previous solution here used _pyio.TextIOWrapper, which is slower than the above because it is written in Python, not C.
It involved simply overriding __del__ with a noop, which will also work in Py2/3.
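For reference, the noop approach might look roughly like the sketch below, together with a small check you can point at any wrapper class to see whether the buffer survives collection on your interpreter. The names NoopDelTextIOWrapper and buffer_survives_gc are invented for this example, and the results in the comments were reasoned through for CPython 3 only.

```python
import _pyio
import gc
import io


class NoopDelTextIOWrapper(_pyio.TextIOWrapper):
    """Hypothetical name: pure-Python wrapper whose __del__ does nothing."""

    def __del__(self):
        # Overrides the Python-level IOBase.__del__, which would call close()
        pass


def buffer_survives_gc(wrapper_cls):
    """Return True if the wrapped buffer is still open after the wrapper is collected."""
    raw = io.BytesIO(b"payload")  # stand-in for a file opened with mode='rb'
    wrapper = wrapper_cls(raw, encoding="utf-8")
    del wrapper
    gc.collect()
    return not raw.closed


plain_result = buffer_survives_gc(io.TextIOWrapper)     # buffer gets closed
noop_result = buffer_survives_gc(NoopDelTextIOWrapper)  # buffer stays open
```

The same helper can be called with any of the classes above to compare their behaviour between Python 2 and Python 3.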