Originally, I was opening two files and reading through them simultaneously with something like this:
    with open(file1, 'r') as R1:
        with open(file2, 'r') as R2:
            ### my code
But now the input files may sometimes be gzipped. So I thought I would split up the with statement and use an if statement to handle the two scenarios, with something like this:
    if zipped:
        R1 = gzip.open(file1, 'r')
        R2 = gzip.open(file2, 'r')
    else:
        R1 = open(file1, 'r')
        R2 = open(file2, 'r')
    with R1:
        with R2:
            ### my code
Does the second code function like the first? Or is there even a better way to do this?
What you're doing mostly makes sense, but it has one problem.
File objects are context managers that close themselves on __exit__. As the gzip docs make clear, that includes the GzipFile objects returned by gzip.open:
"GzipFile supports the io.BufferedIOBase interface, including iteration and the with statement."
So, if you write with f: on an open regular file or GzipFile, that guarantees that close will be called when the with statement exits.
In Python 2.7, the details are slightly different, but it works the same way. In Python 2.6, a GzipFile was not a context manager. But there's a very easy solution (one that's worth knowing about for other types, even if you don't care about Python 2.6): you can wrap anything with a close method in closing to get a context manager that calls that close on __exit__. So, you could write:
    with contextlib.closing(R1):
… and it would work on R1 whether it's a file object, or some other kind of thing (like a 2.6 GzipFile) that doesn't know how to be a context manager.
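To make the closing idea concrete, here is a minimal sketch. LegacyReader is a hypothetical stand-in (not from any library) for an object that has a close method but no __enter__/__exit__, which is roughly the situation with a 2.6 GzipFile:

```python
import contextlib


class LegacyReader:
    """Hypothetical object with close() but no context-manager support."""

    def __init__(self):
        self.closed = False

    def read(self):
        return "data"

    def close(self):
        self.closed = True


reader = LegacyReader()

# closing() supplies __enter__/__exit__ and calls reader.close() on exit.
with contextlib.closing(reader) as r:
    content = r.read()

assert reader.closed
```

Using `with reader:` directly would fail here, because the class does not implement the context-manager protocol itself.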
However, what happens if R1 opens successfully, but R2 fails? Then you haven't even gotten into the with R1: when the exception is raised, so you never close R1.
You could fix this by doing the with R1 before opening R2:
    if zipped:
        R1 = gzip.open(file1, 'r')
    else:
        R1 = open(file1, 'r')
    with R1:
        if zipped:
            R2 = gzip.open(file2, 'r')
        else:
            R2 = open(file2, 'r')
        with R2:
            ### my code
Or you could use an ExitStack.
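For reference, the ExitStack variant might look roughly like the sketch below. The function name read_pair and the choice to return the first line of each file are assumptions made only for illustration. I have also used 'rt' for the gzip branch so that both branches yield str; in Python 3, gzip.open with plain 'r' reads bytes.

```python
import contextlib
import gzip


def read_pair(file1, file2, zipped):
    """Hypothetical helper: open both files and return their first lines."""
    with contextlib.ExitStack() as stack:
        # Each file is registered with the stack as soon as it is opened,
        # so if opening the second one raises, the first is still closed.
        if zipped:
            R1 = stack.enter_context(gzip.open(file1, 'rt'))
            R2 = stack.enter_context(gzip.open(file2, 'rt'))
        else:
            R1 = stack.enter_context(open(file1, 'r'))
            R2 = stack.enter_context(open(file2, 'r'))
        return R1.readline(), R2.readline()
```

The stack closes everything registered so far when the with block exits, whether normally or through an exception.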
But there's a much simpler solution here: both gzip.open and open are callable objects, so you can store either one in a variable and call it later. Since both accept a filename and a mode as their first two arguments, and you want to call them with the exact same arguments, using that variable is trivial:
    if zipped:
        zopen = gzip.open
    else:
        zopen = open
    with zopen(file1, 'r') as R1:
        with zopen(file2, 'r') as R2:
            ### my code
And notice that you can make this a lot more concise without making it any less readable:
    zopen = gzip.open if zipped else open
    with zopen(file1, 'r') as R1, zopen(file2, 'r') as R2:
        ### my code
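One thing to watch for in Python 3: with mode 'r', open gives you str lines but gzip.open gives you bytes. A self-contained sketch of the same pattern, assuming you want text either way, passes 'rt' to both. The function name paired_lines and the returned list of tuples are illustrative choices, not part of the original code:

```python
import gzip


def paired_lines(file1, file2, zipped):
    """Hypothetical helper: read two files in lockstep, gzipped or not."""
    # Pick the opener once; both accept (filename, mode).
    zopen = gzip.open if zipped else open
    # 'rt' asks for text mode from both openers, so lines are str either way.
    with zopen(file1, 'rt') as R1, zopen(file2, 'rt') as R2:
        return [(a.rstrip('\n'), b.rstrip('\n')) for a, b in zip(R1, R2)]
```

Because zip stops at the shorter input, extra lines in the longer file are ignored here; whether that is what you want depends on your data.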
You can do it your original way by creating a function that checks what kind of file it is.
    import gzip

    def open_or_gzip_open(file_name, permissions='r'):
        # Pick the opener from the file extension and pass the mode through.
        if file_name.endswith('.gz'):
            return gzip.open(file_name, permissions)
        return open(file_name, permissions)
You can open both files on one line:
    with open_or_gzip_open('text.txt') as p1, open_or_gzip_open('text2.txt') as p2:
        print(p1, p2)