Originally, I have been opening and simultaneously reading through two files with something like this:
with open(file1, 'r') as R1:
    with open(file2, 'r') as R2:
        ### my code
But now the input files may sometimes be gzipped. So I thought I would split up the with statement and use an if statement to handle the two scenarios, with something like this:
if zipped:
    R1 = gzip.open(file1, 'r')
    R2 = gzip.open(file2, 'r')
else:
    R1 = open(file1, 'r')
    R2 = open(file2, 'r')
with R1:
    with R2:
        ### my code
Does the second code function like the first? Or is there even a better way to do this?
What you're doing mostly makes sense, but it has one problem.
File objects are context managers that close themselves on __exit__. As the gzip docs make clear, that includes the GzipFile objects returned by gzip.open:
GzipFile supports the io.BufferedIOBase interface, including iteration and the with statement.
So, if you write with f: on an opened regular file or GzipFile, that guarantees that close will be called after the with statement.
In Python 2.7, the details are slightly different, but it works the same way. In Python 2.6, a GzipFile was not a context manager. But there's a very easy solution (that's worth knowing about for other types, even if you don't care about Python 2.6): you can wrap anything with a close method in closing to get a context manager that calls that close on __exit__. So, you could write:
with contextlib.closing(R1):
… and it would work on R1 whether it's a file object, or some other kind of thing (like a 2.6 GzipFile) that doesn't know how to be a context manager.
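As a minimal sketch of how closing behaves, the example below wraps an object that has a close method but is not a context manager. LegacyResource is a hypothetical stand-in invented for illustration; it is not part of gzip or any other library.

```python
import contextlib


class LegacyResource:
    """Hypothetical object with close() but no __enter__/__exit__."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


res = LegacyResource()

# closing() yields the wrapped object and calls its close() on exit,
# even if the body raises.
with contextlib.closing(res) as r:
    same_object = r is res
    open_inside = not r.closed

closed_after = res.closed
```

Inside the block the resource is still open; once the block exits, close has been called.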
However, what happens if R1 opens successfully, but R2 fails? Then you haven't even gotten into the with R1: when the exception is raised, so you never close R1.
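The leak can be reproduced with a small sketch like the one below. The file names are hypothetical: one temporary file that exists and one path that deliberately does not, so the second open fails.

```python
import os
import tempfile

# Hypothetical setup: file1 exists, file2 does not.
tmpdir = tempfile.mkdtemp()
file1 = os.path.join(tmpdir, "a.txt")
file2 = os.path.join(tmpdir, "missing.txt")
with open(file1, "w") as f:
    f.write("hello\n")

R1 = None
try:
    R1 = open(file1, "r")
    R2 = open(file2, "r")  # raises FileNotFoundError
    with R1:               # never reached
        with R2:
            pass
except FileNotFoundError:
    pass

# R1 was opened, but no with statement ever took ownership of it.
leaked = R1 is not None and not R1.closed

R1.close()  # manual cleanup for this demonstration
```

Here leaked ends up true: the exception is raised before with R1: runs, so nothing closes R1 until it is closed by hand (or garbage collected).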
You could fix this by doing the with R1 before opening R2:
if zipped:
    R1 = gzip.open(file1, 'r')
else:
    R1 = open(file1, 'r')
with R1:
    if zipped:
        R2 = gzip.open(file2, 'r')
    else:
        R2 = open(file2, 'r')
    with R2:
Or you could use an ExitStack.
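A rough sketch of the ExitStack version might look like this (contextlib.ExitStack requires Python 3.3 or later). The temporary files and their contents are made up for the example, and the files are opened in 'rt' mode so that both branches yield text.

```python
import contextlib
import gzip
import os
import tempfile

# Hypothetical setup: two small gzipped text files.
tmpdir = tempfile.mkdtemp()
file1 = os.path.join(tmpdir, "a.txt.gz")
file2 = os.path.join(tmpdir, "b.txt.gz")
for path, text in ((file1, "first\n"), (file2, "second\n")):
    with gzip.open(path, "wt") as f:
        f.write(text)

zipped = True

with contextlib.ExitStack() as stack:
    # Each file is registered with the stack as soon as it is opened,
    # so if the second open fails, the first file is still closed.
    if zipped:
        R1 = stack.enter_context(gzip.open(file1, "rt"))
        R2 = stack.enter_context(gzip.open(file2, "rt"))
    else:
        R1 = stack.enter_context(open(file1, "r"))
        R2 = stack.enter_context(open(file2, "r"))
    lines = [R1.readline(), R2.readline()]
```

The stack closes everything it has entered, in reverse order, when the with block exits, whether normally or through an exception.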
But there's a much simpler solution here: Both gzip.open and open are callable objects, so you can store them in a variable, and call it later. Since they have the same signature, and you want to call them with the exact same arguments, using that variable is trivial:
if zipped:
    zopen = gzip.open
else:
    zopen = open
with zopen(file1, 'r') as R1:
    with zopen(file2, 'r') as R2:
And notice that you can make this a lot more concise without making it any less readable:
zopen = gzip.open if zipped else open
with zopen(file1, 'r') as R1, zopen(file2, 'r') as R2:
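One caveat worth knowing: in Python 3, gzip.open treats 'r' as binary mode, while the built-in open treats 'r' as text mode, so the two branches return bytes and str respectively. Passing 'rt' to both keeps them consistent. The sketch below assumes that is what you want; read_pairs and the temporary files are hypothetical names used only for this example.

```python
import gzip
import os
import tempfile


def read_pairs(file1, file2, zipped):
    """Read two files in lockstep and return a list of line pairs."""
    zopen = gzip.open if zipped else open
    # 'rt' means text mode for both gzip.open and the built-in open.
    with zopen(file1, "rt") as R1, zopen(file2, "rt") as R2:
        return list(zip(R1, R2))


# Hypothetical setup: the same content written plain and gzipped.
tmpdir = tempfile.mkdtemp()
contents = {"a": "a1\na2\n", "b": "b1\nb2\n"}
paths = {}
for name, text in contents.items():
    plain = os.path.join(tmpdir, name + ".txt")
    zipped_path = plain + ".gz"
    with open(plain, "w") as f:
        f.write(text)
    with gzip.open(zipped_path, "wt") as f:
        f.write(text)
    paths[name] = (plain, zipped_path)

plain_pairs = read_pairs(paths["a"][0], paths["b"][0], zipped=False)
gz_pairs = read_pairs(paths["a"][1], paths["b"][1], zipped=True)
```

With text mode on both sides, the plain and gzipped inputs produce identical line pairs, so the code inside the with block does not need to know which kind of file it was given.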
You can do it your original way by creating a function that checks what kind of file it is.
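import gzip

def open_or_gzip_open(file_name, permissions='r'):
    # Choose the opener from the file extension and pass the requested mode through.
    if file_name.endswith('.gz'):
        R1 = gzip.open(file_name, permissions)
    else:
        R1 = open(file_name, permissions)
    return R1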
You can open both files on one line:
with open_or_gzip_open('text.txt') as p1, open_or_gzip_open('text2.txt') as p2:
    print(p1, p2)