I'm writing a Python generator which looks like "cat". My specific use case is for a "grep like" operation. I want it to be able to break out of the generator if a condition is met:
summary = {}
for fn in cat("filelist.dat"):
    for line in cat(fn):
        if line.startswith("FOO"):
            summary[fn] = line
            break
So when break happens, I need the cat() generator to finish and close the file handle to fn.
I have to read 100k files with 30 GB of total data, and the FOO keyword happens in the header region, so it is important in this case that the cat() function stops reading the file ASAP.
There are other ways I can solve this problem, but I'm still interested to know how to get an early exit from a generator which has open file handles. Perhaps Python cleans them up right away and closes them when the generator is garbage collected?
Thanks,
Ian
Generators have a close method that raises GeneratorExit at the yield statement where the generator is paused. If you specifically catch this exception, you can run some tear-down code:
import contextlib
with contextlib.closing(cat(fn)) as lines:
    for line in lines:
        ...
and then in cat:
try:
    ...
except GeneratorExit:
    # close the file, then re-raise so the generator finishes closing
    raise
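As a minimal sketch of the idea above, assuming cat simply opens a path and yields its lines (the question does not show its body): a try/finally covers both normal exhaustion and an early close(), so there is no need to catch GeneratorExit by name. The helper first_match is a hypothetical name used only for illustration.

```python
import contextlib


def cat(path):
    """Yield lines from path; close the file when the generator ends or is closed."""
    f = open(path)
    try:
        for line in f:
            yield line
    finally:
        # Runs on normal exhaustion and also when close() raises
        # GeneratorExit at the paused yield.
        f.close()


def first_match(path, prefix):
    """Hypothetical helper: return the first line starting with prefix, or None."""
    # closing() calls the generator's close() on exit, including early return.
    with contextlib.closing(cat(path)) as lines:
        for line in lines:
            if line.startswith(prefix):
                return line
    return None
```

With this shape, the file is closed as soon as the with block exits, no matter how the loop ends.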
If you'd like a simpler way to do this (without using the arcane close method on generators), just make cat take a file-like object instead of a string to open, and handle the file IO yourself:
for filename in filenames:
    with open(filename) as theFile:
        for line in cat(theFile):
            ...
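A rough sketch of this second approach, assuming cat only needs to iterate over whatever it is given. Here io.StringIO stands in for an open file so the example is self-contained, and first_match is again a hypothetical helper name.

```python
import io


def cat(lines):
    """Yield lines from any iterable of strings, such as an open file."""
    for line in lines:
        yield line


def first_match(lines, prefix):
    """Hypothetical helper: return the first line starting with prefix, or None."""
    for line in cat(lines):
        if line.startswith(prefix):
            return line  # stop early; the caller still owns the file object
    return None


# io.StringIO stands in for a file opened by the caller.
sample = io.StringIO("header\nFOO bar\nrest\n")
match = first_match(sample, "FOO")

# Nothing past the matching line was consumed.
remaining = sample.read()
```

Because the caller opens and closes the file, the generator holds no resources of its own and abandoning it early is harmless.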
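However, you basically don't need to worry about any of this in CPython: when the for loop drops its last reference to the generator, the generator is finalized, which closes it and in turn releases the file. Other implementations, such as PyPy, may delay that cleanup until a later garbage collection pass. Still, explicit is better than implicit.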
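To see the implicit cleanup in action, here is a small sketch that records when a generator's finally block runs after a break. The names numbers and events are made up for the demonstration, and the timing noted in the comments is CPython behaviour, not a language guarantee.

```python
events = []


def numbers():
    """Yield 0..4 and record when the cleanup code runs."""
    try:
        for i in range(5):
            yield i
    finally:
        events.append("cleanup")


for n in numbers():
    if n == 1:
        break  # the loop drops its only reference to the generator

# On CPython, reference counting finalizes the abandoned generator right
# after the loop, so the finally block has normally run by this point.
# Other implementations may run it later.
```

If the cleanup has to happen at a known point on every implementation, closing the generator explicitly (or owning the file outside it) is the safer choice.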