
How to know when to manage resources in Python

I hope I framed the question right. I am trying to force myself to be a better programmer. By better I mean efficient. I want to write a program to identify the files in a directory and read each file for further processing. After some shuffling I got to this:

for file in os.listdir(dir):
    y=open(dir+'\\'+file,'r').readlines()
    for line in y:
        pass
    y.close()

It should be no surprise that I get an AttributeError since y is a list. I didn't think about that when I wrote the snippet.

I am thinking about this and am afraid that I have five open files (there are five files in the directory specified by dir).

I can fix the code so it runs and I explicitly close the files after opening them. I am curious if I need to or if Python handles closing the file in the next iteration of the loop. If so then I only need to write:

for file in os.listdir(dir):
    y=open(dir+'\\'+file,'r').readlines()
    for line in y:
        pass

I am guessing that it (Python) does handle this effortlessly. The reason I think this might be handled is that I have changed the object that y is referencing. When I start the second iteration, there are no more references to the file that was opened and read using the readlines method.

PyNEwbie asked Dec 09 '22 21:12


2 Answers

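Python will close open files when they get garbage-collected, so generally you can forget about it -- particularly when reading. (In CPython, reference counting means this happens as soon as the last reference to the file object goes away; other implementations, such as PyPy, may close the file later.)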

That said, if you want to close explicitly, you could do this:

for file in os.listdir(dir):
    f = open(dir+'\\'+file,'r')
    y = f.readlines()
    for line in y:
        pass
    f.close()

However, we can immediately improve this, because in Python you can iterate over file objects directly, one line at a time:

for file in os.listdir(dir):
    y = open(dir+'\\'+file,'r')
    for line in y:
        pass
    y.close()

Finally, in recent versions of Python (2.5 and later), there is the 'with' statement:

for file in os.listdir(dir):
    with open(dir+'\\'+file,'r') as y:
        for line in y:
            pass

When the with block ends, Python will close the file for you and clean it up.
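One way to convince yourself of this is to check the file object's closed attribute inside and after the block. The sketch below is only illustrative: it writes a throwaway file into a temporary directory (the name sample.txt is made up for the demo) so that it runs anywhere, and it assumes Python 3 for tempfile.TemporaryDirectory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample file, created only for this demonstration.
    sample_path = os.path.join(tmp_dir, "sample.txt")
    with open(sample_path, "w") as out:
        out.write("first\nsecond\n")

    with open(sample_path, "r") as y:
        line_count = sum(1 for line in y)  # iterate over the file directly
        open_inside = not y.closed         # still open inside the block

    closed_after = y.closed                # closed once the block has ended

print(line_count, open_inside, closed_after)
```

The name y still exists after the block; it just refers to a file object that has already been closed.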

(You might also want to look into os.path for more Pythonic tools for manipulating file names and directories.)

John Fouhy answered Dec 29 '22 00:12


Don't worry about it. Python's garbage collector is good, and I've never had a problem with not closing file objects (for read operations at least).

If you do want to explicitly close the file, just store the result of open() in a variable, then call readlines() on that, for example:

f = open("thefile.txt")
all_lines = f.readlines()
f.close()

Or, you can use the with statement, which was added in Python 2.5 as a from __future__ import, and "properly" added in Python 2.6:

from __future__ import with_statement  # needed only on Python 2.5; harmless on 2.6 and later

with open("thefile.txt") as f:
    print(f.readlines())  # parenthesized so it also runs on Python 3

# or

the_file = open("thefile.txt")
with the_file as f:
    print(f.readlines())

The file will automatically be closed at the end of the block.
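This also holds when the block is left because of an exception, which is the main advantage over calling close() by hand. Here is a minimal sketch of that case, assuming Python 3; the file is a throwaway created in a temporary directory, and the ValueError is provoked deliberately for the demo.

```python
import os
import tempfile

error_seen = False

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical input file whose first line is not a valid integer.
    path = os.path.join(tmp_dir, "thefile.txt")
    with open(path, "w") as out:
        out.write("not a number\n")

    try:
        with open(path) as f:
            value = int(f.readline())  # raises ValueError inside the block
    except ValueError:
        error_seen = True

    # The with statement closed the file before the exception propagated.
    closed_after_error = f.closed

print(error_seen, closed_after_error)
```

With a manual f.close() at the end of the block instead, the exception would skip that call and the file would stay open until it was garbage-collected.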

That said, there are more important things to worry about in the snippets you posted, mostly stylistic ones.

First, try to avoid constructing paths manually by string concatenation. The os.path module contains many functions for doing this in a more reliable, cross-platform manner.

import os
y = open(os.path.join(dir, file), 'r')

Also, you are using two variable names, dir and file, both of which shadow built-in names (dir is a built-in function, and file is a built-in type in Python 2). Pylint is a good tool for spotting things like this; in this case it would give the warning:

[W0622] Redefining built-in 'file'
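Putting these suggestions together, the loop from the question could look roughly like the sketch below. The function name count_lines_per_file and the demo files are invented for illustration, counting lines stands in for whatever processing you actually need, and the temporary-directory setup assumes Python 3.

```python
import os
import tempfile


def count_lines_per_file(directory):
    """Return {file name: line count} for the regular files in `directory`."""
    counts = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)  # portable, no manual '\\'
        if not os.path.isfile(path):
            continue  # skip subdirectories
        with open(path, "r") as handle:  # closed automatically after the block
            counts[name] = sum(1 for _line in handle)
    return counts


# Demo with throwaway files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as demo_dir:
    for name, text in [("a.txt", "1\n2\n"), ("b.txt", "x\n")]:
        with open(os.path.join(demo_dir, name), "w") as out:
            out.write(text)
    os.mkdir(os.path.join(demo_dir, "subdir"))
    result = count_lines_per_file(demo_dir)

print(result)
```

Using directory, name and handle instead of dir and file avoids the Pylint warning, and at most one file is open at any time.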

dbr answered Dec 28 '22 23:12