Recommended way of closing files using pathlib module?

Historically I have always used the following for reading files in python:

with open("file", "r") as f:
    for line in f:
        print(line)  # do thing to line

Is this still the recommended approach? Are there any drawbacks to using the following:

from pathlib import Path

path = Path("file")
for line in path.open():
    print(line)  # do thing to line

Most of the references I found use the with statement when opening files, so that the file does not have to be closed explicitly. Does that also apply to the iterator approach here?

with open() docs

Ben Carley asked Apr 24 '20 09:04




3 Answers

Something that wasn't mentioned yet: if all you want to do is read or write some text (or bytes), then you no longer need to use the context manager explicitly when using pathlib:

>>> import pathlib
>>> path = pathlib.Path("/tmp/example.txt")
>>> path.write_text("hello world")
11
>>> path.read_text()
'hello world'
>>> path.read_bytes()
b'hello world'
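
If the goal is to loop over lines and the file fits comfortably in memory, the same idea can be stretched to line iteration. A minimal sketch, assuming a throwaway file in a temporary directory (the file name and contents are made up for illustration):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.txt"  # hypothetical file for the demo
    path.write_text("first\nsecond\n")

    # read_text() opens the file, reads it fully and closes it in one call
    lines = path.read_text().splitlines()

print(lines)
```

This reads the whole file at once, so for large files the with-statement form is probably the better fit.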

Opening a file to iterate lines should still use a with-statement, for all the same reasons as using the context manager with open, as the docs show:

>>> with path.open() as f:
...     for line in f:
...         print(line)
...
hello world
wim answered Oct 10 '22 10:10



Keep in mind that a Path object is for working with filesystem paths. Just as Python has a built-in open function but no built-in close function, a Path object has an open method but no close method.

close is a method of the file object that is returned either by the built-in open or by the Path object's open method:

>>> from pathlib import Path
>>> p=Path(some_file)
>>> p
PosixPath('/tmp/file')

You can open that Path object either with the built-in open function or the open method in the Path object:

>>> fh=open(p)    # open built-in function
>>> fh
<_io.TextIOWrapper name='/tmp/file' mode='r' encoding='UTF-8'>
>>> fh.close()

>>> fh=p.open()   # Path.open method, which wraps the built-in open
>>> fh
<_io.TextIOWrapper name='/tmp/file' mode='r' encoding='UTF-8'>
>>> fh.close()

You can have a look at the source code for pathlib on GitHub as an indication of how the authors of pathlib do it in their own code.

What I observe is one of three things.

The most common by far is to use with:

from pathlib import Path 

p=Path('/tmp/file')

#create a file
with p.open(mode='w') as fi:
    fi.write(f'Insides of: {str(p)}')

# read it back and test open or closed
with p.open(mode='r') as fi:
    print(f'{fi.read()} closed?:{fi.closed}')

# prints 'Insides of: /tmp/file closed?:False'

As you likely know, at the end of the with block the context manager's __exit__ method is called. For a file, that means the file is closed. This is the most common approach in the pathlib source code.
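
To make the effect of the with block concrete, here is a rough hand-written equivalent using try/finally. It is only a sketch of what the with-statement buys you, not how pathlib itself is written, and the temporary file is an assumption for the demo:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / "file"  # hypothetical file for the demo
    p.write_text("Insides of a file")

    fh = p.open()
    try:
        content = fh.read()
        closed_inside = fh.closed  # still open while we are working with it
    finally:
        fh.close()  # runs even if read() raises, like __exit__ would
    closed_after = fh.closed

print(content, closed_inside, closed_after)
```

The with form does the same bookkeeping with less room for mistakes, which is presumably why it dominates in the pathlib source.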

Second, you can also see in the source (at least in older versions) that a Path object keeps its own entered/exited state and a closed flag, and that os.close is not called explicitly. That flag belongs to the Path; whether a file is open is tracked by the file object itself, which you can check with its .closed attribute:

fh=p.open()
print(f'{fh.read()} closed?:{fh.closed}')
# prints Insides of: /tmp/file closed?:False
# fh will only be closed when fh goes out of scope...
# or you could (and should) do fh.close()


with p.open() as fi:
    pass
print(f'closed?:{fi.closed}')
# fi is still in scope but was implicitly closed at the end of the with block
# prints closed?:True

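Third, on CPython, a file is closed when the last reference to its file object goes away (for example, when the handle goes out of scope) and the object is garbage-collected. Relying on this is neither portable nor considered good practice, but it is commonly done. There are instances of this in the pathlib source code.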
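
One way to see that this implicit cleanup is treated as a problem is to turn on ResourceWarning, which Python ignores by default. The following is a sketch under the assumption that you are running CPython; other implementations may emit the warning at a different time or not at all, and the temporary file is made up for the demo:

```python
import gc
import tempfile
import warnings
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / "file"  # hypothetical file for the demo
    p.write_text("data")

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # ResourceWarning is ignored by default
        fh = p.open()
        del fh        # drop the last reference without calling close()
        gc.collect()  # nudge implementations that do not use reference counting

    # Keep only the warnings about unclosed resources
    unclosed = [w for w in caught if issubclass(w.category, ResourceWarning)]

print(len(unclosed))
```

Running a script with `python -W default` (or in development mode with `-X dev`) should surface the same warnings without any code changes.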

dawg answered Oct 10 '22 10:10



pathlib provides an object-oriented way of manipulating filesystem paths.

The recommended way of opening a file with the pathlib module is to use a context manager:

from pathlib import Path

p = Path("my_file.txt")

with p.open() as f:
    f.readline()

This ensures that the file is closed after use.


In the second example you provided, the file is never explicitly closed, because it is opened inline and no reference to the file object is kept.

Since p.open() returns a file object, you can test this by assigning it to a name and checking its closed attribute, like so:

from pathlib import Path

path = Path("file.txt")

# Open the file pointed by this path and return a file object, as
# the built-in open() function does.
f = path.open()
for line in f:
    pass  # do some stuff

print(f.closed)  # Evaluates to False.
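
If you like the look of a bare for loop over the file, one option is to hide the with-statement inside a small generator. This is only a sketch; iter_lines is a hypothetical helper (not part of pathlib), and the temporary file exists just for the demo:

```python
import tempfile
from pathlib import Path
from typing import Iterator


def iter_lines(path: Path) -> Iterator[str]:
    """Hypothetical helper: yield lines, closing the file once iteration ends."""
    with path.open() as f:
        yield from f


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "file.txt"  # hypothetical file for the demo
    path.write_text("a\nb\n")
    lines = [line.rstrip("\n") for line in iter_lines(path)]

print(lines)
```

The file is closed once the generator is exhausted or explicitly closed. If the loop is abandoned halfway, cleanup is again left to garbage collection, so this does not fully replace a with-statement at the call site.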

Dinko Pehar answered Oct 10 '22 08:10
