I have read that a file opened like this is closed automatically when leaving the with block:
    with open("x.txt") as f:
        data = f.read()
        # do something with data
Yet when opening a URL, I need this:
    from contextlib import closing
    from urllib.request import urlopen

    with closing(urlopen('http://www.python.org')) as page:
        for line in page:
            print(line)
Why, and what is the difference? (I am using Python 3.)
Within the block of code opened by “with”, our file is open, and can be read from freely. However, once Python exits from the “with” block, the file is automatically closed.
Using a with open() statement will automatically close a file once the block has completed. Not only will using a context manager free you from having to remember to close files manually, but it will also make it much easier for others reading your code to see precisely how the program is using the file.
The close() method of a file object flushes any unwritten information and closes the file object, after which no more writing can be done. In CPython, a file is also closed when the last reference to its file object goes away, for example when the variable is reassigned to another file, but that is an implementation detail. It is good practice to close a file explicitly, either with close() or with a with statement.
You've learned why it's important to close files in Python. Because files are limited resources managed by the operating system, making sure files are closed after use will protect against hard-to-debug issues like running out of file handles or experiencing corrupted data.
The details get a little technical, so let's start with the simple version:
Some types know how to be used in a with statement. File objects, like what you get back from open, are an example of such a type. As it turns out, the objects that you get back from urllib.request.urlopen are also an example of such a type, so your second example could be written the same way as the first.
But some types don't know how to be used in a with statement. The closing function is designed to wrap such types: as long as they have a close method, it will call their close method when you exit the with statement.
Of course, some types don't know how to be used in a with statement, and also can't be used with closing because their cleanup method isn't named close (or because cleaning them up is more complicated than just closing them). In that case, you need to write a custom context manager. But even that isn't usually that hard.
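As a rough sketch of such a custom context manager, assume a hypothetical Worker class whose cleanup method is called shutdown() rather than close(). Both Worker and running_worker are invented for illustration; only contextlib.contextmanager comes from the standard library.

```python
from contextlib import contextmanager


class Worker:
    """Hypothetical resource whose cleanup method is shutdown(), not close()."""

    def __init__(self):
        self.running = True

    def shutdown(self):
        self.running = False


@contextmanager
def running_worker():
    """Yield a Worker and shut it down when the with block exits."""
    worker = Worker()
    try:
        yield worker
    finally:
        # Runs on normal exit and when the block raises.
        worker.shutdown()


with running_worker() as w:
    was_running = w.running   # the worker is still running inside the block

still_running = w.running     # shutdown() has been called by now
```

Because the cleanup sits in a finally clause, it also runs when the body of the with block raises an exception.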
In technical terms:
A with statement requires a context manager, an object with __enter__ and __exit__ methods. It will call the __enter__ method, and give you the value returned by that method in the as clause, and it will then call the __exit__ method at the end of the with statement.
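To make that call order visible, here is a minimal sketch with a made-up class, Traced, that records when each method runs. The class name and the returned string are purely illustrative.

```python
class Traced:
    """Illustrative context manager that records the order of calls."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        # Whatever is returned here is bound to the name in the as clause.
        return "value for as"

    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False  # do not suppress exceptions raised in the block


cm = Traced()
with cm as value:
    cm.events.append("body")
```

Note that value is whatever __enter__ returned, which need not be the context manager itself.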
File objects inherit from io.IOBase, which is a context manager whose __enter__ method returns itself, and whose __exit__ calls self.close().
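A quick way to check this behaviour without touching the disk is an in-memory io.StringIO, used here as a stand-in for a real file because it also derives from io.IOBase. The variable names are only for illustration.

```python
import io

buf = io.StringIO("some text")
is_iobase = isinstance(buf, io.IOBase)

with buf as f:
    same_object = f is buf   # __enter__ returned the object itself
    data = f.read()

closed_after = buf.closed    # __exit__ called close()
```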
The object returned by urlopen is (assuming an http or https URL) an HTTPResponse, which, as the docs say, "can be used with a with statement".
The closing function:
Return a context manager that closes thing upon completion of the block. This is basically equivalent to:

    from contextlib import contextmanager

    @contextmanager
    def closing(thing):
        try:
            yield thing
        finally:
            thing.close()
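For a sense of when closing is actually needed, here is a small sketch using a hypothetical Connection class that has a close() method but no __enter__ or __exit__. The class is invented for the example; contextlib.closing is the real standard-library function.

```python
from contextlib import closing


class Connection:
    """Hypothetical object with close() but no context manager methods."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


conn = Connection()
with closing(conn) as c:
    same = c is conn                 # closing() hands back the wrapped object
    open_inside = not conn.closed    # still open inside the block

closed_after = conn.closed           # close() was called on exit
```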
It's not always 100% clear in the docs which types are context managers and which types aren't. There's been a major drive since 3.1 to make everything that could be a context manager into one (and, for that matter, to make everything that's mostly file-like into an actual IOBase if it makes sense), but it's still not 100% complete as of 3.4.
You can always just try it and see. If you get an AttributeError: __exit__ (or, from Python 3.11 on, a TypeError saying the object does not support the context manager protocol), then the object isn't usable as a context manager. If you think it should be, file a bug suggesting the change. If you don't get that error, but the docs don't mention that it's legal, file a bug suggesting the docs be updated.
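One way to "try it and see" without reading a traceback is sketched below. The helper supports_with and the class OnlyClose are hypothetical names introduced here. The exception raised by a failing with statement depends on the Python version (AttributeError in older releases, TypeError from 3.11 on), so the sketch catches both.

```python
import io


def supports_with(obj):
    """Return True if obj's type defines both context manager methods."""
    cls = type(obj)  # the with statement looks these methods up on the type
    return hasattr(cls, "__enter__") and hasattr(cls, "__exit__")


class OnlyClose:
    """Hypothetical type with close() but no __enter__/__exit__."""

    def close(self):
        pass


results = {
    "StringIO": supports_with(io.StringIO()),
    "OnlyClose": supports_with(OnlyClose()),
}

# Using the unsupported object directly fails as soon as the with starts.
try:
    with OnlyClose():
        pass
except (AttributeError, TypeError) as exc:
    error_name = type(exc).__name__
```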
You don't. urlopen('http://www.python.org') returns a context manager too:

    with urlopen('http://www.python.org') as page:
        ...  # same loop as in the question
This is documented on the urllib.request.urlopen() page:
For ftp, file, and data urls and requests explicitly handled by legacy URLopener and FancyURLopener classes, this function returns a urllib.response.addinfourl object which can work as context manager [...].
Emphasis mine. For HTTP responses, an http.client.HTTPResponse object is returned, which is also a context manager:
The response is an iterable object and can be used in a with statement.
The Examples section also uses the object as a context manager:
As the python.org website uses utf-8 encoding as specified in its meta tag, we will use the same for decoding the bytes object.
    >>> with urllib.request.urlopen('http://www.python.org/') as f:
    ...     print(f.read(100).decode('utf-8'))
    ...
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtm
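The same pattern can be tried without network access. This sketch assumes Python 3.4 or later, where urlopen also accepts data: URLs; the URL content is made up for the example.

```python
from urllib.request import urlopen

# A data: URL carries its payload inline, so no connection is made.
with urlopen("data:text/plain;charset=utf-8,hello%20world") as page:
    body = page.read().decode("utf-8")
```

No closing() wrapper is involved; the object returned by urlopen is used directly in the with statement.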
Objects returned by open() are context managers too; they implement the special methods object.__enter__() and object.__exit__().
The contextlib.closing() documentation uses an example with urlopen() that is out of date; in Python 2 the predecessor of urllib.request.urlopen() did not produce a context manager, and you needed closing() to auto-close the connection in a with statement. This was fixed with issues 5418 and 12365, but that example was not updated. I created issue 22755 asking for a different example.