
socket ResourceWarning using urllib in Python 3

I am using a urllib.request.urlopen() to GET from a web service I'm trying to test.

This returns an HTTPResponse object, which I then read() to get the response body.

But I always see a ResourceWarning about an unclosed socket from socket.py

Here's the relevant function:

import json
from urllib.request import Request, urlopen

# HEADERS is a dict of request headers defined elsewhere in the test module.

def get_from_webservice(url):
    """ GET from the webservice  """
    req = Request(url, method="GET", headers=HEADERS)
    with urlopen(req) as rsp:
        body = rsp.read().decode('utf-8')
        return json.loads(body)

Here's the warning as it appears in the program's output:

$ ./test/test_webservices.py
/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/socket.py:359: ResourceWarning: unclosed <socket.socket object, fd=5, family=30, type=1, proto=6>
self._sock = None
.s
----------------------------------------------------------------------
Ran 2 tests in 0.010s

OK (skipped=1)

If there's anything I can do to the HTTPResponse (or the Request?) to make it close its socket cleanly, I would really like to know, because this code is for my unit tests; I don't like ignoring warnings anywhere, but especially not there.

scav asked Feb 18 '13 14:02


2 Answers

I don't know if this is the answer, but it is part of the way to an answer.

If I add the header "connection: close" to the response from my web services, the HTTPResponse object seems to clean itself up properly without a warning.

And in fact, the HTTP Spec (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html) says:

HTTP/1.1 applications that do not support persistent connections MUST include the "close" connection option in every message.

So the problem was on the server end (i.e. my fault!). In the event that you don't have control over the headers coming from the server, I don't know what you can do.
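For anyone who wants to reproduce this in a test, here is a minimal sketch of a local server that sends "Connection: close" with every response, plus a client that reads it with urlopen. The handler class, the helper names and the JSON payload are all made up for illustration; they are not from my real web service. The empty ProxyHandler is only there so a system proxy setting cannot get in the way of a localhost request.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


class CloseAfterResponseHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: replies with JSON and 'Connection: close'."""

    protocol_version = "HTTP/1.1"

    def do_GET(self):
        payload = json.dumps({"ok": True}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        # Tell the client that this connection will not be reused.
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def fetch_json(url):
    """GET url and return (Connection header value, decoded JSON body)."""
    opener = build_opener(ProxyHandler({}))  # ignore any system proxy
    with opener.open(Request(url, method="GET"), timeout=5) as rsp:
        body = rsp.read().decode("utf-8")
        return rsp.headers.get("Connection"), json.loads(body)


def run_close_header_demo():
    """Start the server on a free local port, fetch once, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), CloseAfterResponseHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch_json("http://127.0.0.1:%d/" % port)
    finally:
        server.shutdown()
        server.server_close()
```

Running the test suite with warnings enabled (for example `python -W error::ResourceWarning`) is one way to check whether the unclosed-socket warning still shows up against a server like this.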

scav answered Oct 19 '22 21:10


I had the same problem with urllib3, and I just added a context manager to close the connection automatically:

import urllib3

def get(addr, headers):
    """ This function closes the pool's connections after an HTTP request. """
    with urllib3.PoolManager() as conn:
        res = conn.request('GET', addr, headers=headers)
        if res.status == 200:
            return res.data
        else:
            raise ConnectionError(res.reason)

Note that urllib3 is designed to maintain a pool of connections and keep them alive for you. This can significantly speed up your application if it needs to make a series of requests, e.g. a few calls to the same backend API.

Please read urllib3 documentation re connection pools here: https://urllib3.readthedocs.io/en/1.5/pools.html
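To illustrate what keeping a connection alive buys you, here is a small sketch that uses the standard library's http.client instead of urllib3, so it runs without extra packages. It is not how urllib3 implements pooling; it only shows two requests travelling over one socket. The handler and function names are hypothetical, and the local server simply echoes the client's source port so you can see that both requests came from the same connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoClientPortHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: responds with the client's source port."""

    protocol_version = "HTTP/1.1"  # allows persistent (keep-alive) connections

    def do_GET(self):
        body = str(self.client_address[1]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep output quiet


def fetch_twice_on_one_connection(host, port):
    """Send two GETs over a single HTTPConnection and return both results."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        results = []
        for path in ("/first", "/second"):
            conn.request("GET", path)
            rsp = conn.getresponse()
            # The body must be read fully before the connection is reused.
            results.append((rsp.status, rsp.read().decode("utf-8")))
        return results
    finally:
        conn.close()  # explicit close, so no socket is left dangling


def run_keep_alive_demo():
    """Start the server on a free local port, run the client, shut down."""
    server = HTTPServer(("127.0.0.1", 0), EchoClientPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_twice_on_one_connection("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

If both responses report the same client port, the second request reused the first request's socket, which is the saving a connection pool gives you across a series of calls.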

P.S. You could also use the requests library, which is not part of the Python standard library (as of 2019) but is very powerful and simple to use: http://docs.python-requests.org/en/master/

Dmytro Gierman answered Oct 19 '22 20:10