
Detecting hangs with Python urllib2.urlopen

I'm using Python's urllib2 to send an HTTP POST:

import httplib
import socket
import urllib
import urllib2

# Default timeout (in seconds) applied to every new socket
socket.setdefaulttimeout(15)

postdata = urllib.urlencode({'value1': 'a string', 'value2': 'another string'})
headers = {
    'User-Agent': 'Agent',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Accept': 'text/html, */*',
}

try:
    request = urllib2.Request('http://www.example.com', postdata, headers)
    response = urllib2.urlopen(request)
except urllib2.HTTPError as e:
    pass  # Handle here
except urllib2.URLError as e:
    pass  # Handle here
except httplib.HTTPException as e:
    pass  # Handle here

Occasionally a network issue results in the call to urlopen never returning. Other errors, including timeouts, are handled correctly by the except blocks, and we do call socket.setdefaulttimeout(), but there are still cases where urlopen never returns.

I know it never returns because our actual code logs a line before and after the call; when the problem occurs, only the line before the call is logged and the script hangs forever.

What's the best way to detect/handle this?

davidmytton asked Apr 06 '11 11:04


1 Answer

You can use signals. First, set a handler for the signal:

import signal
...
def handler(signum, frame):
    print 'Signal handler called with signal', signum
...
signal.signal(signal.SIGALRM, handler)

Then set an alarm just before the urlopen call:

signal.alarm(5)
response = urllib2.urlopen(request)
signal.alarm(0)  # Cancel the pending alarm

After 5 seconds (or whatever interval you choose), the OS delivers SIGALRM and the handler is called, unless the alarm has already been disabled because urlopen returned. More info about the signal module: http://docs.python.org/library/signal.html
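The handler above only prints a message, so it reports the hang but does not by itself guarantee that the blocked call is abandoned. A common variation is to raise an exception from the handler so that control returns to the caller. The following is a minimal sketch of that idea, not part of the original answer: the names HangTimeout, time_limit and slow_call are hypothetical, and time.sleep() stands in for a urlopen call that does not return.

```python
import signal
import time
from contextlib import contextmanager


class HangTimeout(Exception):
    """Hypothetical exception raised when the guarded call takes too long."""


def _raise_timeout(signum, frame):
    # Raising here interrupts whatever the main thread is currently doing.
    raise HangTimeout("call did not return in time")


@contextmanager
def time_limit(seconds):
    """Raise HangTimeout if the body runs longer than `seconds` (whole seconds)."""
    previous = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.alarm(seconds)
    try:
        yield
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def slow_call():
    # Stand-in for a urlopen call that hangs; sleeps longer than the limit.
    time.sleep(3)
    return "finished"


try:
    with time_limit(1):
        result = slow_call()
except HangTimeout:
    result = "timed out"

print(result)  # timed out
```

In the question's code, the same pattern would wrap the urllib2.urlopen(request) line in a with time_limit(...) block and add an except HangTimeout clause next to the existing ones. Two caveats apply: SIGALRM exists only on Unix-like systems, and signal handlers can only be installed from the main thread, so this approach does not carry over to worker threads.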

Manuel answered Sep 22 '22 08:09