
Skip URL if timeout

I have a list of URLs.

I am using the following to retrieve their contents:

import urllib2

for url in url_list:
    req = urllib2.Request(url)
    resp = urllib2.urlopen(req, timeout=5)  # raises an exception on timeout
    resp_page = resp.read()
    print resp_page

When a request times out, the program crashes with socket.timeout: timed out. I just want to skip to the next URL when that happens. How can I do this?

Thanks

bdhar asked May 09 '26

2 Answers

Although there already is an answer, I'd like to point out that urllib2 might not be solely responsible for this behavior.

As pointed out here (and as the problem description also suggests), the exception may come from the socket library rather than from urllib2.

In that case, just add another except clause:

import socket
import urllib2

# req is the urllib2.Request built in the question's loop
try:
    resp = urllib2.urlopen(req, timeout=5)
except urllib2.URLError:
    print "Bad URL or timeout"
except socket.timeout:
    print "socket timeout"
Jir answered May 11 '26

I'm going to go ahead and assume that by "crashes" you mean "raises a URLError", as described in the urllib2.urlopen docs. See the Errors and Exceptions section of the Python Tutorial.

import urllib2

for url in url_list:
    req = urllib2.Request(url)
    try:
        resp = urllib2.urlopen(req, timeout=5)
    except urllib2.URLError:
        print "Bad URL or timeout"
        continue  # skips to the next iteration of the loop
    resp_page = resp.read()
    print resp_page
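
One caveat worth noting (this matches the point in the other answer): the timeout can also fire while reading the response body, in which case resp.read() raises a raw socket.timeout rather than a URLError. A sketch that covers both cases by moving the read inside the try:

import socket
import urllib2

for url in url_list:
    req = urllib2.Request(url)
    try:
        resp = urllib2.urlopen(req, timeout=5)
        resp_page = resp.read()  # the read itself can also time out
    except (urllib2.URLError, socket.timeout):
        print "Bad URL or timeout"
        continue  # skip to the next URL
    print resp_page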
agf answered May 11 '26

