Overriding urllib2.HTTPError or urllib.error.HTTPError and reading response HTML anyway

Question

I receive an 'HTTP Error 500: Internal Server Error' response, but I still want to read the data inside the error HTML.

With Python 2.6, I normally fetch a page using:

    import urllib2
    url = "http://google.com"
    data = urllib2.urlopen(url)
    data = data.read()

When attempting to use this on the failing URL, I get the exception urllib2.HTTPError:

    urllib2.HTTPError: HTTP Error 500: Internal Server Error

How can I fetch such error pages (with or without urllib2), even though the server returns an Internal Server Error for them?

Note that with Python 3, the corresponding exception is urllib.error.HTTPError.

Joe Holloway · Accepted Answer

The HTTPError is a file-like object. You can catch it and then read its contents.

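    try:
        resp = urllib2.urlopen(url)
        contents = resp.read()
    except urllib2.HTTPError as error:  # 'as' works on Python 2.6 and later
        # The exception carries the response body the server sent with the error.
        contents = error.read()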
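For Python 3, where the question notes the exception is urllib.error.HTTPError, the same pattern might look like the sketch below. The helper name `fetch_body` and the `timeout` default are assumptions made for illustration and do not come from the original answer.

```python
import urllib.error
import urllib.request


def fetch_body(url, timeout=10):
    """Return (status_code, body_bytes), even when the server replies with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as error:
        # HTTPError doubles as a response object: it exposes the status
        # code and the body that the server sent along with the error.
        return error.code, error.read()
```

Calling `fetch_body(url)` on a page that answers with a 500 should then return the pair `(500, <bytes of the error page>)` instead of raising.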

sberry · Answer

If you mean you want to read the body of the 500:

    # data and headers are whatever your request needs (e.g. a POST body and a header dict)
    request = urllib2.Request(url, data, headers)
    try:
        resp = urllib2.urlopen(request)
        print resp.read()
    except urllib2.HTTPError as error:
        print "ERROR: ", error.read()

In your case, you don't need to build a Request object. Just pass the URL directly:

    try:
        resp = urllib2.urlopen(url)
        print resp.read()
    except urllib2.HTTPError as error:
        print "ERROR: ", error.read()

So you don't override urllib2.HTTPError; you just handle the exception.
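Once the exception is caught, the error page usually still has to be decoded from bytes to text. A minimal Python 3 sketch of that step follows; the helper name `error_page_text` and the UTF-8 fallback are assumptions for illustration, not part of either answer.

```python
import urllib.error


def error_page_text(error, default_charset="utf-8"):
    """Decode the body carried by an HTTPError into a string."""
    charset = None
    if error.headers is not None:
        # For real responses, headers is an email.message.Message-style
        # object, so the charset declared in Content-Type can be read from it.
        charset = error.headers.get_content_charset()
    # Fall back to the assumed default when the server declares no charset.
    return error.read().decode(charset or default_charset, errors="replace")
```

Inside an `except urllib.error.HTTPError as error:` block, `error.code` gives the numeric status and `error_page_text(error)` gives the HTML of the error page as text.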

Tags: python, urllib, urllib2, http-error

backus
