
Python 3: error handling for urllib requests

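from difflib import *
import urllib.request,urllib.parse,urllib.error
from urllib.parse import unquote
import socket       # needed for the socket.error / socket.timeout clauses below
import http.client  # needed for the http.client.* clauses below
import time
import pdb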

try:
    file2 = urllib.request.Request('site goes here')
    file2.add_header("User-Agent", 'Opera/9.61 (Windows NT 5.1; U; en) Presto/2.1.1')
    ResponseData = urllib.request.urlopen(file2).read().decode("utf8", 'ignore')
except urllib.error.URLError as e: print('http'); ResponseData = ''
except socket.error as e: ResponseData = ''
except socket.timeout as e: ResponseData = ''
except UnicodeEncodeError as e: ResponseData = ''
except http.client.BadStatusLine as e: ResponseData = ''
except http.client.IncompleteRead as e: ResponseData = ''
except urllib.error.HTTPError as e: ResponseData = ''

Hi, when I run the code above against a page that contains a server-side error such as 'Microsoft VBScript runtime error', the request fails and is caught as urllib.error.URLError, even though the page contains plenty of other HTML. How can I get ALL of the HTML from the page rather than just the exception? I would like to keep my current code as much as possible (if that is possible). Thanks

Rhys asked Aug 18 '12 23:08


2 Answers

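Thank you, I have solved the problem. The body of the error page can be read from the exception itself. Note that read() is only available on urllib.error.HTTPError (a plain URLError carries no response body), so this clause has to catch HTTPError and be placed before the URLError clause:

except urllib.error.HTTPError as e: ResponseData = e.read().decode("utf8", 'ignore')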
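For illustration, here is a minimal sketch of the same idea wrapped in a function. The name fetch_html and the timeout parameter are assumptions made for this example and are not part of the original code. It catches HTTPError specifically, because that is the exception class that also behaves like a response object and therefore has a read() method.

```python
import urllib.error
import urllib.request


def fetch_html(url, timeout=10):
    """Return the page body as text, including the body of HTTP error pages.

    fetch_html and timeout are hypothetical names used for this sketch.
    """
    request = urllib.request.Request(url)
    request.add_header("User-Agent", "Opera/9.61 (Windows NT 5.1; U; en) Presto/2.1.1")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf8", "ignore")
    except urllib.error.HTTPError as e:
        # The server answered with an error status (for example 500).
        # HTTPError doubles as a response object, so the page is still readable.
        return e.read().decode("utf8", "ignore")
    except urllib.error.URLError:
        # No HTTP response was received at all (DNS failure, refused
        # connection, ...), so there is no body to return.
        return ""
```

Called as fetch_html('site goes here'), this should return the HTML of a page that responds with an error status instead of an empty string.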
Rhys answered Sep 19 '22 04:09


URLError has a 'reason' property, so you can call:

except urllib.error.URLError as e: ResponseData = e.reason

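(For a 403 response, for example, this would be 'Forbidden'. Note that reason can be either a message string or another exception instance.)

You should also be careful to catch subclasses before their superclass. HTTPError is a subclass of URLError, so in your example the HTTPError clause must come before the URLError clause. Otherwise the URLError clause catches every HTTPError first and the HTTPError clause never runs.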
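The ordering rule can be checked without any network access. The sketch below raises an exception by hand and reports which clause handled it. The helper names classify and classify_wrong_order are made up for this illustration.

```python
import urllib.error


def classify(exc):
    """Most specific clause first: HTTPError is handled by its own clause."""
    try:
        raise exc
    except urllib.error.HTTPError:
        return "HTTPError"
    except urllib.error.URLError:
        return "URLError"


def classify_wrong_order(exc):
    """Superclass first: the HTTPError clause below can never run."""
    try:
        raise exc
    except urllib.error.URLError:
        return "URLError"
    except urllib.error.HTTPError:  # unreachable, URLError already matched
        return "HTTPError"
```

With an HTTPError instance, classify reports the HTTPError clause while classify_wrong_order reports the URLError clause, which is the situation in the question's code.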

mmagician answered Sep 20 '22 04:09