HTTP Error 403: Forbidden while downloading file using urllib

I have this line of code: urllib.request.urlretrieve('http://lolupdater.com/downloads/LPB.exe', 'LPBtest.exe'). When I run it, it raises urllib.error.HTTPError: HTTP Error 403: Forbidden.

Jakub Bláha asked Jul 27 '17 18:07


1 Answer

That looks like a genuine HTTP 403 Forbidden response. Python's urllib raises urllib.error.HTTPError when the server returns an HTTP error status code (see the urllib.error documentation). In general, 403 means: "The server understood the request, but is refusing to fulfill it." Some servers refuse requests that carry urllib's default User-Agent, so you will likely need to send HTTP headers that identify your client to avoid the 403 (see the urllib.request documentation on headers). Here is an example using urlopen:

import urllib.request

# Send a browser-like User-Agent instead of urllib's default one.
req = urllib.request.Request('http://lolupdater.com/downloads/LPB.exe', headers={'User-Agent': 'Mozilla/5.0'})
response = urllib.request.urlopen(req)
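The urlopen snippet above fetches the response but does not write it to disk. Here is a minimal sketch of the full download, assuming the server accepts a browser-like User-Agent. The helper names build_request and download are hypothetical and only used for illustration.

```python
import shutil
import urllib.error
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, dest, user_agent="Mozilla/5.0"):
    """Stream url to dest; return True on success, False on an HTTP error."""
    req = build_request(url, user_agent)
    try:
        # urlopen runs first, so dest is not created if the request is refused.
        with urllib.request.urlopen(req) as response, open(dest, "wb") as outfile:
            shutil.copyfileobj(response, outfile)
    except urllib.error.HTTPError as err:
        # The server may still refuse the request (for example with another 403).
        print(f"Download failed: HTTP {err.code} {err.reason}")
        return False
    return True
```

Calling download('http://lolupdater.com/downloads/LPB.exe', 'LPBtest.exe') would then save the file, or report the status code if the server still refuses the request.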

In Python 3, urllib.request.urlretrieve() is part of urllib's legacy interface. I would recommend the Requests library for this; here is a working example:

import requests

url = 'http://lolupdater.com/downloads/LPB.exe'
r = requests.get(url)
r.raise_for_status()  # raise on 4xx/5xx instead of saving an error page as the .exe
with open('LPBtest.exe', 'wb') as outfile:
    outfile.write(r.content)
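The Requests example above sends the library's default User-Agent and holds the whole file in memory. If the server also rejects that default, or the file is large, a variant along these lines may help. It is only a sketch: DEFAULT_HEADERS, download_with_requests, the chunk size and the timeout are illustrative choices, not part of the original answer.

```python
import requests

# Hypothetical default; any browser-like string could be substituted.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0"}


def download_with_requests(url, dest, headers=DEFAULT_HEADERS, chunk_size=8192):
    """Stream url to dest in chunks; raises requests.HTTPError on 4xx/5xx."""
    with requests.get(url, headers=headers, stream=True, timeout=30) as r:
        # Fail before opening dest so no partial file is left behind on a 403.
        r.raise_for_status()
        with open(dest, "wb") as outfile:
            for chunk in r.iter_content(chunk_size=chunk_size):
                outfile.write(chunk)
```

With stream=True the body is fetched piece by piece through iter_content, so memory use stays roughly bounded by chunk_size rather than by the size of the file.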
andrew answered Nov 15 '22 08:11