
Python requests. 403 Forbidden

I needed to parse a site, but I got a 403 Forbidden error. Here is the code:

    import requests

    url = 'http://worldagnetwork.com/'
    result = requests.get(url)
    print(result.content.decode())

Its output:

    <html>
    <head><title>403 Forbidden</title></head>
    <body bgcolor="white">
    <center><h1>403 Forbidden</h1></center>
    <hr><center>nginx</center>
    </body>
    </html>

Can you tell me what the problem is?

Толкачёв Иван asked Jul 20 '16 19:07


People also ask

How do I fix 403 Forbidden in Python?

A common way to resolve the error is to pass a valid User-Agent as a request header. Separately, you can set a timeout in case the website does not respond at all. A timeout does not change a 403 response, but Python raises a timeout exception (requests.exceptions.Timeout when using requests) if the website doesn't respond within the given period.
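Both ideas, an explicit User-Agent and a timeout, can be sketched with only the standard library's urllib.request in place of requests. The names build_request, fetch and BROWSER_UA are hypothetical helpers introduced for illustration, and the User-Agent value is just an example of a browser-like string. This is a minimal sketch, not the only way to do it.

```python
import urllib.error
import urllib.request

# Example browser-like value (an assumption); copy a real one from your own browser.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Return a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10.0):
    """Return (status, body). HTTP errors such as 403 are returned, not raised.

    If the server does not answer within `timeout` seconds, the timeout
    exception raised by urlopen is left to propagate to the caller.
    """
    request = build_request(url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses surface as HTTPError; its code is the status.
        return error.code, error.read()
```

Calling fetch('http://worldagnetwork.com/') would then return the status code together with the body, so a 403 can be inspected instead of crashing the script.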

What is the meaning of HTTP status code 403?

The HTTP 403 Forbidden response status code indicates that the server understands the request but refuses to authorize it. This status is similar to 401, but for the 403 Forbidden status code, re-authenticating makes no difference.
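The difference between 401 and 403 can be made concrete with a small helper built on the standard library's http.HTTPStatus. The function next_step and the strings it returns are hypothetical, chosen only to illustrate how a client might react to each code.

```python
from http import HTTPStatus


def next_step(status_code):
    """Suggest how a client might react to an authorization-related status."""
    status = HTTPStatus(status_code)
    if status is HTTPStatus.UNAUTHORIZED:
        # 401: credentials are missing or invalid, so authenticating can help.
        return "authenticate and retry"
    if status is HTTPStatus.FORBIDDEN:
        # 403: the server understood the request and refuses it anyway.
        return "change the request; new credentials will not help"
    return "no authorization problem"
```

In the question's case the status is 403, so the useful change is to the request itself (here, its headers), not to any credentials.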


1 Answer

It seems the page rejects GET requests that do not send a browser-like User-Agent (by default, requests identifies itself as python-requests/<version>). I visited the page with a browser (Chrome) and copied the User-Agent header of the GET request (you can find it in the Network tab of the developer tools):

    import requests

    url = 'http://worldagnetwork.com/'
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
    result = requests.get(url, headers=headers)
    print(result.content.decode())

    # <!doctype html>
    # <!--[if lt IE 7 ]><html class="no-js ie ie6" lang="en"> <![endif]-->
    # <!--[if IE 7 ]><html class="no-js ie ie7" lang="en"> <![endif]-->
    # <!--[if IE 8 ]><html class="no-js ie ie8" lang="en"> <![endif]-->
    # <!--[if (gte IE 9)|!(IE)]><!--><html class="no-js" lang="en"> <!--<![endif]-->
    # ...
A. Garcia-Raboso answered Sep 18 '22 00:09