Python Requests library returns wrong status code

The Python code below returns '403':

import requests
url = 'http://bedstardirect.co.uk/star-collection-braemar-double-bedstead.html'
r = requests.get(url)
print(r.status_code)

But this page is valid and the script should return '200', as the Perl script below does:

use WWW::Mechanize;
my $mech = WWW::Mechanize->new();
my $url = 'http://bedstardirect.co.uk/star-collection-braemar-double-bedstead.html';
$mech->get($url);
print $mech->status,"\n";

I have also checked with Firebug in Firefox and all requests have a '200' status code.

I use Python Requests v1.2.0.

asked Mar 25 '23 04:03 by David M.


1 Answer

It seems this particular server checks the User-Agent header.

Try:
r = requests.get('http://bedstardirect.co.uk/star-collection-braemar-double-bedstead.html', headers={'User-Agent': 'a user agent'})

Edit:
The default User-Agent on requests for my machine comes out as: python-requests/1.2.0 CPython/2.7.4 Darwin/12.3.0
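If you want to see what your own installation sends, something like the following should print it. This is only a sketch: the exact string differs between requests versions and platforms, so yours will probably not match mine.

```python
import requests

# default_user_agent() returns the string requests sends when the
# caller does not supply a User-Agent header of their own.
default_ua = requests.utils.default_user_agent()
print(default_ua)

# A freshly created Session carries the same value in its default headers.
session = requests.Session()
print(session.headers["User-Agent"])
```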

After some testing I found that any User-Agent that contains the word python will fail on this server.
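If you make several requests to this site, one option is to set the header once on a Session instead of passing it on every call. A minimal sketch, assuming the server only objects to the word python: the `CUSTOM_USER_AGENT` value and the `make_session` and `fetch_status` helpers are names I made up for illustration, not anything the site or the library prescribes.

```python
import requests

# Hypothetical User-Agent string (an assumption); based on the testing
# described above, any value without the word "python" should work.
CUSTOM_USER_AGENT = "Mozilla/5.0 (compatible; status-check/0.1)"


def make_session(user_agent=CUSTOM_USER_AGENT):
    """Return a Session that sends user_agent instead of the default."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch_status(session, url):
    """Fetch url with the given session and return the HTTP status code."""
    response = session.get(url, timeout=10)
    return response.status_code
```

Calling `fetch_status(make_session(), url)` with the URL from the question should then report whatever status the server returns for the custom header.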

answered Apr 06 '23 07:04 by David McKeone