
Python's requests library timing out but getting the response from the browser

I am trying to build a web scraper for NBA data. When I run the code below:

import requests

response = requests.get('https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=10%2F20%2F2017&DateTo=10%2F20%2F2017&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2017-18&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight=')

the request times out with this error:

  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\api.py", line 70, in get
    return request('get', url, params=params, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\adapters.py", line 473, in send
    raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', OSError("(10060, 'WSAETIMEDOUT')",))

However, when I open the same URL in a browser, I get a response.

Michail N, asked Oct 21 '17 11:10


People also ask

What is the default timeout for Python requests?

The default timeout is None, which means the call waits (hangs) until the connection is closed.

Does Python requests wait for response?

It will wait until the response arrives before the rest of your program will execute. If you want to be able to do other things, you will probably want to look at the asyncio or multiprocessing modules. (Chad S.)
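Since the default timeout is None, it can help to set one explicitly so a stalled request fails quickly instead of hanging. The following is a minimal sketch, not a definitive implementation: `fetch_text` and its parameter names are hypothetical, the timeout values are arbitrary, and `get` stands for any requests-style callable such as `requests.get`.

```python
def fetch_text(get, url, connect_timeout=5.0, read_timeout=30.0):
    """Fetch `url` with an explicit timeout; return the body text or None.

    `get` is any requests-style callable, for example `requests.get`.
    """
    try:
        # requests accepts a (connect, read) tuple as the timeout argument.
        response = get(url, timeout=(connect_timeout, read_timeout))
    except OSError as exc:
        # requests.exceptions.RequestException derives from IOError/OSError,
        # so connection errors and timeouts raised by requests land here.
        print(f"Request failed: {exc}")
        return None
    return response.text


# Usage (requires the requests package and network access):
#   import requests
#   body = fetch_text(requests.get, "https://stats.nba.com/stats/leaguedashplayerstats")
```

Passing the callable in keeps the helper easy to exercise without network access. In a real script you would probably call `requests.get(url, timeout=...)` directly.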


3 Answers

It looks like the website checks the "User-Agent" header of the request. You can set a browser-like "User-Agent" in your request so that it appears to come from a real browser, and you should then receive the response.

For example:

import requests
url = "https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=10%2F20%2F2017&DateTo=10%2F20%2F2017&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2017-18&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight="

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}
# it's the user-agent of my browser ^ 

response = requests.get(url, headers=headers)
response.status_code    # will return: 200

response.text      # will return the website content

You can find the user-agent of your browser from here.
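The long query string can also be passed as a dictionary through the `params` argument of `requests.get`, which keeps the URL readable next to the `headers` argument. This is only a sketch under assumptions: `build_request_kwargs` and `SHORT_URL` are hypothetical names, and `SHORT_URL` is a shortened version of the URL from the question (the real endpoint may need the full parameter list).

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit

# User-Agent string taken from the answer above.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
)


def build_request_kwargs(full_url, user_agent=BROWSER_UA):
    """Split a URL into base URL, query parameters, and browser-like headers."""
    parts = urlsplit(full_url)
    # keep_blank_values=True preserves empty parameters such as "College=".
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    base_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return {
        "url": base_url,
        "params": params,
        "headers": {"User-Agent": user_agent},
    }


# Shortened, illustrative version of the URL from the question.
SHORT_URL = (
    "https://stats.nba.com/stats/leaguedashplayerstats"
    "?College=&DateFrom=10%2F20%2F2017&Season=2017-18&SeasonType=Regular+Season"
)
request_kwargs = build_request_kwargs(SHORT_URL)

# Usage (requires the requests package and network access):
#   import requests
#   response = requests.get(timeout=(5, 30), **request_kwargs)
```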

Moinuddin Quadri, answered Oct 18 '22 04:10


If it still does not work, use these headers:

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36',
    # requests only decodes 'br' (Brotli) responses if the brotli or brotlicffi package is installed
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9,hi;q=0.8',
}

curious, answered Oct 18 '22 02:10


If the other headers do not work, try these headers. They worked well for me.

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1 Safari/605.1.15",
    "Accept-Language": "en-gb",
    "Accept-Encoding": "br, gzip, deflate",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Referer": "http://www.google.com/",
}

I collected these headers from this link.
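The answers suggest trying progressively richer header sets until one works. That idea can be sketched as a small loop. This is an illustration under assumptions, not a tested recipe for this site: `fetch_with_header_fallback` is a hypothetical helper, `get` stands for a requests-style callable such as `requests.get`, and treating only HTTP 200 as success is an arbitrary choice.

```python
def fetch_with_header_fallback(get, url, header_sets, timeout=(5, 30)):
    """Try each header dict in order; return the first HTTP 200 response."""
    last_error = None
    for headers in header_sets:
        try:
            response = get(url, headers=headers, timeout=timeout)
        except OSError as exc:
            # requests' ConnectionError and Timeout are OSError subclasses.
            last_error = exc
            continue
        if response.status_code == 200:
            return response
        last_error = RuntimeError(f"HTTP {response.status_code}")
    raise RuntimeError("no header set produced a successful response") from last_error


# Usage (requires the requests package and network access):
#   import requests
#   response = fetch_with_header_fallback(
#       requests.get, url, [minimal_headers, fuller_headers]
#   )
```

Here `minimal_headers` and `fuller_headers` are placeholders for the header dictionaries shown in the answers, ordered from the simplest to the most complete.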

Vishal Stark, answered Oct 18 '22 04:10