Python's requests library timing out but getting the response from the browser

Tags: python, python-requests, user-agent, web-scraping

I am trying to create a web scraper for NBA data. When I run the code below:

import requests

response = requests.get('https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=10%2F20%2F2017&DateTo=10%2F20%2F2017&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2017-18&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight=')

the request times out with this error:

  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\api.py", line 70, in get
    return request('get', url, params=params, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\adapters.py", line 473, in send
    raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', OSError("(10060, 'WSAETIMEDOUT')",))

However, when I open the same URL in a browser, I get a response.

asked Oct 21 '17 11:10

Michail N

3 Answers

It looks like the website checks the "User-Agent" header of the request. You can set a browser-like "User-Agent" in your request so that it appears to come from an actual browser, and you'll receive the response.

For example:

import requests
url = "https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=10%2F20%2F2017&DateTo=10%2F20%2F2017&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2017-18&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight="

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}
# it's the user-agent of my browser ^ 

response = requests.get(url, headers=headers)
response.status_code    # will return: 200

response.text      # will return the website content

You can find the user-agent of your browser from here.
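
As a minimal sketch of the same idea (not part of the original answer): a `requests.Session` can carry the browser-like "User-Agent" on every call, and an explicit `timeout` makes a blocked request fail quickly instead of hanging. The helper names `make_session` and `fetch` and the timeout values are illustrative assumptions.

```python
import requests

# Browser-like User-Agent string, same value as in the example above.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/61.0.3163.100 Safari/537.36"
)


def make_session(user_agent=BROWSER_UA):
    """Return a Session that sends `user_agent` with every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch(session, url, timeout=(5, 30)):
    """GET `url`; timeout is (connect seconds, read seconds), assumed values."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of returning silently
    return response


if __name__ == "__main__":
    # Without a custom header, requests identifies itself like this:
    print(requests.utils.default_user_agent())
    session = make_session()
    # Inspect the header that would be sent, without touching the network.
    prepared = session.prepare_request(
        requests.Request("GET", "https://stats.nba.com/stats/leaguedashplayerstats")
    )
    print(prepared.headers["User-Agent"])
```

Printing the default value shows why a site that filters on this header might treat a bare `requests.get` differently from a browser: the library announces itself as `python-requests/<version>` unless told otherwise.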

answered Oct 18 '22 04:10

Moinuddin Quadri

If it's still not working, use these headers:

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9,hi;q=0.8',
}

answered Oct 18 '22 02:10

curious

If the other headers do not work, try these; they worked pretty well for me.

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1 Safari/605.1.15",
    "Accept-Language": "en-gb",
    "Accept-Encoding": "br, gzip, deflate",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",  # was "test/html", a typo
    "Referer": "http://www.google.com/",
}

I collected these headers from this link.
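
The trial and error these answers describe can be sketched as a small loop (an illustration, not something from the original answers): try each candidate header set in turn with a timeout and keep the first one that gets a 200. The function name `first_working_headers`, the `get` parameter (it defaults to `requests.get` and exists only so the loop can be exercised without network access), and the 10-second timeout are assumptions.

```python
import requests


def first_working_headers(url, candidates, get=requests.get, timeout=10):
    """Return (headers, response) for the first header set that yields HTTP 200.

    Returns (None, None) if every candidate fails or times out.
    """
    for headers in candidates:
        try:
            response = get(url, headers=headers, timeout=timeout)
        except requests.exceptions.RequestException:
            # Covers ConnectionError and Timeout; move on to the next set.
            continue
        if response.status_code == 200:
            return headers, response
    return None, None
```

Passing the header dictionaries from the answers as `candidates`, in order from smallest to largest, would show which is the minimal set the server accepts.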

answered Oct 18 '22 04:10

Vishal Stark
