
Python: requests.get, iterating url in a loop

Tags:

python

I am trying to get information from stats.nba.com by iterating requests.get(url) in a for loop, where the url changes at every iteration. If I just iterate it once it works but twice or more seems to give errors and I'm not sure why. I'm new to programming so any info will be helpful. Thanks in advance. Here's my code:

import requests
import json

team_id = 1610612737

def get_data(url):
    response = requests.get(url)
    if response.status_code == 200:
        data = response.json()
        return data
    else:
        print(response.text)
        print(response.status_code)

for i in range(30): # 30 NBA Teams
    base_url = "http://stats.nba.com/stats/teamdetails?teamID="   
    team_url = base_url + str(team_id)
    data = get_data(team_url)

    ## Do stuff ##

    team_id += 1

If I do 'for i in range(1):' it works, but I get status_code = 400 for each iteration if the range is greater than 1. Thanks for the help!

pl0222 asked Apr 26 '16 01:04


1 Answer

The website appears to limit requests per second, so you'll need to either send browser-like request headers or put a delay between requests in your script. The first option is the quicker and likely the more reliable of the two.

Headers Method:

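# Add this under team_id = 1610612737.
# Note the trailing space in each piece of the user-agent string:
# adjacent string literals are joined with nothing in between.

HEADERS = {'user-agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) '
                          'AppleWebKit/537.36 (KHTML, like Gecko) '
                          'Chrome/45.0.2454.101 Safari/537.36'),
           'referer': 'http://stats.nba.com/scores/'}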

Then pass the headers in the requests.get call inside get_data:

response = requests.get(url, headers=HEADERS)

Note: you shouldn't need a delay in your script at all if you use this method.
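
Putting the pieces together, the headers method might look like the sketch below. It assumes the endpoint still behaves as described in this answer. `BASE_URL`, `FIRST_TEAM_ID`, `make_session`, and `fetch_all_teams` are names made up for illustration, and using `requests.Session` (so the headers are set once), `params=`, and `timeout=` is a convenience swapped in here, not part of the original code.

```python
import requests

BASE_URL = "http://stats.nba.com/stats/teamdetails"
FIRST_TEAM_ID = 1610612737  # starting ID from the question

# Browser-like headers; each user-agent piece ends with a space so the
# concatenated string is well formed.
HEADERS = {
    "user-agent": ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/45.0.2454.101 Safari/537.36"),
    "referer": "http://stats.nba.com/scores/",
}


def make_session():
    """Create a session that sends HEADERS with every request."""
    session = requests.Session()
    session.headers.update(HEADERS)
    return session


def get_data(session, team_id):
    """Return the parsed JSON for one team, or None on a non-200 response."""
    response = session.get(BASE_URL, params={"teamID": team_id}, timeout=10)
    if response.status_code == 200:
        return response.json()
    print(response.status_code, response.text[:200])
    return None


def fetch_all_teams(session, count=30):
    """Fetch `count` consecutive team IDs, keyed by team ID."""
    return {
        team_id: get_data(session, team_id)
        for team_id in range(FIRST_TEAM_ID, FIRST_TEAM_ID + count)
    }
```

To run it, call `results = fetch_all_teams(make_session())`. Passing the session in as an argument also makes it easy to substitute a stub while testing, so you don't hit the site on every run.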

Delay Method:

import time  # at the top of your script

time.sleep(10)  # inside your for loop: pauses for 10 seconds on each iteration

Using a delay seems to be hit or miss, so I wouldn't recommend it unless absolutely necessary.
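
If you do go with the delay, one way to structure it is sketched below. `fetch_with_delay` is a hypothetical helper, not part of the original code: `fetch` stands for whatever function performs a single request (for example, a wrapper around `get_data`), and the `sleep` parameter exists only so the pause can be replaced in tests. The 10-second default mirrors the value above and is not a documented limit.

```python
import time


def fetch_with_delay(fetch, team_ids, delay_seconds=10, sleep=time.sleep):
    """Call fetch(team_id) for each ID, pausing between consecutive calls."""
    results = {}
    for index, team_id in enumerate(team_ids):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        results[team_id] = fetch(team_id)
    return results
```

Skipping the pause before the first request saves one delay per run; with 30 teams and a 10-second delay the loop still spends about 290 seconds waiting, which is part of why the headers method is preferable.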

l'L'l answered Oct 06 '22 15:10