I need to retrieve the 500 most popular films from a REST API, but the results are limited to 20 per page and I can make only 40 calls every 10 seconds (https://developers.themoviedb.org/3/getting-started/request-rate-limiting). I am unable to loop through the paginated results dynamically so that the 500 most popular results end up in a single list.
I can successfully return the top 20 most popular films (see below) and enumerate them, but I am stuck on the loop that would let me paginate through the top 500 without hitting the API rate limit.
import requests  # to make TMDB API calls

# Discover API URL filtered to movies >= 2004 and containing Drama genre_ID: 18
discover_api = (
    "https://api.themoviedb.org/3/discover/movie?"
    "api_key=['my api key']&language=en-US&sort_by=popularity.desc"
    "&include_adult=false&include_video=false"
    "&primary_release_year=>%3D2004&with_genres=18"
)
# Returning all drama films >= 2004 in popularity desc
discover_api = requests.get(discover_api).json()
most_popular_films = discover_api['results']
# Printing movie_id and movie_title by popularity desc
for i, film in enumerate(most_popular_films):
    print(i, film['id'], film['title'])
Sample response:
{
  "page": 1,
  "total_results": 101685,
  "total_pages": 5085,
  "results": [
    {
      "vote_count": 13,
      "id": 280960,
      "video": false,
      "vote_average": 5.2,
      "title": "Catarina and the others",
      "popularity": 130.491,
      "poster_path": "/kZMCbp0o46Tsg43omSHNHJKNTx9.jpg",
      "original_language": "pt",
      "original_title": "Catarina e os Outros",
      "genre_ids": [
        18,
        9648
      ],
      "backdrop_path": "/9nDiMhvL3FtaWMsvvvzQIuq276X.jpg",
      "adult": false,
      "overview": "Outside, the first sun rays break the dawn. Sixteen years old Catarina can't fall asleep. Inconsequently, in the big city adults are moved by desire... Catarina found she is HIV positive. She wants to drag everyone else along.",
      "release_date": "2011-03-01"
    },
    {
      "vote_count": 9,
      "id": 531309,
      "video": false,
      "vote_average": 4.6,
      "title": "Brightburn",
      "popularity": 127.582,
      "poster_path": "/roslEbKdY0WSgYaB5KXvPKY0bXS.jpg",
      "original_language": "en",
      "original_title": "Brightburn",
      "genre_ids": [
        27,
        878,
        18,
        53
      ],
I need the Python loop to append the paginated results to a single list until I have captured the 500 most popular films.
Desired Output:
Movie_ID Movie_Title
280960 Catarina and the others
531309 Brightburn
438650 Cold Pursuit
537915 After
50465 Glass
457799 Extremely Wicked, Shockingly Evil and Vile
Paginated JSON will usually have an object with links to the previous and next JSON pages. To get the previous page, you must send a request to the "prev" URL. To get to the next page, you must send a request to the "next" URL. This will deliver a new JSON with new results and new links for the next and previous pages.
Most APIs include a next_url field to help you loop through all results. Let's examine some cases.
No next_url field
You can just loop through all pages until the results field is empty:
import requests  # to make TMDB API calls

# Discover API URL filtered to movies >= 2004 and containing Drama genre_ID: 18
discover_api_url = (
    "https://api.themoviedb.org/3/discover/movie?"
    "api_key=['my api key']&language=en-US&sort_by=popularity.desc"
    "&include_adult=false&include_video=false"
    "&primary_release_year=>%3D2004&with_genres=18"
)
most_popular_films = []
new_results = True
page = 1
# Request page after page until a response carries no results
while new_results:
    discover_api = requests.get(discover_api_url + f"&page={page}").json()
    new_results = discover_api.get("results", [])
    most_popular_films.extend(new_results)
    page += 1
# Printing movie_id and movie_title by popularity desc
for i, film in enumerate(most_popular_films):
    print(i, film['id'], film['title'])
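The loop above keeps requesting pages until the results run out and does not pace its requests. Here is a minimal sketch of one way to stop at 500 films while staying under the 40-calls-per-10-seconds limit mentioned in the question. The names `collect_top_films` and `fetch_page`, and the `clock` and `sleep` parameters, are hypothetical and introduced only for this example. The sliding-window throttle is an assumption about how to respect the limit, not something the API prescribes.

```python
import time


def collect_top_films(fetch_page, limit=500, max_calls=40, window=10.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Collect up to `limit` results, making at most `max_calls` per `window` seconds.

    `fetch_page(page)` is a caller-supplied function (hypothetical) that
    returns the decoded JSON for one page as a dict.
    """
    films = []
    call_times = []  # timestamps of calls made inside the current window
    page = 1
    while len(films) < limit:
        now = clock()
        # Forget calls that have dropped out of the sliding window
        call_times = [t for t in call_times if now - t < window]
        if len(call_times) >= max_calls:
            # Wait until the oldest call in the window expires, then re-check
            sleep(window - (now - call_times[0]))
            continue
        call_times.append(clock())
        results = fetch_page(page).get("results", [])
        if not results:
            break  # no more pages
        films.extend(results)
        page += 1
    return films[:limit]
```

With requests, `fetch_page` could be `lambda page: requests.get(discover_api_url + f"&page={page}").json()`. At 20 results per page, 500 films take 25 calls, so for this query the throttle should never actually sleep; it only matters if you raise `limit`.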
total_pages field
If the response reports total_pages, you can iterate over a known range of pages:
import requests  # to make TMDB API calls

# Discover API URL filtered to movies >= 2004 and containing Drama genre_ID: 18
discover_api_url = (
    "https://api.themoviedb.org/3/discover/movie?"
    "api_key=['my api key']&language=en-US&sort_by=popularity.desc"
    "&include_adult=false&include_video=false"
    "&primary_release_year=>%3D2004&with_genres=18"
)
# The first request returns page 1 together with the total page count
discover_api = requests.get(discover_api_url).json()
most_popular_films = discover_api["results"]
for page in range(2, discover_api["total_pages"] + 1):
    discover_api = requests.get(discover_api_url + f"&page={page}").json()
    most_popular_films.extend(discover_api["results"])
# Printing movie_id and movie_title by popularity desc
for i, film in enumerate(most_popular_films):
    print(i, film['id'], film['title'])
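Since only 500 films are wanted, there is no need to walk every page that total_pages reports. A small sketch of the arithmetic, where `pages_needed` is a hypothetical helper name and the figures come from the question and its sample response:

```python
import math


def pages_needed(wanted, per_page, total_pages):
    """Number of pages to request to cover `wanted` results, capped at what exists."""
    return min(math.ceil(wanted / per_page), total_pages)


# Values from the question: 500 films, 20 per page, 5085 pages available
print(pages_needed(500, 20, 5085))  # 25
```

The loop bound would then be `range(2, pages_needed(500, 20, discover_api["total_pages"]) + 1)`, which is 25 requests in total and stays below the 40-call limit.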
next_url field exists! Yay!
Same idea, only now we check whether the next_url field is empty. If it is, it's the last page.
import requests  # to make TMDB API calls

# Discover API URL filtered to movies >= 2004 and containing Drama genre_ID: 18
discover_api_url = (
    "https://api.themoviedb.org/3/discover/movie?"
    "api_key=['my api key']&language=en-US&sort_by=popularity.desc"
    "&include_adult=false&include_video=false"
    "&primary_release_year=>%3D2004&with_genres=18"
)
discover_api = requests.get(discover_api_url).json()
most_popular_films = discover_api["results"]
# Follow next_url until it is missing or empty (the last page)
while discover_api.get("next_url"):
    discover_api = requests.get(discover_api["next_url"]).json()
    most_popular_films.extend(discover_api["results"])
# Printing movie_id and movie_title by popularity desc
for i, film in enumerate(most_popular_films):
    print(i, film['id'], film['title'])
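The same link-following idea can be wrapped in a generator so the caller decides how many results to take. This is only a sketch: `iter_results` and `get_json` are hypothetical names, the `next_url` and `results` keys are assumed to match the API's response, and a dict of fake pages stands in for real requests.

```python
from itertools import islice


def iter_results(first_url, get_json, next_key="next_url", results_key="results"):
    """Yield results one by one, following next-page links until none is left."""
    url = first_url
    while url:
        payload = get_json(url)
        yield from payload.get(results_key, [])
        url = payload.get(next_key)


# Stand-in for an API: two linked pages held in a dict
fake_pages = {
    "/films?page=1": {"results": [{"id": 1}, {"id": 2}], "next_url": "/films?page=2"},
    "/films?page=2": {"results": [{"id": 3}], "next_url": None},
}

# Take only the first two results; the second page is never fetched
first_two = list(islice(iter_results("/films?page=1", fake_pages.__getitem__), 2))
print([film["id"] for film in first_two])  # [1, 2]
```

Against a real API, `get_json` could be `lambda url: requests.get(url).json()`, and `islice(..., 500)` would stop after the 500th film.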