I'm playing around with the Angel List (AL) API and want to pull all jobs in San Francisco. Since I couldn't find an active Python wrapper for the API (if I make any headway, I think I'd like to make my own), I'm using the requests library.
The AL API's results are paginated, and I can't figure out how to move beyond the first page of the results.
Here is my code:
import requests

r_sanfran = requests.get("https://api.angel.co/1/tags/1664/jobs").json()
r_sanfran.keys()  # returns [u'per_page', u'last_page', u'total', u'jobs', u'page']
r_sanfran['last_page']  # returns 16
r_sanfran['page']  # returns 1
I tried adding arguments to requests.get, but that didn't work. I also tried something really dumb: changing the value of the 'page' key, as if that was magically going to paginate for me, e.g. r_sanfran['page'] = 2
I'm guessing it's something relatively simple, but I can't seem to figure it out so any help would be awesome.
Thanks as always.
Angel List API documentation if it's helpful.
Paginated JSON will usually include an object with links to the previous and next pages. To get the previous page, send a request to the "prev" URL; to get the next page, send a request to the "next" URL. Each response is a new JSON document with new results and new links for the next and previous pages.
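As a rough illustration of following "next" links, here is a minimal sketch. It assumes each page is a dict that carries the URL of the following page under a "next" key (absent or None on the last page); that key name, the `iter_linked_pages` helper, and the fake pages are all made up for the example. Note that the Angel List response shown in the question exposes 'page' and 'last_page' rather than link URLs, so this pattern may not apply to it directly.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_linked_pages(first_url: str, fetch: Callable[[str], Dict]) -> Iterator[Dict]:
    """Yield each page, following the assumed "next" link until there is none."""
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        yield page
        # "next" is an assumed key name; real APIs vary.
        url = page.get("next")


# Stand-in for real HTTP calls so the sketch runs without a network.
FAKE_PAGES = {
    "/items?page=1": {"items": [1, 2], "next": "/items?page=2"},
    "/items?page=2": {"items": [3], "next": None},
}

pages = list(iter_linked_pages("/items?page=1", FAKE_PAGES.__getitem__))
for page in pages:
    print(page["items"])
```

With requests, the `fetch` argument could be something like `lambda url: requests.get(url).json()`.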
Improving on @alecxe's answer: if you use a Python generator and a requests HTTP session, you can improve performance and reduce resource usage when querying many pages or very large pages.
import requests

session = requests.Session()

def get_jobs():
    url = "https://api.angel.co/1/tags/1664/jobs"
    # The first response tells us how many pages there are.
    first_page = session.get(url).json()
    yield first_page
    num_pages = first_page['last_page']

    # Request the remaining pages one at a time via the 'page' query parameter.
    for page in range(2, num_pages + 1):
        next_page = session.get(url, params={'page': page}).json()
        yield next_page

for page in get_jobs():
    # TODO: process the page
    pass
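If you want the individual jobs rather than whole pages, the same idea can be flattened one step further. This is only a sketch: it assumes each page is a dict with the 'jobs' and 'last_page' keys shown in the question, and `iter_jobs`, `fetch_page`, and the fake data (including the "title" field) are hypothetical names used so the example runs without touching the network.

```python
from typing import Callable, Dict, Iterator


def iter_jobs(fetch_page: Callable[[int], Dict]) -> Iterator[Dict]:
    """Yield every job from every page, fetching pages lazily."""
    first = fetch_page(1)
    yield from first["jobs"]
    for number in range(2, first["last_page"] + 1):
        yield from fetch_page(number)["jobs"]


def fake_fetch_page(number: int) -> Dict:
    """Stand-in for an HTTP request; returns a page shaped like the API response."""
    titles = {1: ["a", "b"], 2: ["c"]}
    return {
        "page": number,
        "last_page": 2,
        "jobs": [{"title": t} for t in titles[number]],
    }


titles = [job["title"] for job in iter_jobs(fake_fetch_page)]
print(titles)
```

Against the real API, `fetch_page` could be something like `lambda n: session.get(url, params={'page': n}).json()`, reusing the session from the answer above.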