I am having trouble wrapping my head around Python 3's asyncio library. I have a list of zipcodes and I am trying to make async calls to an API to get each zipcode's corresponding city and state. I can do it successfully in sequence with a for loop, but I want to make it faster for the case of a big zipcode list.
This is an example of my original code, which works:
import urllib.request, json

zips = ['90210', '60647']

def get_cities(zipcodes):
    zip_cities = dict()
    for idx, zipcode in enumerate(zipcodes):
        url = 'http://maps.googleapis.com/maps/api/geocode/json?address='+zipcode+'&sensor=true'
        response = urllib.request.urlopen(url)
        string = response.read().decode('utf-8')
        data = json.loads(string)
        city = data['results'][0]['address_components'][1]['long_name']
        state = data['results'][0]['address_components'][3]['long_name']
        zip_cities.update({idx: [zipcode, city, state]})
    return zip_cities

results = get_cities(zips)
print(results)
# returns {0: ['90210', 'Beverly Hills', 'California'],
#          1: ['60647', 'Chicago', 'Illinois']}
This is my terrible, non-functional attempt at making it async:
import asyncio
import urllib.request, json

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcodes):
    url = 'http://maps.googleapis.com/maps/api/geocode/json?address='+zipcode+'&sensor=true'
    response = urllib.request.urlopen(url)
    string = response.read().decode('utf-8')
    data = json.loads(string)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

loop = asyncio.get_event_loop()
loop.run_until_complete([get_cities(zip) for zip in zips])
loop.close()
print(zip_cities)  # doesn't work
Any help is much appreciated. All of the tutorials I've come across online seem to be a tad over my head.
Note: I've seen some examples use aiohttp, but I was hoping to stick with the native Python 3 libraries if possible.
When we use asyncio we create objects called coroutines. A coroutine can be thought of as a lightweight thread. Much like we can have multiple threads running at the same time, each with its own concurrent I/O operation, we can have many coroutines running alongside one another.
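For instance, here is a minimal sketch of two coroutines making progress on a single event loop (it uses the async/await syntax available since Python 3.5, and the coroutine names are made up for illustration):

import asyncio

async def worker(name, delay):
    # await asyncio.sleep() suspends this coroutine and hands control
    # back to the event loop, so other coroutines can run in the meantime
    await asyncio.sleep(delay)
    print(name, 'finished after', delay, 'second(s)')

async def main():
    # both workers run concurrently; this takes ~2 seconds, not ~3
    await asyncio.gather(worker('a', 1), worker('b', 2))

asyncio.run(main())  # asyncio.run() requires Python 3.7+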
aiofiles is an Apache2-licensed library, written in Python, for handling local disk files in asyncio applications. Ordinary local file I/O is blocking and cannot easily and portably be made asynchronous. This means that doing file I/O may interfere with asyncio applications, which shouldn't block the executing thread.
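A minimal sketch of what that looks like (aiofiles is a third-party package installed with pip install aiofiles, and the file name here is just a placeholder):

import asyncio
import aiofiles

async def read_file(path):
    # aiofiles hands the blocking open/read off to a worker thread,
    # so the event loop stays free while the disk I/O happens
    async with aiofiles.open(path) as f:
        return await f.read()

contents = asyncio.run(read_file('example.txt'))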
You can add keys to a dict in a loop in Python: add an item to a dictionary by inserting a new key into the dictionary and assigning it a value.
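That is all the zip_cities bookkeeping in the code above amounts to. For example:

zip_cities = dict()
for idx, zipcode in enumerate(['90210', '60647']):
    zip_cities[idx] = [zipcode]  # equivalent to zip_cities.update({idx: [zipcode]})
print(zip_cities)  # {0: ['90210'], 1: ['60647']}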
Asyncio stands for asynchronous input/output and refers to a programming paradigm that achieves high concurrency using a single thread and an event loop.
You're not going to be able to get any concurrency if you use urllib to do the HTTP requests, because it's a synchronous library; wrapping the urllib call in a coroutine doesn't change that. You have to use an asynchronous HTTP client that's integrated with asyncio, like aiohttp:
import asyncio
import json
import aiohttp

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine  # pre-Python 3.5 coroutine syntax, paired with yield from
def get_cities(zipcode, idx):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdfg&address='+zipcode+'&sensor=true'
    response = yield from aiohttp.request('get', url)
    string = (yield from response.read()).decode('utf-8')
    data = json.loads(string)
    print(data)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    # asyncio.async() was renamed asyncio.ensure_future() in Python 3.4.4
    tasks = [asyncio.ensure_future(get_cities(z, i)) for i, z in enumerate(zips)]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    print(zip_cities)
I know you'd prefer to only use the stdlib, but the asyncio library doesn't include an HTTP client, so you'd basically have to re-implement pieces of aiohttp to recreate the functionality it provides. I suppose another option would be to make the urllib calls in a background thread, so that they don't block the event loop, but it's kind of silly to do that when aiohttp is available (and it sort of defeats the purpose of using asyncio in the first place):
import asyncio
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcode, idx):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdfg&address='+zipcode+'&sensor=true'
    # run the blocking urlopen call in the thread pool so it
    # doesn't block the event loop while waiting on the network
    response = yield from loop.run_in_executor(executor, urllib.request.urlopen, url)
    string = response.read().decode('utf-8')
    data = json.loads(string)
    print(data)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

if __name__ == "__main__":
    executor = ThreadPoolExecutor(10)
    loop = asyncio.get_event_loop()
    tasks = [asyncio.ensure_future(get_cities(z, i)) for i, z in enumerate(zips)]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    print(zip_cities)