I am having trouble wrapping my head around Python 3's asyncio library. I have a list of zipcodes and I am trying to make async calls to an API to get each zipcode's corresponding city and state. I can do it successfully in sequence with a for loop, but I want to make it faster for the case of a big zipcode list.
This is an example of my original code, which works:
import urllib.request, json

zips = ['90210', '60647']

def get_cities(zipcodes):
    zip_cities = dict()
    for idx, zipcode in enumerate(zipcodes):
        url = 'http://maps.googleapis.com/maps/api/geocode/json?address='+zipcode+'&sensor=true'
        response = urllib.request.urlopen(url)
        string = response.read().decode('utf-8')
        data = json.loads(string)
        city = data['results'][0]['address_components'][1]['long_name']
        state = data['results'][0]['address_components'][3]['long_name']
        zip_cities.update({idx: [zipcode, city, state]})
    return zip_cities

results = get_cities(zips)
print(results)
# returns {0: ['90210', 'Beverly Hills', 'California'],
#          1: ['60647', 'Chicago', 'Illinois']}
This is my terrible, non-functional attempt at making it async:
import asyncio
import urllib.request, json

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcodes):
    url = 'http://maps.googleapis.com/maps/api/geocode/json?address='+zipcode+'&sensor=true'
    response = urllib.request.urlopen(url)
    string = response.read().decode('utf-8')
    data = json.loads(string)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

loop = asyncio.get_event_loop()
loop.run_until_complete([get_cities(zip) for zip in zips])
loop.close()
print(zip_cities)  # doesn't work
Any help is much appreciated. All of the tutorials I've come across online seem to be a tad over my head.
Note: I've seen some examples use aiohttp, but I was hoping to stick with the native Python 3 libraries if possible.
When we use asyncio we create objects called coroutines. A coroutine can be thought of as a lightweight thread. Much like we can have multiple threads running at the same time, each with its own concurrent I/O operation, we can have many coroutines running alongside one another.
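For instance, here is a minimal sketch of two coroutines making progress on a single event loop (it uses the async/await syntax available since Python 3.5, and the coroutine names are made up for illustration):

import asyncio

async def worker(name, delay):
    # await asyncio.sleep() suspends this coroutine and hands control
    # back to the event loop, so other coroutines can run in the meantime
    await asyncio.sleep(delay)
    print(name, 'finished after', delay, 'second(s)')

async def main():
    # both workers run concurrently; this takes ~2 seconds, not ~3
    await asyncio.gather(worker('a', 1), worker('b', 2))

asyncio.run(main())  # asyncio.run() requires Python 3.7+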
aiofiles is an Apache2-licensed library, written in Python, for handling local disk files in asyncio applications. Ordinary local file I/O is blocking and cannot easily and portably be made asynchronous. This means that doing file I/O may interfere with asyncio applications, which shouldn't block the executing thread.
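A minimal sketch of what that looks like (aiofiles is a third-party package installed with pip install aiofiles, and the file name here is just a placeholder):

import asyncio
import aiofiles

async def read_file(path):
    # aiofiles hands the blocking open/read off to a worker thread,
    # so the event loop stays free while the disk I/O happens
    async with aiofiles.open(path) as f:
        return await f.read()

contents = asyncio.run(read_file('example.txt'))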
You can add keys to a dict in a loop in Python: add an item to a dictionary by inserting a new key into the dictionary and assigning it a value.
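That is all the zip_cities bookkeeping in the code above amounts to. For example:

zip_cities = dict()
for idx, zipcode in enumerate(['90210', '60647']):
    zip_cities[idx] = [zipcode]  # equivalent to zip_cities.update({idx: [zipcode]})
print(zip_cities)  # {0: ['90210'], 1: ['60647']}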
Asyncio stands for asynchronous input/output and refers to a programming paradigm that achieves high concurrency using a single thread and an event loop.
You're not going to be able to get any concurrency if you use urllib to do the HTTP requests, because it's a synchronous library; wrapping the urllib call in a coroutine doesn't change that. You have to use an asynchronous HTTP client that's integrated with asyncio, like aiohttp:
import asyncio
import json
import aiohttp

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine  # pre-Python 3.5 coroutine syntax, paired with yield from
def get_cities(zipcode, idx):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdfg&address='+zipcode+'&sensor=true'
    response = yield from aiohttp.request('get', url)
    string = (yield from response.read()).decode('utf-8')
    data = json.loads(string)
    print(data)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    # asyncio.async() was renamed asyncio.ensure_future() in Python 3.4.4
    tasks = [asyncio.ensure_future(get_cities(z, i)) for i, z in enumerate(zips)]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    print(zip_cities)
I know you'd prefer to only use the stdlib, but the asyncio library doesn't include an HTTP client, so you'd basically have to re-implement pieces of aiohttp to recreate the functionality it provides. I suppose another option would be to make the urllib calls in a background thread, so that they don't block the event loop, but it's kind of silly to do that when aiohttp is available (and it sort of defeats the purpose of using asyncio in the first place):
import asyncio
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcode, idx):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdfg&address='+zipcode+'&sensor=true'
    # run the blocking urlopen call in the thread pool so it
    # doesn't block the event loop while waiting on the network
    response = yield from loop.run_in_executor(executor, urllib.request.urlopen, url)
    string = response.read().decode('utf-8')
    data = json.loads(string)
    print(data)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

if __name__ == "__main__":
    executor = ThreadPoolExecutor(10)
    loop = asyncio.get_event_loop()
    tasks = [asyncio.ensure_future(get_cities(z, i)) for i, z in enumerate(zips)]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    print(zip_cities)