
Python asyncio.semaphore in async-await function

I am trying to teach myself Python's async functionality. To do so, I have built an async web scraper. I would like to limit the total number of connections I have open at once, to be a good citizen on servers. I know that semaphores are a good solution, and the asyncio library has a semaphore class built in. My issue is that Python complains when I use yield from inside an async def function, because that mixes yield and await syntax. Below is the exact code I am using:

import asyncio
import aiohttp

sema = asyncio.BoundedSemaphore(5)

async def get_page_text(url):
    with (yield from sema):
        try:
            resp = await aiohttp.request('GET', url)
            if resp.status == 200:
                ret_val = await resp.text()
        except:
            raise ValueError
        finally:
            await resp.release()
    return ret_val

This raises the following exception:

File "<ipython-input-3-9b9bdb963407>", line 14
    with (yield from sema):
         ^
SyntaxError: 'yield from' inside async function

Some possible solutions I can think of:

  1. Just use the @asyncio.coroutine decorator instead of async def
  2. Use threading.Semaphore? This seems like it may cause other issues
  3. Try this in the Python 3.6 beta, in case it is supported there

I am very new to Python's async functionality so I could be missing something obvious.

Bruce Pucci asked Nov 28 '16 03:11


People also ask

What does Asyncio Semaphore do?

From the asyncio docs: A semaphore manages an internal counter which is decremented by each acquire() call and incremented by each release() call. The counter can never go below zero; when acquire() finds that it is zero, it blocks, waiting until some task calls release().
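A minimal sketch of that counter behavior, assuming Python 3.7+ for asyncio.run. The function name counter_demo and the 0.05 second timeout are illustrative choices, not something from the asyncio docs:

```python
import asyncio


async def counter_demo():
    # Hypothetical helper: walks the internal counter down to zero and back up.
    sema = asyncio.Semaphore(2)

    await sema.acquire()  # counter 2 -> 1
    await sema.acquire()  # counter 1 -> 0
    is_exhausted = sema.locked()  # True when acquire() could not succeed immediately

    # A third acquire() has to wait for a release(); give up after a short timeout.
    try:
        await asyncio.wait_for(sema.acquire(), timeout=0.05)
        third_acquired = True
    except asyncio.TimeoutError:
        third_acquired = False

    sema.release()  # counter 0 -> 1
    available_again = not sema.locked()
    sema.release()  # counter 1 -> 2
    return is_exhausted, third_acquired, available_again


result = asyncio.run(counter_demo())
print(result)
```

In a coroutine, "blocks" means the waiting task is suspended while the event loop keeps running other tasks; the thread itself is not blocked.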

What is the difference between a Semaphore and bounded Semaphore?

A Semaphore can be released more times than it's acquired, and that will raise its counter above the starting value. A BoundedSemaphore can't be raised above the starting value.
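The difference can be sketched as follows, assuming Python 3.7+. The name over_release is made up for this illustration:

```python
import asyncio


async def over_release():
    # Hypothetical helper comparing the two classes on an extra release().
    plain = asyncio.Semaphore(1)
    bounded = asyncio.BoundedSemaphore(1)

    # Releasing without a prior acquire silently raises the plain counter to 2.
    plain.release()
    await plain.acquire()
    plain_still_free = not plain.locked()  # one slot is left, so the limit grew

    # The bounded variant refuses to go above its starting value.
    try:
        bounded.release()
        bounded_raised = False
    except ValueError:
        bounded_raised = True

    return plain_still_free, bounded_raised


plain_still_free, bounded_raised = asyncio.run(over_release())
print(plain_still_free, bounded_raised)
```

For a connection limit, the bounded variant is usually the safer choice, since a stray release() surfaces as an error instead of quietly raising the limit.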

Why is Asyncio better than threads?

Asyncio vs threading: with asyncio, only one coroutine runs at a time, and control passes to another coroutine only at an explicit await. Threads, by contrast, can be preempted by the scheduler at almost any point. Asyncio therefore gives better control over when execution switches to another block of code, but each coroutine has to hand over control itself.

Why is Asyncio not thread-safe?

Simply speaking, thread-safe means that it is safe for more than one thread to access the same resource at once. asyncio fundamentally uses a single thread, but several asyncio tasks can still interleave their access to a shared resource across await points, much like threads in a multi-threaded program.
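A small sketch of that interleaving, assuming Python 3.7+. The helper names (unsafe_increment, safe_increment, compare) are hypothetical; asyncio.sleep(0) stands in for any await that hands control back to the event loop:

```python
import asyncio


async def unsafe_increment(state):
    current = state["value"]      # read the shared value
    await asyncio.sleep(0)        # yield to the event loop; other tasks run here
    state["value"] = current + 1  # write back a value that may now be stale


async def safe_increment(state, lock):
    # The lock keeps the read-await-write sequence from interleaving.
    async with lock:
        current = state["value"]
        await asyncio.sleep(0)
        state["value"] = current + 1


async def compare(n_tasks=10):
    unsafe = {"value": 0}
    await asyncio.gather(*(unsafe_increment(unsafe) for _ in range(n_tasks)))

    safe = {"value": 0}
    lock = asyncio.Lock()
    await asyncio.gather(*(safe_increment(safe, lock) for _ in range(n_tasks)))
    return unsafe["value"], safe["value"]


unsafe_total, safe_total = asyncio.run(compare())
print(unsafe_total, safe_total)
```

The unprotected version loses updates because every task reads the value before any task writes it back; the locked version counts all ten increments.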


1 Answer

asyncio semaphores are asynchronous context managers, so inside an async def function you can acquire one with the async with statement instead of with (yield from sema):

#!/usr/local/bin/python3.5
import asyncio
from aiohttp import ClientSession


sema = asyncio.BoundedSemaphore(5)

async def hello(url):
    async with ClientSession() as session:
        # Acquire the semaphore first, then open the request; both are
        # released when the block exits.
        async with sema, session.get(url) as response:
            body = await response.read()
            print(body)

loop = asyncio.get_event_loop()
loop.run_until_complete(hello("http://httpbin.org/headers"))

Example taken from here. The page is also a good primer for asyncio and aiohttp in general.
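To see the limit take effect across many requests, here is a rough sketch that swaps the HTTP call for asyncio.sleep so it runs without network access. It assumes Python 3.7+; fetch_stub, crawl, the stats dictionary, and the example.invalid URLs are all made up for the illustration and are not part of aiohttp:

```python
import asyncio


async def fetch_stub(url, sema, stats):
    # Stand-in for session.get(url): no real network I/O happens here.
    async with sema:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # simulated request latency
        stats["active"] -= 1
    return "body of " + url


async def crawl(urls, limit=5):
    # Create the semaphore inside the running loop and share it between tasks.
    sema = asyncio.BoundedSemaphore(limit)
    stats = {"active": 0, "peak": 0}
    pages = await asyncio.gather(*(fetch_stub(u, sema, stats) for u in urls))
    return pages, stats["peak"]


urls = ["http://example.invalid/page/{}".format(i) for i in range(20)]
pages, peak = asyncio.run(crawl(urls))
print(len(pages), peak)
```

All 20 tasks are started at once, but at most five are ever inside the async with sema block at the same time. In a real scraper the body of fetch_stub would be the session.get(url) call from the answer above, ideally with one ClientSession shared by all tasks.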

Sebastian Wozny answered Oct 12 '22 04:10