PIL and blocking calls with asyncio

I have an asyncio app that uses the aiohttp server and async sockets opened with asyncio.open_connection().

My code contains some blocking calls from the PIL library, such as:

Image.save()
Image.resize()
  1. Even though these calls don't block for long, can my web server still freeze if I use them? More precisely, can the event loop miss events because of blocking code?
  2. If so, what replaces these functions in a way that integrates with asyncio? There is no asyncio version of PIL.
  3. In general, what counts as "blocking code" in asyncio, besides the obvious operations like socket I/O and file reads?
    For example, is os.path.join() considered OK? What about working on a numpy array?
user3599803 Avatar asked Dec 13 '22 13:12


1 Answer

can my web server freeze if I use these blocking calls? More precisely, is it possible that the event loop will miss events because of blocking code?

Yes: the server freezes for exactly as long as the image functions are executing. You won't miss any events, but the handling of every event is delayed until those functions return.

Freezing the event loop is a bad situation, and you should avoid it.
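To make the delay concrete, here is a minimal sketch, not taken from the original post. It uses time.sleep() as a stand-in for a blocking image call and a hypothetical ticker() coroutine that records a timestamp every 10 ms. While the blocking call runs, no ticks are recorded, but the ticker resumes afterwards, so nothing is lost.

```python
import asyncio
import time


async def ticker(ticks, interval=0.01):
    # Records a timestamp every `interval` seconds while the loop is free.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def measure_max_gap(block_seconds=0.2):
    ticks = []
    task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0.05)   # let the ticker run a few times
    time.sleep(block_seconds)   # blocking call: nothing else can run
    await asyncio.sleep(0.05)   # the loop catches up afterwards
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    # Largest pause between two consecutive ticks.
    return max(b - a for a, b in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    gap = asyncio.run(measure_max_gap())
    print(f"longest gap between ticks: {gap:.3f}s")
```

The longest gap should come out at roughly the blocking time rather than the 10 ms interval, which is the delay every other handler on the same loop would see.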

If yes, what is the replacement for these functions, that integrate with asyncio? there is no asyncio version of PIL.

The easiest and most universal way to avoid freezing the event loop is to run the blocking function in another thread or another process with loop.run_in_executor(). The snippet from the asyncio documentation shows the two kinds of function involved and explains well when to use a process pool and when to use a thread pool:

def blocking_io():
    # File operations (such as logging) can block the
    # event loop: run them in a thread pool.
    with open('/dev/urandom', 'rb') as f:
        return f.read(100)

def cpu_bound():
    # CPU-bound operations will block the event loop:
    # in general it is preferable to run them in a
    # process pool.
    return sum(i * i for i in range(10 ** 7))

I'd only add that a process pool is not always the best choice for a CPU-bound operation. If your image functions don't take much time (or especially if your server doesn't have multiple processor cores), it may still be more efficient to run them in a thread.
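As a rough sketch of how such a function is handed to an executor: run_blocking() and the small cpu_bound(n) below are illustrative names, not part of the original answer. The executor is passed in as a parameter, so a thread pool and a process pool are interchangeable from the caller's point of view.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def cpu_bound(n):
    # Stand-in for an image operation; any plain function works here.
    return sum(i * i for i in range(n))


async def run_blocking(executor, func, *args):
    loop = asyncio.get_running_loop()
    # run_in_executor returns a future that resolves to func(*args).
    return await loop.run_in_executor(executor, func, *args)


async def main():
    # Swap in concurrent.futures.ProcessPoolExecutor to use processes;
    # func and its arguments must then be picklable.
    with ThreadPoolExecutor(max_workers=2) as pool:
        return await run_blocking(pool, cpu_bound, 1000)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Passing None as the executor uses the loop's default thread pool, which is often enough for short image operations.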

In general, what is considered a 'blocking code' in asyncio? besides the obvious operations like socket, read a file, etc. For example, does os.path.join() is considered ok? what about working on a numpy array?

Roughly speaking, every function is blocking: it blocks the event loop for some time. But many functions, like os.path.join, take so little time that they're not a problem, and we don't call them "blocking".

It's hard to name the exact point at which execution time (and with it, event loop freezing) becomes a problem, especially since it differs from one machine to another. My biased advice: if your code takes (or may take) more than 50 ms before returning control to the event loop, consider it blocking and use run_in_executor.
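One way to apply this rule of thumb is to time the call on your own hardware. The helper names below (timed_call, is_blocking) are hypothetical, and the 50 ms threshold is simply the figure suggested above, not an asyncio constant.

```python
import os
import time

BLOCKING_THRESHOLD = 0.05  # 50 ms, the rule of thumb from the text


def timed_call(func, *args, **kwargs):
    # Runs func and returns (result, elapsed seconds).
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def is_blocking(elapsed, threshold=BLOCKING_THRESHOLD):
    return elapsed > threshold


if __name__ == "__main__":
    _, fast = timed_call(os.path.join, "a", "b")
    _, slow = timed_call(time.sleep, 0.1)  # stand-in for a slow image call
    print(is_blocking(fast), is_blocking(slow))
```

asyncio's debug mode offers a similar check: it logs callbacks that run longer than loop.slow_callback_duration, which defaults to 100 ms.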

Update:

Thanks, does it make sense to use one event loop (of the main thread), and using another thread that will add tasks using the same loop?

I'm not sure what you mean here, but I think not. The other thread is there to run the blocking jobs, not to add tasks to the loop.

I need some way for the thread to inform the main thread after the image processing is completed

Just await the result of run_in_executor, or wrap it in a task. loop.run_in_executor() is not itself a coroutine: it returns an awaitable future and runs the function in a background thread without blocking the event loop.

It will look like this:

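import asyncio
from concurrent.futures import ThreadPoolExecutor
from functools import partial

from aiohttp import web

thread_pool = ThreadPoolExecutor()


def process_image(img, size, path):
    # All blocking image work goes here; it runs in a worker thread.
    # size and path are illustrative parameters: resize() needs a target
    # size and returns a new image, and save() needs a destination.
    resized = img.resize(size)
    resized.save(path)


async def async_image_process(img, size, path):
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(
        thread_pool,
        partial(process_image, img, size, path)
    )


async def handler(request):
    # img, size and path are assumed to come from the request (not shown).
    asyncio.create_task(
        async_image_process(img, size, path)
    )
    # we use a task to return the response immediately,
    # read https://stackoverflow.com/a/37345564/1113207

    return web.Response(text="Image processed without blocking other requests")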
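If other code needs to be told when processing has finished, a done callback on the task is one option. The sketch below is an assumption-laden illustration rather than the answer's own code. It uses asyncio.to_thread (Python 3.9+) in place of an explicit executor, replaces the PIL work with a hypothetical process_image() stand-in, and keeps a reference to each task in a set so that it isn't garbage-collected before it finishes.

```python
import asyncio

background_tasks = set()


def process_image(name):
    # Hypothetical stand-in for blocking PIL work.
    return f"{name}: done"


def start_processing(name, on_done):
    # Must be called from code that is running inside the event loop.
    task = asyncio.create_task(asyncio.to_thread(process_image, name))
    background_tasks.add(task)  # keep a strong reference to the task
    task.add_done_callback(background_tasks.discard)
    # The callback runs in the event loop thread once the work finishes.
    task.add_done_callback(lambda t: on_done(t.result()))
    return task


async def main():
    results = []
    task = start_processing("cat.png", results.append)
    await task
    await asyncio.sleep(0)  # give pending done callbacks a chance to run
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because the callback runs in the event loop thread, it can safely touch loop-owned state such as open connections.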
Mikhail Gerasimov Avatar answered Dec 22 '22 17:12