I have an asyncio app that uses the aiohttp server and async sockets opened with asyncio.open_connection().
My code contains some blocking calls from the PIL library, such as
Image.save()
Image.resize()
Is a call like os.path.join() considered ok? What about working on a numpy array?
Can my web server freeze if I use these blocking calls? More precisely, is it possible that the event loop will miss events because of blocking code?
The server will freeze for exactly as long as the image functions are executing. You won't miss any events, but the handling of all events will be delayed until the image functions return.
Freezing the event loop is a bad situation, and you should avoid it.
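To make the "delayed, not missed" point concrete, here is a minimal sketch. The names ticker, blocking_handler and measure_max_gap are made up for this illustration, and time.sleep(0.2) is only an assumed stand-in for a slow Image.resize() or Image.save() call. A ticker coroutine wants to wake up every 10 ms; while the blocking call runs, no tick can happen, so one gap between ticks grows to roughly the duration of the blocking call.

```python
import asyncio
import time


async def ticker(gaps, interval=0.01, ticks=30):
    # Record how much time really passes between consecutive wake-ups.
    last = time.monotonic()
    for _ in range(ticks):
        await asyncio.sleep(interval)
        now = time.monotonic()
        gaps.append(now - last)
        last = now


async def blocking_handler():
    # Assumption: time.sleep() stands in for a blocking PIL call.
    time.sleep(0.2)


async def measure_max_gap():
    gaps = []
    tick_task = asyncio.create_task(ticker(gaps))
    await asyncio.sleep(0.05)   # let the ticker start ticking
    await blocking_handler()    # the event loop is frozen in here
    await tick_task             # every tick still happens, just late
    return max(gaps)


if __name__ == "__main__":
    print(f"longest gap between ticks: {asyncio.run(measure_max_gap()):.3f} s")
```

All 30 ticks are still delivered, but one of them arrives about 0.2 s late. In a web server that late tick would be another client's request.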
If yes, what is the replacement for these functions that integrates with asyncio? There is no asyncio version of PIL.
The easiest and most universal way to avoid freezing the event loop is to execute the blocking function in another thread or another process using loop.run_in_executor. The code snippet in the asyncio documentation for it shows how to do this and contains a good explanation of when to use a process and when to use a thread:
    def blocking_io():
        # File operations (such as logging) can block the
        # event loop: run them in a thread pool.
        with open('/dev/urandom', 'rb') as f:
            return f.read(100)

    def cpu_bound():
        # CPU-bound operations will block the event loop:
        # in general it is preferable to run them in a
        # process pool.
        return sum(i * i for i in range(10 ** 7))
I only want to add that a process pool is not always a good solution for every CPU-bound operation. If your image functions don't take much time (and especially if your server doesn't have multiple processor cores), it may still be more efficient to run them in a thread.
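As a rough sketch of the thread-pool variant: slow_resize is a hypothetical stand-in for an image function (it just sleeps), and the heartbeat coroutine exists only to show that the event loop keeps running while the worker thread is busy.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def slow_resize(size):
    # Hypothetical stand-in for Image.resize(): blocks the calling thread.
    time.sleep(0.2)
    return size


async def main():
    loop = asyncio.get_running_loop()
    ticks = 0

    async def heartbeat():
        # Counts how often the event loop gets a chance to run.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    with ThreadPoolExecutor() as pool:
        # The blocking call runs in a worker thread; awaiting the
        # returned future yields control back to the event loop.
        result = await loop.run_in_executor(pool, slow_resize, (64, 64))
    hb.cancel()
    return result, ticks


if __name__ == "__main__":
    result, ticks = asyncio.run(main())
    print(result, "heartbeats while resizing:", ticks)
```

If slow_resize were called directly inside main, the heartbeat counter would stay at zero until the call returned. Passing a ProcessPoolExecutor instead of the thread pool follows the same pattern, but the function and its arguments must then be picklable.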
In general, what is considered 'blocking code' in asyncio, besides the obvious operations like working with sockets, reading a file, etc.? For example, is os.path.join() considered ok? What about working on a numpy array?
Roughly speaking, any function is blocking: it blocks the event loop for some time. But many functions, like os.path.join, take so little time that they're not a problem, and we don't call them "blocking".
It's hard to name the exact limit at which execution time (and event loop freezing) becomes a problem, especially since this time will differ between hardware. My biased advice: if your code takes (or may take) more than 50 ms before returning control to the event loop, consider it blocking and use run_in_executor.
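One way to find such code is asyncio's debug mode, which logs a warning on the "asyncio" logger whenever a single callback or task step runs longer than loop.slow_callback_duration (0.1 s by default). The sketch below lowers that threshold to 0.05 s to mirror the 50 ms rule of thumb. The names ListHandler, suspicious and find_slow_callbacks are made up for this example, and time.sleep(0.1) is an assumed stand-in for a blocking call.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    # Collects log messages in a list so they can be inspected afterwards.
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


async def suspicious():
    # Assumption: stands in for a blocking call made on the event loop.
    time.sleep(0.1)


async def main():
    loop = asyncio.get_running_loop()
    loop.slow_callback_duration = 0.05  # report anything slower than 50 ms
    await asyncio.create_task(suspicious())


def find_slow_callbacks():
    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)
    try:
        # debug=True enables the slow-callback check.
        asyncio.run(main(), debug=True)
    finally:
        logger.removeHandler(handler)
    return [m for m in handler.messages if "took" in m]


if __name__ == "__main__":
    for message in find_slow_callbacks():
        print(message)
```

Debug mode adds overhead, so it is better suited to development and testing than to production.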
Upd:
Thanks. Does it make sense to use one event loop (in the main thread) and have another thread add tasks to that same loop?
I'm not sure what you mean here, but I think not. We need another thread to run some jobs in, not to add tasks in there.
I need some way for the thread to inform the main thread after the image processing is completed.
Just await the result of run_in_executor, or start a task with it. run_in_executor returns an awaitable future: it executes something in a background thread without blocking the event loop, and awaiting it resumes your coroutine in the main thread once the work is done.
It will look like this:
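    import asyncio
    from concurrent.futures import ThreadPoolExecutor
    from functools import partial

    from aiohttp import web

    thread_pool = ThreadPoolExecutor()

    def process_image(img):
        # all stuff to process image here; the size and the
        # output path below are placeholders
        resized = img.resize((128, 128))  # resize() returns a new image
        resized.save("resized.png")

    async def async_image_process(img):
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(
            thread_pool,
            partial(process_image, img)
        )

    async def handler(request):
        img = ...  # placeholder: obtain a PIL image from the request here
        asyncio.create_task(
            async_image_process(img)
        )
        # we use a task to return the response immediately,
        # read https://stackoverflow.com/a/37345564/1113207
        return web.Response(text="Image processing started without blocking other requests")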
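If the caller needs the outcome of the processing rather than a fire-and-forget task, awaiting run_in_executor is itself the notification: the coroutine resumes in the event loop's thread with the function's return value. A minimal sketch, in which process_image is a hypothetical stand-in that takes a name instead of a real PIL image:

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def process_image(name):
    # Hypothetical stand-in for real image processing. It also reports
    # which thread it ran in, purely for illustration.
    return f"{name}-thumb", threading.get_ident()


async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor() as pool:
        # Execution pauses here without blocking the event loop and
        # resumes in the loop's thread once the worker has finished.
        thumb, worker_id = await loop.run_in_executor(
            pool, partial(process_image, "cat")
        )
    ran_in_other_thread = worker_id != threading.get_ident()
    return thumb, ran_in_other_thread


if __name__ == "__main__":
    print(asyncio.run(main()))
```

No extra signalling between threads is needed here. An exception raised inside the worker function is re-raised at the await as well.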