
asyncio and coroutines vs task queues

I've been reading about the asyncio module in Python 3, and more broadly about coroutines in Python, and I can't see what makes asyncio such a great tool.

I have the feeling that anything you can do with coroutines, you can do better with task queues based on the multiprocessing module (Celery, for example).

Are there use cases where coroutines are better than task queues?

asked Dec 23 '15 by MG1992

People also ask

Is an asyncio queue thread-safe?

Although asyncio queues are not thread-safe, they are designed to be used specifically in async/await code.
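For illustration, here is a minimal sketch (the names are illustrative, not from the question) of the intended pattern: the asyncio.Queue is awaited only from coroutines on one event loop, and another thread hands items over via loop.call_soon_threadsafe instead of touching the queue directly.

    import asyncio
    import threading

    async def consumer(queue: asyncio.Queue) -> None:
        # Runs on the event loop thread; awaiting the queue here is safe.
        while True:
            item = await queue.get()
            if item is None:  # sentinel: stop consuming
                break
            print("got", item)

    def producer(loop: asyncio.AbstractEventLoop, queue: asyncio.Queue) -> None:
        # Called from a plain thread: never call queue.put_nowait() directly,
        # hand the call over to the event loop thread instead.
        for i in range(3):
            loop.call_soon_threadsafe(queue.put_nowait, i)
        loop.call_soon_threadsafe(queue.put_nowait, None)

    async def main() -> None:
        queue: asyncio.Queue = asyncio.Queue()
        loop = asyncio.get_running_loop()
        thread = threading.Thread(target=producer, args=(loop, queue))
        thread.start()
        await consumer(queue)
        thread.join()

    asyncio.run(main())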

Is coroutine deprecated?

Deprecated since version 3.8: If any awaitable in aws is a coroutine, it is automatically scheduled as a Task. Passing coroutine objects to wait() directly is deprecated as it leads to confusing behavior.
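A small sketch of the pattern the documentation recommends instead: wrap each coroutine in a Task explicitly before handing it to wait().

    import asyncio

    async def fetch(n: int) -> int:
        await asyncio.sleep(0.1)
        return n * 2

    async def main() -> None:
        # Deprecated style: await asyncio.wait([fetch(1), fetch(2)])
        # Preferred: create the Tasks explicitly, then wait on them.
        tasks = [asyncio.create_task(fetch(n)) for n in (1, 2, 3)]
        done, pending = await asyncio.wait(tasks)
        print(sorted(task.result() for task in done))

    asyncio.run(main())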

What is Asyncio coroutine?

async def is new syntax introduced in Python 3.5. You can use await, async with and async for inside async def functions. @coroutine is a functional analogue of async def, but it works on Python 3.4+ and uses the yield from construction instead of await. From a practical perspective, just never use @coroutine if your Python is 3.5+.
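A minimal sketch contrasting the two spellings (the legacy form is shown only in a comment, since @asyncio.coroutine was removed in Python 3.11):

    import asyncio

    # Python 3.5+ native coroutine: use `await` inside `async def`.
    async def native_hello() -> str:
        await asyncio.sleep(0.1)
        return "hello"

    # The pre-3.5 generator-based equivalent (only needed on Python 3.4;
    # the decorator was removed in Python 3.11):
    #
    #   @asyncio.coroutine
    #   def legacy_hello():
    #       yield from asyncio.sleep(0.1)
    #       return "hello"

    print(asyncio.run(native_hello()))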

Why do we need Asyncio?

asyncio is used as a foundation for multiple Python asynchronous frameworks that provide high-performance network and web servers, database connection libraries, distributed task queues, etc. asyncio is often a perfect fit for IO-bound and high-level structured network code.
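As a rough illustration of the IO-bound case (the "request" below is just a sleep standing in for a socket read, HTTP call or database query), many waits can overlap inside a single thread:

    import asyncio

    async def fetch(i: int) -> str:
        await asyncio.sleep(1)  # waits without blocking the thread
        return f"result {i}"

    async def main() -> None:
        # 100 "requests" run concurrently in one thread and finish in
        # roughly one second total, not one hundred seconds.
        results = await asyncio.gather(*(fetch(i) for i in range(100)))
        print(len(results), "results")

    asyncio.run(main())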


2 Answers

Not a proper answer, but a list of hints that could not fit into a comment:

  • You are mentioning the multiprocessing module (and let's consider threading too). Suppose you have to handle hundreds of sockets: can you spawn hundreds of processes or threads? (A sketch of the asyncio alternative follows this list.)

  • Again, with threads and processes: how do you handle concurrent access to shared resources? What is the overhead of mechanisms like locking?

  • Frameworks like Celery also add significant overhead. Can you use it, say, for handling every single request on a high-traffic web server? By the way, in that scenario, who is responsible for handling sockets and connections (Celery, by its nature, can't do that for you)?

  • Be sure to read the rationale behind asyncio. That rationale (among other things) mentions a system call: writev() -- isn't that much more efficient than multiple write()s?
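To make the first point concrete, here is a minimal sketch (an assumed echo protocol on an arbitrary local port, not anything from the question) of how asyncio serves many connections from a single thread, with one cheap coroutine per client instead of one thread or process per client:

    import asyncio

    async def handle_client(reader: asyncio.StreamReader,
                            writer: asyncio.StreamWriter) -> None:
        # Each connected client gets its own coroutine, not its own thread.
        data = await reader.readline()   # yields to the loop while waiting
        writer.write(data)               # echo the line back
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    async def main() -> None:
        server = await asyncio.start_server(handle_client, "127.0.0.1", 8888)
        async with server:
            await server.serve_forever()

    asyncio.run(main())  # one thread, one process, N concurrent clients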

answered Oct 27 '22 by Andrea Corbellini


Adding to the above answer:

If the task at hand is I/O-bound and operates on shared data, coroutines and asyncio are probably the way to go.

If, on the other hand, you have CPU-bound tasks where data is not shared, a multiprocessing-based system like Celery should be better.

If the task at hand is both CPU- and I/O-bound and sharing of data is not required, I would still use Celery. You can use async I/O from within Celery!
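For example, here is a sketch of driving async I/O from inside a Celery task; the broker URL, the aiohttp dependency and the task name are assumptions for illustration, not anything from the question:

    import asyncio

    import aiohttp  # assumed to be installed; any async HTTP client would do
    from celery import Celery

    app = Celery("tasks", broker="redis://localhost:6379/0")  # hypothetical broker

    async def fetch_all(urls: list[str]) -> list[str]:
        # IO-bound part: fetch many pages concurrently inside one worker process.
        async with aiohttp.ClientSession() as session:

            async def fetch(url: str) -> str:
                async with session.get(url) as response:
                    return await response.text()

            return await asyncio.gather(*(fetch(url) for url in urls))

    @app.task
    def crawl_and_count(urls: list[str]) -> int:
        # Celery tasks are plain synchronous callables, so drive the event
        # loop explicitly for the async portion, then do the CPU-bound part.
        pages = asyncio.run(fetch_all(urls))
        return sum(len(page) for page in pages)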

If you have a CPU-bound task that also needs to share data, the only viable option I see right now is to store the shared data in a database. There have been recent attempts like pyparallel, but they are still a work in progress.

answered Oct 27 '22 by bhaskarc