Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Asynchronous task queues and asynchronous IO

As I understand, asynchronous networking frameworks/libraries like twisted, tornado, and asyncio provide asynchronous IO through implementing nonblocking sockets and an event loop. Gevent achieves essentially the same thing through monkey patching the standard library, so explicit asynchronous programming via callbacks and coroutines is not required.

On the other hand, asynchronous task queues, like Celery, manage background tasks and distribute those tasks across multiple threads or machines. I do not fully understand this process but it involves message brokers, messages, and workers.

My questions,

  1. Do asynchronous task queues require asynchronous IO? Are they in any way related? The two concepts seem similar, but the implementations at the application level are different. I would think that the only thing they have in common is the word "asynchronous", so perhaps that is throwing me off.

  2. Can someone elaborate on how task queues work and the relationship between the message broker (why are they required?), the workers, and the messages (what are messages? bytes?).

Oh, and I'm not trying to solve any specific problems, I'm just trying to understand the ideas behind asynchronous task queues and asynchronous IO.

like image 325
puketronic Avatar asked Apr 09 '16 14:04

puketronic


People also ask

What is asynchronous task queue?

Imagine, you are now pushing the computing part of your application to a queue. This queue would handle every task independently, minimizing user latency and ensuring that even if one of them fails, the rest of the task/data gets processed. This is Asynchronous task processing.

What does async def mean in Python?

The syntax async def introduces either a native coroutine or an asynchronous generator. The expressions async with and async for are also valid, and you'll see them later on. The keyword await passes function control back to the event loop. (It suspends the execution of the surrounding coroutine.)

What are task queues used for?

Task queues let applications perform work, called tasks, asynchronously outside of a user request. If an app needs to execute work in the background, it adds tasks to task queues. The tasks are executed later, by worker services. The Task Queue service is designed for asynchronous work.

Is Python asynchronous or synchronous?

There are two basic types of methods in the Parallels Python API: synchronous and asynchronous. When a synchronous method is invoked, it completes executing before returning to the caller. An asynchronous method starts a job in the background and returns to the caller immediately.


1 Answers

Asynchronous IO is a way to use sockets (or more generally file descriptors) without blocking. This term is specific to one process or even one thread. You can even imagine mixing threads with asynchronous calls. It would be completely fine, yet somewhat complicated.

Now I have no idea what asynchronous task queue means. IMHO there's only a task queue, it's a data structure. You can access it in asynchronous or synchronous way. And by "access" I mean push and pop calls. These can use network internally.

So task queue is a data structure. (A)synchronous IO is a way to access it. That's everything there is to it.

The term asynchronous is havily overused nowadays. The hype is real.


As for your second question:

  1. Message is just a set of data, a sequence of bytes. It can be anything. Usually these are some structured strings, like JSON.
  2. Task == message. The different word is used to notify the purpose of that data: to perform some task. For example you would send a message {"task": "process_image"} and your consumer will fire an appropriate function.
  3. Task queue Q is a just a queue (the data structure).
  4. Producer P is a process/thread/class/function/thing that pushes messages to Q.
  5. Consumer (or worker) C is a process/thread/class/function/thing that pops messages from Q and does some processing on it.
  6. Message broker B is a process that redistributes messages. In this case a producer P sends a message to B (rather then directly to a queue) and then B can (for example) duplicate this message and send to 2 different queues Q1 and Q2 so that 2 different workers C1 and C2 will get that message. Message brokers can also act as protocol translators, can transform messages, aggregate them and do many many things. Generally it's just a blackbox between producers and consumers.

As you can see there are no formal definitions of those things and you have to use a bit of intuition to fully understand them.

like image 194
freakish Avatar answered Oct 23 '22 11:10

freakish