Maintaining Order in a Multi-Threaded Pipeline

I'm considering a multi-threaded architecture for a processing pipeline. My main processing module has an input queue, from which it receives data packets. It then performs transformations on these packets (decryption, etc.) and places them into an output queue.

Threading helps here because many input packets can have their contents transformed independently of one another.

However, the catch is that the output queue must have the same ordering as the input queue (i.e., the first packet pulled off the input queue must be the first pushed onto the output queue, regardless of whether its transformations finished first).

Naturally, there will be some kind of synchronisation at the output queue, so my question is: what would be the best way of ensuring that this ordering is maintained?

Kaz Dragon asked Jul 12 '10 08:07



2 Answers

Have a single thread read the input queue, post a placeholder on the output queue, and then hand the item over to a worker thread to process. When the data is ready, the worker thread updates the placeholder. When the thread that consumes the output queue reads the placeholder, it can block until the associated data is ready.

Because only a single thread reads the input queue, and this thread immediately puts the placeholder on the output queue, the order in the output queue is the same as that in the input. The worker threads can be numerous, and can do the transformations in any order.

On platforms that support futures, they are ideal as the placeholder. On other systems you can use an event, monitor or condition variable.
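As a rough illustration of the placeholder idea, here is a minimal Python sketch that uses `concurrent.futures` futures as the placeholders. The names `transform`, `dispatcher`, `run_pipeline` and the `_STOP` sentinel are hypothetical and exist only for this example; the random sleep merely simulates transformations finishing out of order.

```python
import queue
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor

_STOP = object()  # hypothetical end-of-stream sentinel, an assumption of this sketch


def transform(packet):
    """Stand-in for decryption etc.; the random sleep makes work finish out of order."""
    time.sleep(random.uniform(0, 0.01))
    return packet * 2


def dispatcher(in_q, out_q, pool):
    """The single input reader: it fixes the output order before any work completes."""
    while True:
        packet = in_q.get()
        if packet is _STOP:
            out_q.put(_STOP)
            return
        # submit() returns the future (the placeholder) immediately,
        # so it lands on the output queue in input order.
        out_q.put(pool.submit(transform, packet))


def run_pipeline(packets, workers=4):
    in_q, out_q = queue.Queue(), queue.Queue()
    for p in packets:
        in_q.put(p)
    in_q.put(_STOP)

    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        reader = threading.Thread(target=dispatcher, args=(in_q, out_q, pool))
        reader.start()
        while True:
            item = out_q.get()
            if item is _STOP:
                break
            # result() blocks until this particular packet has been transformed.
            results.append(item.result())
        reader.join()
    return results
```

Calling `run_pipeline(range(20))` should return the doubled values in their original order, even though the workers complete them in an arbitrary order. A bounded output queue could be used to apply back-pressure, but that is left out of this sketch.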

Anthony Williams answered Sep 28 '22 06:09



With the following assumptions:

  • there should be one input queue, one output queue and one working queue
  • there should be only one input queue listener
  • each output message should contain a wait handle and a pointer to the worker/output data
  • there may be an arbitrary number of worker threads

I would consider the following flow:

The input queue listener performs these steps:

  1. extracts an input message
  2. creates an output message:
    1. initializes the worker data struct
    2. resets the wait handle
  3. enqueues the pointer to the output message into the working queue
  4. enqueues the pointer to the output message into the output queue

Each worker thread does the following:

  1. waits on the working queue to extract a pointer to an output message from it
  2. processes the message based on the given data and sets the wait handle when done

The consumer does the following:

  1. waits on the output queue to extract a pointer to an output message from it
  2. waits on the wait handle until the output data is ready
  3. does something with the data
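
The flow above could be sketched in Python roughly as follows, with `threading.Event` playing the role of the wait handle and shared object references standing in for pointers. `OutputMessage`, `listener`, `worker`, `consume`, `run` and the `_STOP` sentinel are hypothetical names chosen for this example. Error handling is omitted: if the transformation raised, the event would never be set and the consumer would block.

```python
import queue
import threading
from dataclasses import dataclass, field
from typing import Any

_STOP = object()  # hypothetical shutdown sentinel, an assumption of this sketch


@dataclass
class OutputMessage:
    payload: Any                      # worker input data
    result: Any = None                # filled in by a worker
    ready: threading.Event = field(default_factory=threading.Event)  # wait handle


def listener(in_q, work_q, out_q, n_workers):
    """Single input reader: enqueues the same message on both queues."""
    while True:
        packet = in_q.get()
        if packet is _STOP:
            for _ in range(n_workers):
                work_q.put(_STOP)     # one sentinel per worker
            out_q.put(_STOP)
            return
        msg = OutputMessage(payload=packet)  # a fresh Event starts out unset
        work_q.put(msg)
        out_q.put(msg)


def worker(work_q, transform):
    while True:
        msg = work_q.get()
        if msg is _STOP:
            return
        msg.result = transform(msg.payload)
        msg.ready.set()               # signal that the output data is ready


def consume(out_q):
    results = []
    while True:
        msg = out_q.get()
        if msg is _STOP:
            return results
        msg.ready.wait()              # block until this message has been processed
        results.append(msg.result)


def run(packets, transform, n_workers=3):
    in_q, work_q, out_q = queue.Queue(), queue.Queue(), queue.Queue()
    for p in packets:
        in_q.put(p)
    in_q.put(_STOP)
    threads = [threading.Thread(target=listener,
                                args=(in_q, work_q, out_q, n_workers))]
    threads += [threading.Thread(target=worker, args=(work_q, transform))
                for _ in range(n_workers)]
    for t in threads:
        t.start()
    results = consume(out_q)
    for t in threads:
        t.join()
    return results
```

For example, `run(["a", "b", "c"], str.upper)` should yield the upper-cased strings in input order. Because the listener is the only thread that writes to the output queue, the order there matches the input order regardless of which worker finishes first.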
ULysses answered Sep 28 '22 04:09
