Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best way to distribute tasks considering latency and efficiency

I'm looking for an algorithm to distribute some tasks. The problem is as follows:

Say I have a central task producer and some client consumers. The producer generates tasks and consumers take tasks (for starters, one at a time), process them, and when they are done, take new tasks (I already have a task queue).

The thing is, if you consider latency for a task to get from the producer to the consumer, it might make sense to group tasks together. For example, say we have 10 tasks in total and 2 consumers. If each of the tasks take 5 ms to get processed and the network latency is also 5 ms, sending 2 groups of 5 tasks each to each consumer will take 5ms + 5*5ms = 30ms, while sending the tasks individually would take 5*5ms + 5*5ms = 50ms, because the latency overhead appears for every task.

It's not as simple as grouping since some tasks will probably take longer, and it would make sense to send them separate as to let the other tasks that take a shorter time get processed in parallel by the other consumers. I'm planning on doing some statistics regarding the type of tasks. The number of consumers is also not constant.

Any idea of a good algorithm or a good read that can help me in achieving this?

like image 347
Luchian Grigore Avatar asked Nov 05 '22 11:11

Luchian Grigore


1 Answers

At the moment a producer generates a task, not sending it right away will only increase that task's latency. Therefore I will assume that the task dispatcher works on snapshots of the current task queue: it takes all the tasks in the queue, sends them immediately in all directions, goes back to the queue, again takes all the tasks accumulated in the meantime, lather, rinse, repeat.

The dispatcher maintains an estimate of the completion time of each consumer. It orders the consumers according to increasing completion time and adds a task to the batch of the consumer with earliest completion time. Then it adds the average task time to that consumers completion time estimate, thus obtaining new estimate, then reorders the consumers according to the new estimates (in O(log n) using a heap) and goes to the next task. After all the tasks of the current snapshot are processed, send batches to the consumers and go make a new snapshot.

This policy will achieve equal consumer load on average. It can be improved:

  • if each consumer is able to provide some feedback about the estimated completion time: it's the average task time multiplied by the number of tasks pending in the consumer. It's more precise because the consumer will use the actual time of the completed tasks instead of the average

  • if the time to process each task is either known or can be estimated per-task, so the dispatcher will use a per-task estimate instead of an average.

EDIT: Forgot to mention:

The completion time is estimated as start-time + average-task-time * number-of-tasks-sent-to-a-consumer + latency * number-of-batches-sent-to-a-consumer.

like image 156
chill Avatar answered Nov 09 '22 04:11

chill