I've been investigating different scheduling algorithms for a thread pool I am implementing. Due to the nature of the problem I am solving I can assume that the tasks being run in parallel are independent and do not spawn any new tasks. The tasks can be of varying sizes.
I went immediately for the most popular scheduling algorithm "work stealing" using lock-free deques for the local job queues, and I am relatively happy with this approach. However I'm wondering whether there are any common cases where work-stealing is not the best approach.
For this particular problem I have a good estimate of the size of each individual task. Work-stealing does not make use of this information and I'm wondering if there is any scheduler which will give better load-balancing than work-stealing with this information (obviously with the same efficiency).
NB. This question ties up with a previous question.
I'd distribute the tasks upfront. With the information of their estimated running time you can distribute them into individual queues, for each thread one.
Distributing the tasks is basically the knapsack problem, each queue should take the same amount of time.
You should add some logic to modify the queues while they run. For example a re-distribution should occur after the estimated running time differs by a certain amount from the real running time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With