I know TBB (Threading Building Blocks) claims to have a sophisticated engine, but from an algorithmic point of view:
Suppose we had (say, on Linux) a work queue with N worker threads (POSIX threads, where N is the number of cores) and a mutex-synchronized queue of tasks, with each worker thread taking a task from the queue whenever it is idle, plus a few synchronization calls. What else could TBB offer, not counting the nice C++ syntax? I don't see a better algorithm than greedy assignment of tasks to cores.
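For concreteness, the setup described above might look roughly like the following. This is only a minimal sketch, using `std::thread` rather than raw POSIX threads; `SimpleWorkQueue` and `count_with_pool` are hypothetical names made up for illustration, and none of this reflects how TBB is implemented.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical class: one shared queue, one mutex, N worker threads.
class SimpleWorkQueue {
public:
    explicit SimpleWorkQueue(std::size_t n_threads) {
        assert(n_threads > 0);
        for (std::size_t i = 0; i < n_threads; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Lets the workers drain the remaining tasks, then joins them.
    ~SimpleWorkQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                // Every worker contends on this single lock.
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run the task outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;  // declared last: started after the rest exists
};

// Hypothetical helper: submit n_tasks increments and return the final count.
int count_with_pool(std::size_t n_threads, int n_tasks) {
    std::atomic<int> counter{0};
    {
        SimpleWorkQueue pool(n_threads);
        for (int i = 0; i < n_tasks; ++i)
            pool.submit([&counter] { counter.fetch_add(1); });
    }  // destructor drains the queue and joins the workers
    return counter.load();
}
```

To match "N is the number of cores", `n_threads` could be taken from `std::thread::hardware_concurrency()`, keeping in mind that it may return 0 when the value cannot be determined.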
As somebody who has developed their own work-stealing scheduler, I can say the following:
In fact, it’s not that hard to write a correct scheduler. Writing an efficient one, unfortunately, is hard. An efficient scheduler effectively precludes the use of locks (except perhaps in very specific, well-specified situations), and lock-free cross-thread communication is a world of pain.
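To give a rough idea of what work stealing means structurally, here is a minimal sketch. It is not my scheduler and not TBB's: `WorkerDeque` and `find_task` are hypothetical names, and each deque is guarded by a plain mutex purely to keep the example readable. That is exactly the shortcut an efficient implementation would avoid, typically by using a lock-free deque instead. The point is only the access pattern: each worker pushes and pops at one end of its own deque, and an idle worker steals from the other end of somebody else's.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

using Task = std::function<void()>;

// Hypothetical per-worker deque. The mutex is an assumption made for
// simplicity; it stands in for the lock-free structure a real scheduler needs.
class WorkerDeque {
public:
    void push(Task t) {
        std::lock_guard<std::mutex> lock(m_);
        d_.push_back(std::move(t));
    }

    // The owner takes the newest task (LIFO end).
    bool pop(Task& out) {
        std::lock_guard<std::mutex> lock(m_);
        if (d_.empty()) return false;
        out = std::move(d_.back());
        d_.pop_back();
        return true;
    }

    // A thief takes the oldest task (FIFO end).
    bool steal(Task& out) {
        std::lock_guard<std::mutex> lock(m_);
        if (d_.empty()) return false;
        out = std::move(d_.front());
        d_.pop_front();
        return true;
    }

private:
    std::mutex m_;
    std::deque<Task> d_;
};

// Find work for worker `self`: try its own deque first, then go through
// the other workers in round-robin order and try to steal from each.
bool find_task(std::vector<WorkerDeque>& deques, std::size_t self, Task& out) {
    if (deques[self].pop(out)) return true;
    for (std::size_t k = 1; k < deques.size(); ++k) {
        std::size_t victim = (self + k) % deques.size();
        if (deques[victim].steal(out)) return true;
    }
    return false;
}
```

Compared with a single shared queue, workers mostly touch their own deque and only interact with another worker's when they run out of work, which is where the cross-thread communication (and the pain) comes in.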
As an anecdote, I implemented one scheduler where I essentially just had to translate an existing algorithm into code, and I still managed to introduce almost every race condition imaginable. Debugging this code was a mixture of