I've been thinking about std::async and how one should use it with future compiler implementations. However, right now I'm a bit stuck on something that feels like a design flaw.
std::async is pretty much implementation dependent, with probably two variants of launch::async: one that launches the task in a new thread and one that uses a thread pool/task scheduler.
However, depending on which of these variants is used to implement std::async, the usage would vary greatly.
For the "thread-pool" based variant you would be able to launch a lot of small tasks without worrying much about overhead. But what if one of the tasks blocks at some point?
On the other hand, a "launch new thread" variant wouldn't suffer from blocking tasks, but the overhead of launching and executing tasks would be very high.
thread-pool: +low-overhead, -never ever block
launch new thread: +fine with blocks, -high overhead
So basically, depending on the implementation, the way we use std::async would vary very much. If we have a program that works well with one compiler, it might work horribly with another.
Is this by design? Or am I missing something? Would you consider this, as I do, as a big problem?
In the current specification I am missing something like std::oversubscribe(bool) in order to enable implementation-independent usage of std::async.
EDIT: As far as I have read, the C++11 standard document does not give any hints as to whether tasks sent to std::async may block or not.
std::async tasks launched with a policy of std::launch::async run "as if in a new thread", so thread pools are not really supported: the runtime would have to tear down and recreate all the thread-local variables between task executions, which is not straightforward.
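To make the thread-local point concrete, here is a minimal sketch. The names calls_on_this_thread, bump and observe_counters are made up for this illustration. It runs several tasks one after another with std::launch::async and records the value of a thread_local counter that each task sees. Under the "as if in a new thread" rule, every task should start from a freshly initialized counter.

```cpp
#include <cassert>
#include <future>
#include <vector>

// Hypothetical per-thread counter, used only for this illustration.
thread_local int calls_on_this_thread = 0;

// Increments the calling thread's counter and returns the new value.
int bump() { return ++calls_on_this_thread; }

// Runs n tasks strictly one after another with std::launch::async and
// collects the counter value that each task observed.
std::vector<int> observe_counters(int n) {
    assert(n >= 0);
    std::vector<int> seen;
    for (int i = 0; i < n; ++i) {
        // get() waits for the task to finish before the next one is launched.
        seen.push_back(std::async(std::launch::async, bump).get());
    }
    return seen;
}
```

On a conforming implementation every entry returned by observe_counters(4) should be 1. A pool that reused a worker thread without resetting its thread-locals would presumably report increasing values instead, which is the behaviour the "as if" wording rules out.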
This also means that you can expect tasks started with a policy of std::launch::async to run concurrently. There may be a start-up delay, and there will be task-switching if you have more running threads than processors, but they should be running, and not deadlock just because one happens to wait for another.
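As a rough sketch of the "one waits for another" case (run_dependent_tasks and the variable names are invented for this example): the first task blocks on a value that only the second task delivers. With std::launch::async both tasks get their own thread, so the first one is eventually unblocked.

```cpp
#include <cassert>
#include <future>

// Starts two std::launch::async tasks where the first cannot finish
// until the second has run, then returns the first task's result.
int run_dependent_tasks() {
    std::promise<int> handoff;
    std::future<int> handed = handoff.get_future();
    assert(handed.valid());

    // Started first: blocks until the producer delivers a value.
    std::future<int> consumer = std::async(std::launch::async,
        [&handed] { return handed.get() + 1; });

    // Started second: delivers the value that unblocks the consumer.
    std::future<void> producer = std::async(std::launch::async,
        [&handoff] { handoff.set_value(41); });

    producer.get();         // wait for the producer to finish
    return consumer.get();  // 41 + 1
}
```

If the same two tasks were queued on a hypothetical pool with a single worker, the consumer would occupy that worker while the producer never got to run, which is exactly the blocking hazard raised in the question.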
An implementation may choose to offer an extension that allows your tasks to run in a thread pool, in which case it is up to that implementation to document the semantics.
I would expect implementations to launch new threads, and leave the thread pool to a future version of C++ that standardizes it. Are there any implementations that use a thread pool?
MSVC initially used a thread pool based on their Concurrency Runtime. According to "STL Fixes In VS 2015, Part 2", this has been removed. The C++ specification left some room for implementers to do clever things, but I don't think it left quite enough room for this thread-pooling implementation. In particular, I think the spec still required that thread_local objects be destroyed and rebuilt, and thread pooling with ConcRT would not have supported that.