
Is the Visual C++ implementation of std::async using a thread pool legal

Visual C++ uses the Windows thread pool (Vista's CreateThreadpoolWork if available and QueueUserWorkItem if not) when calling std::async with std::launch::async.

The number of threads in the pool is limited. If you create several tasks that run for a long time without sleeping or blocking (including on I/O), the tasks queued after them won't get a chance to run.

The standard (I'm using N4140) says that using std::async with std::launch::async

... calls INVOKE(DECAY_COPY(std::forward<F>(f)), DECAY_COPY(std::forward<Args>(args))...) (20.9.2, 30.3.1.2) as if in a new thread of execution represented by a thread object with the calls to DECAY_COPY() being evaluated in the thread that called async.

(§30.6.8p3, Emphasis mine.)

std::thread's constructor creates a new thread etc.

About threads in general it says (§1.10p3):

Implementations should ensure that all unblocked threads eventually make progress. [Note: Standard library functions may silently block on I/O or locks. Factors in the execution environment, including externally-imposed thread priorities, may prevent an implementation from making certain guarantees of forward progress. —end note]

If I create a bunch of OS threads or std::threads, all performing some very long (perhaps infinite) tasks, they'll all be scheduled (at least on Windows; without messing with priorities, affinities, etc.). If we schedule the same tasks on the Windows thread pool (or use std::async(std::launch::async, ...), which does that), the tasks scheduled later won't run until the earlier tasks finish.

Is this legal, strictly speaking? And what does "eventually" mean?


The problem is that if the tasks scheduled first are de facto infinite, the rest of the tasks won't run. So the other threads (not OS threads, but "C++ threads" according to the as-if rule) won't make progress.

One may argue that if the code has infinite loops the behavior is undefined, and thus it's legal.

But I argue that we don't need an infinite loop of the problematic kind the standard says causes UB to make that happen. Accessing volatile objects and performing atomic operations and synchronization operations are all side effects that "disable" the assumption that loops terminate.

(I have a bunch of async calls executing the following lambda

auto lambda = [&] {
    while (m.try_lock() == false) {
        for (size_t i = 0; i < (2 << 24); i++) {
            vi++;
        }
        vi = 0;
    }
};

and the lock is released only upon user input. But there are other valid kinds of legitimate infinite loops.)

If I schedule a couple of such tasks, tasks I schedule after them don't get to run.

A really wicked example would be launching too many tasks that run until a lock is released/a flag is raised, and then scheduling, using std::async(std::launch::async, ...), a task that raises the flag. Unless the word "eventually" means something very surprising, this program has to terminate. But under the VC++ implementation it won't!

To me this seems like a violation of the standard. What makes me wonder is the second sentence of the note: factors may prevent implementations from making certain guarantees of forward progress. So how are these implementations conforming?

It's like saying there may be factors preventing implementations from providing certain aspects of memory ordering, atomicity, or even the existence of multiple threads of execution. Great, but conforming hosted implementations must support multiple threads. Too bad for them and their factors: if they can't provide them, that's not C++.

Is this a relaxation of the requirement? If interpreted that way, it's a complete withdrawal of the requirement, since it doesn't specify what the factors are and, more importantly, which guarantees implementations may fail to supply.

If not - what does that note even mean?

I recall footnotes being non-normative according to the ISO/IEC Directives, but I'm not sure about notes. I did find in the ISO/IEC directives the following:

24 Notes

24.1 Purpose or rationale

Notes are used for giving additional information intended to assist the understanding or use of the text of the document. The document shall be usable without the notes.

Emphasis mine. If I consider the document without that unclear note, it seems to me that threads must make progress, that std::async(std::launch::async, ...) has the effect as if the functor were executed on a new thread, as if it were created using std::thread, and thus that functors dispatched using std::async(std::launch::async, ...) must make progress. And in the VC++ implementation with the thread pool they don't. So VC++ is in violation of the standard in this respect.


Full example, tested using VS 2015U3 on Windows 10 Enterprise 1607 on i5-6440HQ:

#include <iostream>
#include <future>
#include <atomic>
#include <mutex>
#include <vector>

int main() {
    volatile int vi{};
    std::mutex m{};
    m.lock();

    auto lambda = [&] {
        while (m.try_lock() == false) {
            for (size_t i = 0; i < (2 << 10); i++) {
                vi++;
            }
            vi = 0;
        }
        m.unlock();
    };

    std::vector<decltype(std::async(std::launch::async, lambda))> v;

    int threadCount{};
    std::cin >> threadCount;
    for (int i = 0; i < threadCount; i++) {
        v.emplace_back(std::move(std::async(std::launch::async, lambda)));
    }

    auto release = std::async(std::launch::async, [&] {
        __asm int 3; // breakpoint
        std::cout << "foo" << std::endl;
        vi = 123;
        m.unlock();
    });

    return 0;
}

With 4 or fewer it terminates. With more than 4 it doesn't.


Similar questions:

  • Is there an implementation of std::async which uses thread pool? - But it doesn't ask about legality, and doesn't have an answer anyway.

  • std::async - Implementation dependent usage? - Mentions that "thread pools are not really supported" but focuses on thread_local variables (which is solvable even if "not straightforward" or non-trivial as the answer and comment say) and doesn't address the note near the requirement of making progress.

asked Mar 12 '17 by conio



1 Answer

The situation has been clarified somewhat in C++17 by P0296R2. Unless the Visual C++ implementation documents that its threads do not provide concurrent forward progress guarantees (which would be generally undesirable), the bounded thread pool is not conforming (in C++17).

The note about "externally imposed thread priorities" has been removed, perhaps because it is already always possible for the environment to prevent the progress of a C++ program (if not by priority, then by being suspended, and if not that, then by power or hardware failure).

There is one remaining normative "should" in that section, but it pertains (as conio mentioned) only to lock-free operations, which can be delayed indefinitely by frequent concurrent access by other threads to the same cache line (not merely the same atomic variable). (I think that in some implementations this can happen even if the other threads are only reading.)

answered Sep 24 '22 by Davis Herring