I use std::for_each with std::execution::par to perform a complex computation on a huge input represented as a vector of structures. The computation involves no hardware-related waiting (no network or disk I/O, for example); it is "just CPU" computation. To me it seems logical that there is no sense in creating more OS threads than we have hardware ones; however, Visual C++ 2019 creates on average 50 threads, and sometimes up to 500, even though there are only 12 hardware threads.
Is there a way to limit the parallel thread count to hardware_concurrency with std::for_each and std::execution::par, or is the only way to get a reasonable thread count to write custom code with std::thread?
No, it won't start 1000 threads - yes, it will limit how many threads are used. Parallel Extensions uses an appropriate number of cores, based on how many you physically have and how many are already busy.
Use another overload of Parallel.ForEach that takes a ParallelOptions instance, and set MaxDegreeOfParallelism to limit how many instances execute in parallel.
Basically, the thread pool behind all the various Parallel library functions will work out an optimum number of threads to use. The number of physical processor cores forms only part of the equation. There is NOT a simple one-to-one relationship between the number of cores and the number of threads spawned.
Is it possible to limit the thread count for C++17 parallel for_each?
No, at least not in C++17.
However, there is a proposal for executors in an upcoming standard, which basically gives you the ability to influence the execution context (in terms of location and time) for the high-level STL algorithm interface:
thread_pool pool{ std::thread::hardware_concurrency() };
auto exec = pool.executor();
std::for_each(std::execution::par.on(exec), begin(data), end(data), some_operation);
Until then, you have to either trust your compiler vendor to know what is best for overall performance, as e.g. the developers of Visual Studio state:
Scheduling in our implementation is handled by the Windows system thread pool. The thread pool takes advantage of information not available to the standard library, such as what other threads on the system are doing, what kernel resources threads are waiting for, and similar. It chooses when to create more threads, and when to terminate them. It’s also shared with other system components, including those not using C++.
The other option would be to give up relying solely on the standard library and use STL implementations that already feature the new proposal.