Trivia
Usually, when I want to write a multi-threaded program in C++, I query the number of concurrent threads the hardware supports, as follows:
unsigned int numThreads = std::thread::hardware_concurrency();
This returns the total number of concurrent threads the machine supports. Hence, if we have 2 CPUs, each of which can support 12 threads, numThreads will be equal to 24.
Problem
Recently I used numactl to force a program to run on ONE CPU ONLY.
numactl -N 1 ./a.out
The problem is that std::thread::hardware_concurrency() returns 24 even when I run it with numactl -N 1. However, under such settings the output of nproc is 12.
numactl -N 1 nproc --> output = 12
Question
Perhaps std::thread::hardware_concurrency() is not designed to support such a scenario. That's not my concern. My question is: what is the best practice for getting the supported number of threads when I want to run my program with numactl?
Further information
In case you haven't dealt with numactl, it can be used to run a process under a NUMA policy. For example, you can use it to force your program to run on one CPU only. The usage for such a case is shown above.
On some systems the maximum number of threads per process is 512. Where such a fixed limit exists, it can be retrieved at compile time from the PTHREAD_THREADS_MAX symbolic constant of the POSIX threads (pthread) API. Not every platform defines this constant.
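As a minimal sketch of how that limit could be queried on a POSIX system: the function name maxThreadsPerProcess is hypothetical, and the code assumes <limits.h> and sysconf are available. It uses PTHREAD_THREADS_MAX when the platform defines it and otherwise falls back to a runtime sysconf query, which returns -1 when there is no fixed limit.

```cpp
#include <climits>   // PTHREAD_THREADS_MAX, if the platform defines it
#include <unistd.h>  // sysconf, _SC_THREAD_THREADS_MAX

// Hypothetical helper: per-process thread limit, or -1 if the system
// reports no fixed limit (or the query fails).
long maxThreadsPerProcess() {
#ifdef PTHREAD_THREADS_MAX
    // Compile-time constant, only present on systems with a fixed limit.
    return PTHREAD_THREADS_MAX;
#else
    // Runtime query; -1 means the limit is indeterminate.
    return sysconf(_SC_THREAD_THREADS_MAX);
#endif
}
```

A result of -1 should be read as "no fixed per-process limit reported", not as an error to abort on.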
How to retrieve the maximum thread count: on Linux, the kernel parameter threads-max controls the system-wide maximum number of threads. Its value can be read from the file /proc/sys/kernel/threads-max. For example, an output of 63704 indicates that the kernel can run at most 63,704 threads.
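For illustration, the same value could be read from inside a C++ program. This is only a sketch, assuming a Linux system with /proc mounted; the function name readThreadsMax is hypothetical.

```cpp
#include <fstream>

// Hypothetical helper: reads the system-wide thread limit from procfs.
// Returns -1 if the file is missing or cannot be parsed.
long readThreadsMax() {
    std::ifstream in("/proc/sys/kernel/threads-max");
    long value = -1;
    if (!(in >> value)) {
        return -1;  // not Linux, /proc not mounted, or unexpected content
    }
    return value;
}
```

Note that this is a limit on how many threads the kernel can manage in total, not a recommendation for how many worker threads a program should start.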
There is nothing in the C++ standard that limits the number of threads. However, the OS will certainly have a hard limit. Having too many threads decreases the throughput of your application, so it's recommended that you use a thread pool.
On Windows machines, there's no limit specified for threads. Thus, we can create as many threads as we want, until our system runs out of available system memory.
You'll have to use OS-specific calls to inquire about the limitations that the OS imposes on your process.
hardware_concurrency potentially returns a hint to the number of threads supported (by your hardware), or may return 0. The OS can limit your process to fewer threads than this number (or could potentially use more), whether using tools like numactl, normal scheduling, or some other means. There is always the possibility that some process or user will change the allowable CPU set, which can affect the available concurrency. A typical C++ program is not expected to have to concern itself with these details, particularly since changes in the number of available threads are often transient.
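One possible OS-specific approach on Linux is to count the CPUs in the calling thread's affinity mask, which is the set that numactl -N restricts. The sketch below is an assumption-laden example, not a standard facility: it relies on the Linux-only sched_getaffinity call and the glibc CPU_COUNT macro (available by default with g++), and the function name usableThreads is hypothetical. It falls back to std::thread::hardware_concurrency() if the query fails, and to 1 if that returns 0.

```cpp
#include <sched.h>   // sched_getaffinity, cpu_set_t, CPU_ZERO, CPU_COUNT (Linux/glibc)
#include <thread>    // std::thread::hardware_concurrency

// Hypothetical helper: number of CPUs this process is currently allowed to run on.
unsigned int usableThreads() {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    // pid 0 means "the calling thread"; the mask reflects limits set by
    // tools such as numactl or taskset.
    if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
        int allowed = CPU_COUNT(&mask);
        if (allowed > 0) {
            return static_cast<unsigned int>(allowed);
        }
    }
    // Fallback: the standard hint, which may itself be 0.
    unsigned int hint = std::thread::hardware_concurrency();
    return hint != 0 ? hint : 1;
}
```

Replacing the call in `unsigned int numThreads = std::thread::hardware_concurrency();` with `usableThreads()` would be expected to follow the restricted CPU set under `numactl -N 1`. The mask can change while the program runs, so the value is a snapshot taken at the time of the call.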