When I run my multi-threaded code, the system (Linux) sometimes moves threads from one processor to another. Since I have as many threads as I have processors, this invalidates caches for no good reason and confuses my tracing.
Do you know how to bind threads to processors, and why would a system do this?
On a system with multiple processors or CPU cores (as is common with modern processors), multiple processes or threads can be executed in parallel. On a single processor, though, it is not possible to have processes or threads truly executing at the same time.
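Note that "thread" also has a hardware meaning. With simultaneous multithreading (for example, Intel Hyper-Threading), one physical core presents itself to the operating system as several logical processors, commonly 2 per core. For example, a dual-core CPU with 2 hardware threads per core appears as 4 logical CPUs. These hardware threads are distinct from the software threads a program creates, which the scheduler places on logical CPUs.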
Processor affinity, or CPU pinning, lets an application bind a process or a thread to a specific core or to a range of cores or CPUs, and unbind it again. Once a thread is pinned, the operating system ensures that it executes only on the assigned core(s) or CPU(s) each time it is scheduled.
Cpumasks are the Linux kernel's way of storing information about the CPUs in the system. The relevant source code and header files that contain the API for cpumask manipulation: include/linux/cpumask.
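To make the idea concrete: a cpumask is a bitmap with one bit per CPU, and taskset (without -c) takes the same kind of mask in hexadecimal. The sketch below only illustrates that encoding in user space; cpus_to_mask and mask_to_cpus are hypothetical helper names, not part of any kernel or library API.

```python
def cpus_to_mask(cpus):
    """Return an integer with bit n set for every CPU number n in `cpus`."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def mask_to_cpus(mask):
    """Return the set of CPU numbers whose bits are set in `mask`."""
    cpus = set()
    cpu = 0
    while mask:
        if mask & 1:
            cpus.add(cpu)
        mask >>= 1
        cpu += 1
    return cpus


print(hex(cpus_to_mask({0, 2})))   # CPUs 0 and 2 -> 0x5
print(sorted(mask_to_cpus(0x5)))   # -> [0, 2]
```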
Use sched_setaffinity (this is Linux-specific).
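As a minimal sketch of that call, here it is through Python's os.sched_setaffinity and os.sched_getaffinity, which wrap the Linux calls of the same name; a C program would call sched_setaffinity directly. Passing 0 as the pid means "the calling thread". The helper name pin_current_thread is made up for this example, and the CPU is taken from the currently allowed set because a container or cpuset may not permit CPU 0.

```python
import os


def pin_current_thread(cpu):
    """Restrict the calling thread to one CPU and return its new affinity set."""
    # pid 0 means "the caller"; on Linux this applies to the calling thread.
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


allowed = os.sched_getaffinity(0)   # CPUs this thread may currently run on
target = min(allowed)               # choose one that is certainly permitted
print(pin_current_thread(target))   # affinity set now contains only `target`
os.sched_setaffinity(0, allowed)    # restore the original affinity
```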
Why would a scheduler switch threads between different processors? Well, imagine that your thread last ran on processor 1 and is currently waiting to be scheduled for execution again. In the meantime, a different thread is currently running on processor 1, but processor 2 is free. In this situation, it's reasonable for the scheduler to switch your thread to processor 2. However, a sophisticated scheduler will try to avoid "bouncing" a thread between processors more than necessary.
You can do this from bash. There is a wonderful taskset command that I came across in this question (you may also find a valuable discussion there on how a scheduler should operate). The command takes the pid of a process and binds it to the specified processor(s).
taskset -cp 0 PID
binds the process with PID to processor (core) number 0.
What does this have to do with threads? Each thread is assigned an identifier with the same standing as a pid, known as the "tid", so it can be passed to taskset in place of a pid. You can get it with the gettid syscall, or you can see it in the top program by pressing H (some processes will split into many seemingly identical entries with different pids; those are threads).
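Putting the tid and the affinity call together for the case in the question (as many threads as processors), a rough sketch could look like the following. It assumes Python 3.8+ on Linux, where threading.get_native_id() returns the same kernel thread id that gettid does; worker and run_pinned_threads are hypothetical names chosen for illustration.

```python
import os
import threading


def worker(cpu, results):
    """Pin the calling thread to `cpu` and record its tid and affinity."""
    tid = threading.get_native_id()        # kernel thread id (what gettid returns)
    os.sched_setaffinity(tid, {cpu})       # affects only this thread
    results[cpu] = (tid, os.sched_getaffinity(tid))


def run_pinned_threads():
    """Start one thread per allowed CPU, each pinned to its own CPU."""
    cpus = sorted(os.sched_getaffinity(0))
    results = {}
    threads = [threading.Thread(target=worker, args=(cpu, results)) for cpu in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


for cpu, (tid, affinity) in sorted(run_pinned_threads().items()):
    print(f"tid {tid} pinned to CPU {cpu}: affinity {sorted(affinity)}")
```

Each thread changes only its own affinity, so the thread that started the workers keeps its original CPU set.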