Desired behaviour: run a multi-threaded Linux program on a set of cores which have been isolated using isolcpus.
Here's a small program we can use as an example multi-threaded program:
#include <stdio.h>
#include <pthread.h>
#include <err.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>

#define NTHR 16
#define TIME (60 * 5)	/* seconds to keep the process alive */

void *
do_stuff(void *arg)
{
	int i = 0;

	(void) arg;
	while (1) {
		i += i;
		usleep(10000); /* don't dominate CPU */
	}
}

int
main(void)
{
	pthread_t threads[NTHR];
	int rv, i;

	for (i = 0; i < NTHR; i++) {
		rv = pthread_create(&threads[i], NULL, do_stuff, NULL);
		if (rv) {
			/* pthread_create returns the error; it does not set errno */
			errx(EXIT_FAILURE, "pthread_create: %s", strerror(rv));
		}
	}
	sleep(TIME);
	exit(EXIT_SUCCESS);
}
If I compile and run this on a kernel with no isolated CPUs, then the threads are spread out over my 4 CPUs. Good!
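One way to check where threads land from inside the program, rather than with an external tool, is to have each thread report the CPU it is currently running on. The sketch below assumes glibc's sched_getcpu(); report_cpu and its id parameter are hypothetical names used only for illustration.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/*
 * Print and return the CPU the calling thread is running on right now,
 * or -1 if sched_getcpu() fails. `id` is only a label for the output.
 */
int
report_cpu(int id)
{
	int cpu = sched_getcpu();

	if (cpu < 0)
		perror("sched_getcpu");
	else
		printf("thread %d is on CPU %d\n", id, cpu);
	return (cpu);
}
```

Calling this from inside the loop in do_stuff (with a per-thread id passed through arg) would show whether the threads stay on one core or spread out over several. The value is only a snapshot: the scheduler may move the thread right after the call returns.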
Now if I add isolcpus=2,3 to the kernel command line and reboot:
taskset -c 0,1 has the same effect. Good.
taskset -c 2,3 causes all threads to go onto the same core (either core 2 or 3). This is undesired. Threads should distribute over cores 2 and 3. Right?
This post describes a similar issue (although the example given is farther away from the pthreads API). The OP was happy to work around this by using a different scheduler. I'm not certain this is ideal for my use case, however.
Is there a way to have the threads distributed over the isolated cores using the default scheduler?
Is this a kernel bug which I should report?
EDIT:
The right thing does indeed happen if you use a real-time scheduler like the fifo scheduler. See man sched and man chrt for details.
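The edit above does not say how the real-time policy was applied; chrt from the command line is one option. As an alternative sketch, the policy can be requested from inside the program through thread attributes. init_fifo_attr is a hypothetical helper name, and the priority value is left to the caller.

```c
#include <pthread.h>
#include <sched.h>

/*
 * Prepare thread attributes that request SCHED_FIFO at the given
 * priority. Returns 0 on success or an error number on failure
 * (the pthread functions used here do not set errno).
 */
int
init_fifo_attr(pthread_attr_t *attr, int priority)
{
	struct sched_param param = { .sched_priority = priority };
	int rv;

	rv = pthread_attr_init(attr);
	if (rv != 0)
		return (rv);

	/* Without this, new threads inherit the creating thread's policy. */
	rv = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
	if (rv == 0)
		rv = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
	if (rv == 0)
		rv = pthread_attr_setschedparam(attr, &param);
	if (rv != 0)
		pthread_attr_destroy(attr);
	return (rv);
}
```

The resulting attr would be passed to pthread_create in place of NULL. Creating real-time threads normally needs root, CAP_SYS_NICE, or a suitable RLIMIT_RTPRIO; without that, pthread_create is expected to fail with EPERM. A SCHED_FIFO thread that never sleeps can starve everything else on its core, so the usleep in do_stuff matters more here than under the default scheduler.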
The taskset command is used to set or retrieve the CPU affinity of a running process given its pid, or to launch a new command with a given CPU affinity. CPU affinity is a scheduler property that "bonds" a process to a given set of CPUs on the system.
As per Wikipedia, processor affinity (also called CPU pinning or "cache affinity") enables the binding and unbinding of a process or a thread to a central processing unit (CPU) or a range of CPUs, so that the process or thread will execute only on the designated CPU or CPUs rather than any CPU.
Thread affinity provides a way for an application thread to tell the OS scheduler exactly where its threads can (and would like to) run. The scheduler in turn does not have to spend a lot of time load balancing the system because application threads are already where they need to be.
From the Linux Kernel Parameter Doc:
This option can be used to specify one or more CPUs to isolate from the general SMP balancing and scheduling algorithms.
So this option effectively prevents the scheduler from migrating threads from one core to another, less contended core (SMP balancing). Typically, isolcpus is used together with pthread affinity control: threads are pinned explicitly, using knowledge of the CPU layout, to gain predictable performance.
https://www.kernel.org/doc/Documentation/kernel-parameters.txt
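Since the balancer will not spread threads over isolated cores, the pinning described above can be done by hand. The following is a minimal sketch, assuming glibc's pthread_setaffinity_np and the CPU_* macros; nth_allowed_cpu and pin_thread are hypothetical helper names, and round-robin is just one possible placement policy.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/*
 * Return the CPU that the n-th thread should be pinned to: the
 * (n mod count)-th CPU present in `allowed`, or -1 if the mask is
 * empty or n is negative.
 */
int
nth_allowed_cpu(const cpu_set_t *allowed, int n)
{
	int count = CPU_COUNT(allowed);
	int cpu, seen = 0;

	if (count == 0 || n < 0)
		return (-1);
	n %= count;
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (CPU_ISSET(cpu, allowed) && seen++ == n)
			return (cpu);
	}
	return (-1);
}

/*
 * Pin `thread` to a single CPU. Returns 0 on success or an error
 * number on failure (errno is not set).
 */
int
pin_thread(pthread_t thread, int cpu)
{
	cpu_set_t one;

	CPU_ZERO(&one);
	CPU_SET(cpu, &one);
	return (pthread_setaffinity_np(thread, sizeof(one), &one));
}
```

In the example program, main could fetch its own mask with sched_getaffinity(0, sizeof(mask), &mask) and, after each pthread_create, call pin_thread(threads[i], nth_allowed_cpu(&mask, i)). Run under taskset -c 2,3, the mask contains only cores 2 and 3, so the 16 threads would alternate between them without relying on the scheduler's balancing.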
--Edit--
OK, I see why you are confused. Personally, I would also expect consistent behavior from this option. The difference lies in two functions, select_task_rq_fair and select_task_rq_rt, which are responsible for selecting the run queue for a task (which essentially means selecting the CPU it will run on next). I did a quick trace (SystemTap) of both functions: for CFS it always returned the same first core in the mask; for RT it returned other cores as well. I haven't had a chance to look into the logic of each selection algorithm, but you can email the maintainers on the Linux kernel development mailing list to ask for a fix.