I wrote a test program which consists of just an infinite loop with some computations inside, and performs no I/O operations. I tried starting two instances of the program, one with a high niceness value, and the other with a low niceness value:
sudo nice -n 19 taskset 1 ./test
sudo nice -n -20 taskset 1 ./test
The taskset command ensures that both programs execute on the same core. Contrary to my expectation, top reports that both programs get about 50% of the computation time. Why is that? Does the nice command even have an effect?
The behavior you are seeing is almost certainly caused by the autogroup feature that was added in Linux 2.6.38 (released in March 2011). Presumably you ran the two commands in different terminal windows. If you had run them in the same terminal window, you should have seen the nice value have an effect. The rest of this answer elaborates the story.
The kernel provides a feature known as autogrouping to improve interactive desktop performance in the face of multiprocess, CPU-intensive workloads such as building the Linux kernel with large numbers of parallel build processes (i.e., the make(1) -j flag).
A new autogroup is created when a new session is created via setsid(2); this happens, for example, when a new terminal window is started. A new process created by fork(2) inherits its parent's autogroup membership. Thus, all of the processes in a session are members of the same autogroup.
When autogrouping is enabled, all of the members of an autogroup are placed in the same kernel scheduler "task group". The Linux kernel scheduler employs an algorithm that equalizes the distribution of CPU cycles across task groups. The benefits of this for interactive desktop performance can be described via the following example.
Suppose that there are two autogroups competing for the same CPU
(i.e., presume either a single CPU system or the use of taskset(1)
to confine all the processes to the same CPU on an SMP system).
The first group contains ten CPU-bound processes from a kernel
build started with make -j10. The other contains a single
CPU-bound process: a video player. The effect of autogrouping is that
the two groups will each receive half of the CPU cycles. That is,
the video player will receive 50% of the CPU cycles, rather than
just 9% of the cycles, which would likely lead to degraded video
playback. The situation on an SMP system is more complex, but the
general effect is the same: the scheduler distributes CPU cycles
across task groups such that an autogroup that contains a large
number of CPU-bound processes does not end up hogging CPU cycles
at the expense of the other jobs on the system.
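The arithmetic behind the 50% and 9% figures can be checked with a small sketch. The helper name share_pct is hypothetical, and the calculation assumes all eleven processes are CPU-bound with equal nice values on a single CPU.

```shell
# share_pct PART TOTAL: integer percentage PART/TOTAL, rounded to nearest.
# (Hypothetical helper, used only for this illustration.)
share_pct() {
    echo $(( ($1 * 100 + $2 / 2) / $2 ))
}

# Without autogrouping: the player is 1 of 11 equal CPU-bound processes.
without_autogroup=$(share_pct 1 11)

# With autogrouping: two task groups split the CPU evenly, and the
# player is the only process in its group.
with_autogroup=$(share_pct 1 2)

echo "video player without autogrouping: ${without_autogroup}%"   # 9%
echo "video player with autogrouping:    ${with_autogroup}%"      # 50%
```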
The nice value and group scheduling
When scheduling non-real-time processes (e.g., those scheduled under the default SCHED_OTHER policy), the scheduler employs a technique known as "group scheduling", under which threads are scheduled in "task groups".
Task groups are formed in various circumstances, with the relevant case here being autogrouping.
If autogrouping is enabled, then all of the threads that are
(implicitly) placed in an autogroup (i.e., the same session, as
created by setsid(2)) form a task group. Each new autogroup is
thus a separate task group.
Under group scheduling, a thread's nice value has an effect for
scheduling decisions only relative to other threads in the same
task group. This has some surprising consequences in terms of the
traditional semantics of the nice value on UNIX systems. In particular, if autogrouping is enabled (which is the default in various Linux distributions), then employing nice(1) on a process has an effect only for scheduling relative to other processes executed in the same session (typically: the same terminal window).
Conversely, for two processes that are (for example) the sole CPU-bound processes in different sessions (e.g., different terminal windows, each of whose jobs are tied to different autogroups), modifying the nice value of the process in one of the sessions has no effect in terms of the scheduler's decisions relative to the process in the other session. This presumably is the scenario you saw, though you don't explicitly mention using two terminal windows.
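To get a feel for the size of the effect, here is a rough sketch that compares the two situations for the nice values in the question. It assumes the scheduler divides CPU time in proportion to per-nice load weights, and the three weights below are quoted from memory of the kernel's sched_prio_to_weight table, so treat them as an assumption and check them against your kernel source. The helper names nice_weight and cpu_share are hypothetical.

```shell
# nice_weight NICE: scheduler load weight for a few nice values.
# Assumption: values recalled from the kernel's sched_prio_to_weight table.
nice_weight() {
    case "$1" in
        -20) echo 88761 ;;
        0)   echo 1024 ;;
        19)  echo 15 ;;
        *)   echo "unsupported nice value: $1" >&2; return 1 ;;
    esac
}

# cpu_share MINE OTHER: percentage of one CPU received by an entity with
# weight MINE when competing against a single entity with weight OTHER.
cpu_share() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", 100 * a / (a + b) }'
}

# Same session (same autogroup): nice 19 competes directly with nice -20.
cpu_share "$(nice_weight 19)" "$(nice_weight -20)"   # prints 0.02

# Different sessions, both autogroups at autogroup nice 0: the groups
# split the CPU, and each process is alone in its group, so its own
# nice value has nothing to be compared against.
cpu_share "$(nice_weight 0)" "$(nice_weight 0)"      # prints 50.00
```

Under these assumptions the nice 19 process would get roughly 0.02% of the CPU when both run in one terminal, versus the 50% observed with two terminals.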
If you want to prevent autogrouping from interfering with the traditional nice behavior as described here, you can disable the feature (as root):
echo 0 > /proc/sys/kernel/sched_autogroup_enabled
Be aware though that this will also have the effect of disabling the benefits for desktop interactivity that the autogroup feature was intended to provide (see above).
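Before changing anything, it may help to check the current setting. The following is a minimal sketch; autogroup_state is a hypothetical helper, and it assumes the file holds a plain 0 or 1. It reports "unavailable" if the file does not exist, which would suggest the kernel was built without autogroup support.

```shell
# autogroup_state [FILE]: report whether autogrouping is on, reading FILE
# (defaults to the sysctl file named above). Hypothetical helper.
autogroup_state() {
    file=${1:-/proc/sys/kernel/sched_autogroup_enabled}
    if [ ! -r "$file" ]; then
        echo "unavailable"
        return 0
    fi
    case "$(cat "$file")" in
        0) echo "disabled" ;;
        1) echo "enabled" ;;
        *) echo "unknown" ;;
    esac
}

autogroup_state

# Note for non-root shells: "sudo echo 0 > FILE" does not work, because the
# redirection is performed by the calling (unprivileged) shell. One common
# workaround is:
#   echo 0 | sudo tee /proc/sys/kernel/sched_autogroup_enabled
```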
The autogroup nice value
A process's autogroup membership can be viewed via the file /proc/[pid]/autogroup:
$ cat /proc/1/autogroup
/autogroup-1 nice 0
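To check whether two shells really are in different autogroups, you can compare this file for each of them. A small sketch follows, assuming the line always has the three-field format shown above; parse_autogroup is a hypothetical helper name.

```shell
# parse_autogroup LINE: print "<group> <nice>" for a line such as
# "/autogroup-1 nice 0". Fails if the line does not have that shape.
parse_autogroup() {
    # Split on whitespace: $1 = group name, $2 = literal "nice", $3 = value.
    set -- $1
    [ "$#" -eq 3 ] && [ "$2" = "nice" ] || return 1
    echo "${1#/} $3"
}

# Show the autogroup of the current shell, if the kernel exposes it.
if [ -r /proc/self/autogroup ]; then
    parse_autogroup "$(cat /proc/self/autogroup)" || echo "unexpected format"
else
    echo "autogroup file not available on this kernel"
fi
```

Running this in each terminal window should print different group names if the windows belong to different sessions.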
This file can also be used to modify the CPU bandwidth allocated to an autogroup. This is done by writing a number in the "nice" range to the file to set the autogroup's nice value. The allowed range is from +19 (low priority) to -20 (high priority).
The autogroup nice setting has the same meaning as the process nice value, but applies to distribution of CPU cycles to the autogroup as a whole, based on the relative nice values of other autogroups. For a process inside an autogroup, the CPU cycles that it receives will be a product of the autogroup's nice value (compared to other autogroups) and the process's nice value (compared to other processes in the same autogroup).
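Setting the autogroup nice value could be wrapped as in the sketch below. The helper names valid_nice and set_autogroup_nice are hypothetical, the range check simply mirrors the -20 to 19 range stated above, and raising an autogroup's priority (a lower nice value) will typically need root.

```shell
# valid_nice N: succeed if N is a plain integer in the range -20..19.
valid_nice() {
    case "$1" in
        ''|*[!0-9-]*|?*-*|-) return 1 ;;   # empty, non-numeric, or stray '-'
    esac
    [ "$1" -ge -20 ] && [ "$1" -le 19 ]
}

# set_autogroup_nice PID N: write N to /proc/PID/autogroup after checking it.
set_autogroup_nice() {
    valid_nice "$2" || { echo "nice value out of range: $2" >&2; return 1; }
    echo "$2" > "/proc/$1/autogroup"
}

# Example (not executed here): lower the priority of this shell's autogroup.
#   set_autogroup_nice $$ 10
```

With this, one terminal's whole session can be deprioritized relative to another even while autogrouping stays enabled.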
I put together a test.c that just does:
int main(void)
{
    for (;;)
    {
    }
}
I then ran it with your two nice values. Rather than running a separate sudo for each one, I started an interactive root shell with sudo and launched both from there, putting each in the background with &.
I got one ./test hitting my CPU hard, and one barely touching it.
Naturally, the system still felt quite responsive; it takes a lot of CPU-hogging processes on modern processors to get so much load you can "feel" it.
That stands in contrast to I/O-hogging processes and memory-hogging processes; in these cases a single greedy process can make a system painful to use.
I'd guess that either your system has an unusual priority-related bug (or subtlety), or there's something up with your methodology.
I ran my test on an Ubuntu 11.04 system.
I'm assuming that there's a & missing at the end of the command line. Otherwise, the second line won't run until the first completes.
While both processes are running, use something like top and make sure that they each have the nice value that you assigned.
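As an alternative to top, the nice value can be read straight from /proc/[pid]/stat, where it is field 19. This is only a sketch of that swapped-in approach; nice_from_stat and nice_of are hypothetical helper names.

```shell
# nice_from_stat LINE: extract the nice value (field 19) from the contents
# of a /proc/[pid]/stat file.
nice_from_stat() {
    # Drop "pid (comm) " first, so that a command name containing spaces
    # or parentheses cannot shift the field positions.
    set -- ${1##*) }
    # The remaining words start at field 3 (state), so field 19 is word 17.
    shift 16
    echo "$1"
}

# nice_of PID: nice value of a running process.
nice_of() {
    stat_line=$(cat "/proc/$1/stat") || return 1
    nice_from_stat "$stat_line"
}

# Example (not executed here), with PIDs taken from top or pgrep:
#   nice_of 12345
```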
What happens if you launch the processes using only taskset and then adjust their priority with renice after they are running?