In a simple experiment I set NOHZ=OFF and used printk() to print how often the do_timer() function gets called. It gets called every 10 ms on my machine.
However, with NOHZ=ON there is a lot of jitter in how do_timer() gets called. Most of the time it is called every 10 ms, but at times it misses its deadline completely.
I have read up on both do_timer() and NOHZ. do_timer() is the function responsible for updating the jiffies value, and it is also responsible for the round-robin scheduling of processes.
The NOHZ feature switches off the hi-res timers on the system.
What I am unable to understand is how hi-res timers can affect do_timer(). Even if the hi-res hardware is in a sleep state, the persistent clock is more than capable of executing do_timer() every 10 ms. Secondly, if do_timer() is not executing when it should, some processes are not getting their timeshare when they ideally should. A lot of googling does show that, for many people, many applications work much better with NOHZ=OFF.
To make a long story short, how does NOHZ=ON affect do_timer()?
Why does do_timer() miss its deadlines?
First, let's understand what a tickless kernel is (NOHZ=ON, i.e. CONFIG_NO_HZ set) and what motivated its introduction into the Linux kernel in 2.6.21.
From http://www.lesswatts.org/projects/tickless/index.php:
Traditionally, the Linux kernel used a periodic timer for each CPU. This timer did a variety of things, such as process accounting, scheduler load balancing, and maintaining per-CPU timer events. Older Linux kernels used a timer with a frequency of 100Hz (100 timer events per second or one event every 10ms), while newer kernels use 250Hz (250 events per second or one event every 4ms) or 1000Hz (1000 events per second or one event every 1ms).
This periodic timer event is often called "the timer tick". The timer tick is simple in its design, but has a significant drawback: the timer tick happens periodically, irrespective of the processor state, whether it's idle or busy. If the processor is idle, it has to wake up from its power saving sleep state every 1, 4, or 10 milliseconds. This costs quite a bit of energy, consuming battery life in laptops and causing unnecessary power consumption in servers.
With "tickless idle", the Linux kernel has eliminated this periodic timer tick when the CPU is idle. This allows the CPU to remain in power saving states for a longer period of time, reducing the overall system power consumption.
So reducing power consumption was one of the main motivations for the tickless kernel. But, as is often the case, lower power consumption comes at some cost in performance. For desktop computers performance is the primary concern, which is why NOHZ=OFF works well for many of them.
In Ingo Molnar's own words:
The tickless kernel feature (CONFIG_NO_HZ) enables 'on-demand' timer interrupts: if there is no timer to be expired for say 1.5 seconds when the system goes idle, then the system will stay totally idle for 1.5 seconds. This should bring cooler CPUs and power savings: on our (x86) testboxes we have measured the effective IRQ rate to go from HZ to 1-2 timer interrupts per second.
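To put numbers on the two quotes above, here is a minimal sketch that counts how often an idle CPU is woken by the timer with a periodic tick versus with the tick stopped. It is a userspace model, not kernel code. The function names and the list of pending timer expiries are assumptions made up for illustration.

```python
def tick_period_ms(hz):
    """Length of one timer tick in milliseconds for a given HZ value."""
    return 1000.0 / hz


def idle_wakeups_periodic(idle_ms, hz):
    """Timer interrupts taken during an idle stretch when the tick keeps firing."""
    # Whole tick periods that fit into the idle stretch.
    return int(idle_ms * hz // 1000)


def idle_wakeups_tickless(idle_ms, timer_expiries_ms):
    """With the tick stopped, only timers that actually expire wake the CPU.

    timer_expiries_ms is a hypothetical list of pending timer expiry times,
    measured in milliseconds from the moment the CPU went idle.
    """
    return sum(1 for t in timer_expiries_ms if 0 <= t <= idle_ms)


if __name__ == "__main__":
    # Ingo Molnar's example: the next timer is due 1.5 s after going idle.
    for hz in (100, 250, 1000):
        print(hz, tick_period_ms(hz), idle_wakeups_periodic(1500, hz))
    print("tickless:", idle_wakeups_tickless(1500, [1500]))
```

At HZ=100 the periodic tick wakes the idle CPU 150 times in those 1.5 seconds, while the tickless model wakes it once, when the pending timer expires.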
Now, let's try to answer your queries:
What I am unable to understand is how can hi-res timers affect the do_timer ?
If a system supports high-res timers, timer interrupts can be programmed at a much finer granularity than the usual 10 ms tick, say every 100 us, which makes the system more responsive. With high-res timers enabled, the periodic tick is itself driven by such a timer. With the NOHZ option that tick is stopped while the CPU is idle, hence do_timer executes less often.
Even if hi-res hardware is in sleep state the persistent clock is more than capable to execute do_timer every 10ms
Yes, it is capable. But the intention of NOHZ is exactly the opposite: to prevent frequent timer interrupts!
Secondly if do_timer is not executing when it should that means some processes are not getting their timeshare when they should ideally be getting it
As caf noted in the comments, NOHZ does not cause processes to get scheduled less often, because it only kicks in when the CPU is idle, in other words when no processes are schedulable. Only the process accounting is done at a delayed time.
Why does do_timer miss its deadlines?
As elaborated above, this is the intended design of NOHZ. I suggest you go through the tick-sched.c kernel sources as a starting point. Search for CONFIG_NO_HZ and try to understand the new functionality added for the NOHZ feature.
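One thing worth looking for in those sources is how jiffies catches up after a skipped stretch: do_timer takes a tick count, so a single late call can account for several missed ticks. The sketch below is a toy userspace model of that idea, not the kernel's code. TickModel, update_jiffies and run are hypothetical names; only do_timer and jiffies are taken from the kernel.

```python
NSEC_PER_SEC = 1_000_000_000
NSEC_PER_MSEC = 1_000_000


class TickModel:
    """Toy model of jiffies bookkeeping (an illustration, not kernel code)."""

    def __init__(self, hz=100):
        self.tick_ns = NSEC_PER_SEC // hz
        self.jiffies = 0
        self.last_update_ns = 0
        self.do_timer_calls = []  # (time in ns, ticks accounted) per call

    def do_timer(self, now_ns, ticks):
        # Advance jiffies by however many ticks have elapsed.
        self.jiffies += ticks
        self.do_timer_calls.append((now_ns, ticks))

    def update_jiffies(self, now_ns):
        """Account for every whole tick period elapsed since the last update."""
        ticks = (now_ns - self.last_update_ns) // self.tick_ns
        if ticks > 0:
            self.last_update_ns += ticks * self.tick_ns
            self.do_timer(now_ns, ticks)


def run(wakeups_ms, hz=100):
    """Feed a sequence of timer-interrupt times (in ms) through the model."""
    model = TickModel(hz)
    for t in wakeups_ms:
        model.update_jiffies(t * NSEC_PER_MSEC)
    return model


if __name__ == "__main__":
    periodic = run(range(10, 101, 10))  # an interrupt every 10 ms
    tickless = run([10, 20, 100])       # CPU idle between 20 ms and 100 ms
    print(periodic.jiffies, len(periodic.do_timer_calls))
    print(tickless.jiffies, tickless.do_timer_calls)
```

In both runs jiffies ends at the same value. The tickless run simply reaches it with fewer do_timer calls, the last of which accounts for eight ticks at once. A printk() inside do_timer would show that as a gap in the 10 ms pattern.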
Here is one test performed to measure the Impact of a Tickless Kernel
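If you want a rough measurement of your own, one option is to compare two snapshots of /proc/interrupts and look at how fast the local timer count grows. The sketch below assumes an x86 system, where that file has a line starting with "LOC:" holding per-CPU local timer interrupt counts. The function names are made up for this example.

```python
def local_timer_count(proc_interrupts_text):
    """Sum the per-CPU counts on the LOC (local timer interrupts) line."""
    for line in proc_interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOC:":
            total = 0
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the textual description
                total += int(field)
            return total
    raise ValueError("no LOC line found (assumed x86 /proc/interrupts layout)")


def timer_irq_rate(before_text, after_text, interval_s):
    """Local timer interrupts per second, summed over all CPUs."""
    delta = local_timer_count(after_text) - local_timer_count(before_text)
    return delta / interval_s
```

To use it, read /proc/interrupts, sleep for a few seconds on an otherwise idle machine, read it again, and pass both texts together with the sleep duration to timer_irq_rate. With NOHZ=OFF the per-CPU rate should sit near HZ. With NOHZ=ON on an idle system it should be far lower.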