I'm reading the Linux source code to learn how scheduling works. I learned that in a preemptible kernel (CONFIG_PREEMPT is set), there is a chance for preemption when returning to kernel space from an interrupt handler, via a call to preempt_schedule_irq.
However, I also found the following code snippet in preempt_schedule_irq:
    do {
        preempt_disable();
        local_irq_enable();   // why enable interrupts here?
        __schedule(true);     // interrupts would be disabled inside it
        local_irq_disable();
        sched_preempt_enable_no_resched();
    } while (need_resched());
There is a local_irq_enable() call inside it, and this confuses me. Why do we need to enable interrupts here if they will be disabled again near the start of __schedule?
My humble guess is that this gives higher-priority processes a chance to be scheduled first. However, that doesn't make sense, because preemption is already disabled in preempt_schedule_irq: even if an interrupt arrives, it will not trigger a preemptive reschedule.
So what on earth is the point of interrupting the scheduling procedure here? I think I must have missed something, but I can't figure out what.
Short answer: interrupts should stay enabled as much as possible and be disabled only to protect minimal critical sections. Extending the disabled region beyond your own critical section into non-critical parts of a function you call, on the assumption that the function will disable interrupts at some point anyway, is bad design.
Why doesn't __schedule() disable interrupts as its very first instruction? Because it doesn't need to: the code at the start of __schedule() isn't a critical section, so disabling interrupts before it would be a waste. The author of __schedule() went out of their way to maximize the time during which interrupts can be handled, so why throw that opportunity away by not enabling interrupts?
Also, you have no guarantees about what __schedule() might do in the future. Since the start of __schedule() isn't a critical section, nothing prevents more work from being added before the point where interrupts are disabled. Remember, whoever changes __schedule() shouldn't have to consider that one of its callers left interrupts disabled on the assumption that __schedule() turns them off soon anyway. __schedule() pays no regard to the interrupt state it is called with.
You should disable and enable interrupts around the critical sections of your own code, rather than rely on the inner mechanics of some other function you call and hope it doesn't change.
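To make the principle concrete, here is a minimal userspace sketch in C. It is a toy model, not kernel code: every name in it (irqs_enabled, step, model_schedule, model_preempt_schedule_irq, model_lazy_caller) is hypothetical, and the interrupt flag is just a boolean. It only counts how many steps of work run with interrupts masked under each calling style.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; NOT the kernel's implementation. */
static bool irqs_enabled = true;  /* models the CPU interrupt-enable flag */
static int irqs_on_steps = 0;     /* steps of work run with interrupts enabled */
static int irqs_off_steps = 0;    /* steps of work run with interrupts masked */

static void model_irq_enable(void)  { irqs_enabled = true; }
static void model_irq_disable(void) { irqs_enabled = false; }

/* One unit of work, accounted by the interrupt state it runs under. */
static void step(void)
{
    if (irqs_enabled)
        irqs_on_steps++;
    else
        irqs_off_steps++;
}

/* Callee: masks interrupts only around its own critical section. */
static void model_schedule(void)
{
    step();                  /* non-critical prologue: inherits caller's state */
    model_irq_disable();
    assert(!irqs_enabled);   /* the critical section requires masked interrupts */
    step();                  /* critical section */
    model_irq_enable();      /* assumption of this model: returns with irqs on */
}

/* Caller following the pattern in the question: entered with interrupts
 * off, enables them before the call, restores "off" afterwards. */
static void model_preempt_schedule_irq(void)
{
    model_irq_enable();
    model_schedule();
    model_irq_disable();
}

/* Caller that leaves interrupts off because "the callee disables them
 * soon anyway". The callee's prologue now runs masked for no reason. */
static void model_lazy_caller(void)
{
    model_schedule();
    model_irq_disable();
}
```

Starting from irqs_enabled = false, model_preempt_schedule_irq() runs only the critical step with interrupts masked, while model_lazy_caller() runs the prologue masked as well. If the prologue of model_schedule() grows later, the lazy caller's masked window grows with it, without anyone having changed the caller.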
If you look through the history of the scheduler, you'll see that the code preceding the interrupt disabling has changed over time. Digging through the commits, you can see that the "sti" to enable interrupts has been present since the very first commit to implement preemption in the kernel, going all the way back to 2.5: https://github.com/schwabe/tglx-history/blob/ec332cd30cf1ccde914a87330ff66744414c8d24/arch/i386/kernel/entry.S#L235