 

Why linux kernel use trap gate to handle divide_error exception?

In kernel 2.6.11.5, the divide-by-zero exception handler is set up as:

set_trap_gate(0,&divide_error);

According to "Understanding The Linux Kernel", an Intel trap gate cannot be accessed by a User Mode process. But a User Mode process can certainly generate a divide_error as well. So why does Linux implement it this way?

[Edit] I think the question is still open: set_trap_gate() sets the DPL value of the IDT entry to 0, which means only CPL=0 (i.e. kernel) code can invoke it, so it is unclear to me how this handler can be called from user mode:

#include<stdio.h>

int main(void)
{
    int a = 0;
    int b = 1;

    b = b/a;

    return b;
}

This was compiled with gcc div0.c, and the output of ./a.out is:

Floating point exception (core dumped)

So it doesn't look like this was handled by the divide-by-zero trap code.

user1063294 asked Dec 16 '11 06:12



2 Answers

I have the Linux kernel 3.7.1 sources at hand, so I will answer your question based on those sources. Let's see what the code does. In arch/x86/kernel/traps.c, the function early_trap_init() contains the following line:

set_intr_gate(X86_TRAP_DE, &divide_error);

As we can see, set_trap_gate() has been replaced by set_intr_gate(). Expanding this call in turn gives:

_set_gate(X86_TRAP_DE, GATE_INTERRUPT, &divide_error, 0, 0, __KERNEL_CS);

_set_gate is a routine responsible for two things:

  1. Constructing the IDT descriptor.
  2. Installing the constructed descriptor into the target slot of the IDT descriptor array.

The second step is just a memory copy and is not interesting for us. But if we look at how the descriptor is constructed from the supplied parameters, we see:

    struct desc_struct{
        unsigned int a;
        unsigned int b;
    };
    
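    struct desc_struct gate;
    
    /* low word: segment selector in bits 31..16, handler offset 15..0 below it */
    gate.a = (__KERNEL_CS << 16) | ((unsigned long)&divide_error & 0xffff);
    /* high word: handler offset 31..16, then P (0x80), DPL (0 << 5) and gate type */
    gate.b = ((unsigned long)&divide_error & 0xffff0000) | (((0x80 | GATE_INTERRUPT | (0 << 5)) & 0xff) << 8);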
    

Or, finally, with GATE_INTERRUPT replaced by its value 0xE:

gate.a = (__KERNEL_CS << 16) | ((unsigned long)&divide_error & 0xffff);
gate.b = ((unsigned long)&divide_error & 0xffff0000) | (((0x80 | 0xE | (0 << 5)) & 0xff) << 8);

As we can see, at the end of the descriptor construction we have the following 8-byte data structure in memory:

[0xXXXXYYYY][0xYYYY8E00], where X denotes digits of kernel code segment selector number, and Y denotes digits of address of the divide_error routine.
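To make the packing concrete, here is a minimal sketch in plain user-space C that reproduces the same bit arithmetic. The names pack_gate and struct gate_words are hypothetical helpers written for this illustration, not kernel functions, and the selector and handler address used in the note below are made-up example values.

```c
#include <assert.h>
#include <stdint.h>

/* The two 32-bit words of a 32-bit x86 IDT gate descriptor, low word first. */
struct gate_words {
    uint32_t a;  /* selector in bits 31..16, handler offset 15..0 in bits 15..0 */
    uint32_t b;  /* handler offset 31..16, then P, DPL and TYPE in bits 15..8  */
};

enum { GATE_INTERRUPT = 0xE, GATE_TRAP = 0xF };

/* Hypothetical helper that mirrors the packing expressions shown above. */
static struct gate_words pack_gate(uint16_t selector, uint32_t handler,
                                   unsigned type, unsigned dpl)
{
    struct gate_words g;

    assert(dpl <= 3 && type <= 0xF);  /* DPL is 2 bits wide, gate type 4 bits */

    g.a = ((uint32_t)selector << 16) | (handler & 0xffffu);
    g.b = (handler & 0xffff0000u)
        | (((0x80u | type | (dpl << 5)) & 0xffu) << 8);
    return g;
}
```

With an example selector of 0x0060 and an example handler address of 0xC0101234, pack_gate(0x0060, 0xC0101234, GATE_INTERRUPT, 0) yields the words 0x00601234 and 0xC0108E00, which has the [0xXXXXYYYY][0xYYYY8E00] shape.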

This 8-byte data structure is a processor-defined interrupt descriptor. The processor uses it to determine what actions to take when it accepts an interrupt with a particular vector. Let's now look at the format of the interrupt descriptor defined by Intel for the x86 processor family:

                              80386 INTERRUPT GATE
31                23                15                7                0
+-----------------+-----------------+---+---+---------+-----+-----------+
|           OFFSET 31..16           | P |DPL|  TYPE   |0 0 0|(NOT USED) |4
|-----------------------------------+---+---+---------+-----+-----------|
|             SELECTOR              |           OFFSET 15..0            |0
+-----------------+-----------------+-----------------+-----------------+

In this format, the SELECTOR:OFFSET pair defines the address (in far-pointer form) of the function that takes control when the interrupt is accepted. In our case this is __KERNEL_CS:divide_error, where divide_error() is the actual handler of the Division By Zero exception. The P flag specifies whether the descriptor should be considered a valid descriptor that was correctly set up by the OS; in our case it is set. DPL specifies the privilege rings from which the divide_error() function can be triggered by soft interrupts. Some background is needed to understand the role of this field.
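As a rough illustration of the field layout, the sketch below splits the two descriptor words back into the fields named in the diagram. unpack_gate and struct gate_fields are hypothetical names introduced here; the bit positions are read off the 80386 diagram, with TYPE taken as the 5-bit field in bits 12..8.

```c
#include <stdint.h>

/* Fields of a 32-bit interrupt gate, as named in the diagram. */
struct gate_fields {
    uint16_t selector;  /* code segment selector of the handler     */
    uint32_t offset;    /* 32-bit handler address within that segment */
    unsigned present;   /* P flag, bit 15 of the high word          */
    unsigned dpl;       /* DPL, bits 14..13 of the high word        */
    unsigned type;      /* TYPE, bits 12..8 of the high word        */
};

/* Hypothetical decoder: low is the word at offset 0, high the word at offset 4. */
static struct gate_fields unpack_gate(uint32_t low, uint32_t high)
{
    struct gate_fields f;

    f.selector = (uint16_t)(low >> 16);
    f.offset   = (high & 0xffff0000u) | (low & 0xffffu);
    f.present  = (high >> 15) & 0x1u;
    f.dpl      = (high >> 13) & 0x3u;
    f.type     = (high >> 8) & 0x1fu;
    return f;
}
```

For a descriptor whose high word ends in 8E00, this gives present = 1, dpl = 0 and type = 0xE, i.e. a valid interrupt gate that soft interrupts can reach only from ring 0.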

In general there are three kinds of interrupt sources:

  1. An external device that requests a service from the OS.
  2. The processor itself, when it finds that it has entered an abnormal state and asks the OS to help it get out of that state.
  3. A program executing on the processor under OS control, which requests some special service from the OS.

The last case has special support from the processor in the form of the dedicated instruction int XX. Each time a program wants an OS service, it sets up parameters that describe the request and issues the int instruction with an operand naming the interrupt vector that the OS uses to provide the service. Interrupts generated by the int instruction are called soft interrupts. The processor takes the DPL field into account only when it handles soft interrupts, and completely ignores it for interrupts generated by the processor itself or by external devices. DPL is a very important feature, because it prevents applications from simulating devices and thereby influencing system behavior.

Imagine, for example, that some application does something like this:

for (;;) {
    /* 0xFF stands for the vector used by the system timer to notify
       the kernel that another timer tick has occurred */
    __asm__ volatile("int $0xFF");
}

In that case time on your computer would run much faster than in real life, faster than both you and your system expect. As a result your system would misbehave badly. As you can see, the processor and external devices are considered trusted, but that is not the case for user mode applications.

In our case of the Division By Zero exception, Linux specifies that this exception can be triggered by a soft interrupt only from ring 0, or in other words, only from the kernel. As a result, if the int 0 instruction is executed in kernel space, the processor passes control to the divide_error() routine. If the same instruction is executed in user space, the processor treats it as a protection violation and passes control to the General Protection Fault exception handler (this is the default action for all invalid soft interrupts). But if the Division By Zero exception is generated by the processor itself because it tried to divide some value by zero, control is switched to the divide_error() routine regardless of the space in which the incorrect division occurred.

In general, it looks like it would do little harm to allow an application to trigger the Division By Zero exception by a soft interrupt. But first, it would be an ugly design, and second, there may be logic behind the scenes that relies on the fact that the Division By Zero exception can be generated only by an actual incorrect division operation.
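The rule described above can be written down as a small decision function. This is only a simplified model of the privilege check for illustration (the real processor performs further checks on the target segment), and delivered_vector and the enum names are hypothetical.

```c
/* Where an interrupt or exception comes from, following the three kinds listed above. */
enum event_source {
    SRC_EXTERNAL_DEVICE,   /* hardware interrupt from a device           */
    SRC_CPU_EXCEPTION,     /* raised by the processor, e.g. a failed div */
    SRC_INT_INSTRUCTION    /* soft interrupt: int XX executed by a program */
};

enum { VECTOR_DIVIDE_ERROR = 0, VECTOR_GENERAL_PROTECTION = 13 };

/*
 * Simplified model: returns the vector whose handler actually runs.
 * cpl is the privilege level of the interrupted code (0 = kernel, 3 = user),
 * gate_dpl is the DPL stored in the IDT entry for the requested vector.
 */
static int delivered_vector(enum event_source src, int vector,
                            unsigned cpl, unsigned gate_dpl)
{
    /* DPL is consulted only for soft interrupts. */
    if (src == SRC_INT_INSTRUCTION && cpl > gate_dpl)
        return VECTOR_GENERAL_PROTECTION;

    /* Device interrupts and processor exceptions ignore DPL entirely. */
    return vector;
}
```

In this model, a real division by zero in user mode (SRC_CPU_EXCEPTION, cpl 3, gate DPL 0) is delivered to vector 0, while int 0 executed in user mode with the same gate ends up at vector 13.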

The TYPE field specifies the auxiliary actions that the processor must take when it accepts the interrupt. In practice only two types of exception descriptors are used: the interrupt descriptor and the trap descriptor. They differ in only one respect: the interrupt descriptor makes the processor disable further interrupt acceptance on entry to the handler, and the trap descriptor does not. Honestly, I have no idea why the Linux kernel decided to use the interrupt descriptor for Division By Zero exception handling. The trap descriptor sounds more reasonable to me.

One last note regarding the confusing output of the test program:
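The difference between the two gate types comes down to one bit of EFLAGS, the interrupt-enable flag IF (bit 9). A minimal sketch of that single effect, with hypothetical names and ignoring everything else the processor does on handler entry:

```c
#include <stdint.h>

#define EFLAGS_IF 0x200u  /* interrupt-enable flag, bit 9 of EFLAGS */

enum { GATE_TYPE_INTERRUPT = 0xE, GATE_TYPE_TRAP = 0xF };

/*
 * Models only the IF handling: returns the EFLAGS value the handler starts
 * with, given the EFLAGS of the interrupted code and the gate type.
 */
static uint32_t eflags_on_entry(uint32_t eflags, unsigned gate_type)
{
    if (gate_type == GATE_TYPE_INTERRUPT)
        eflags &= ~EFLAGS_IF;  /* interrupt gate: maskable interrupts off */

    return eflags;             /* trap gate: IF left as it was */
}
```

So a handler reached through an interrupt gate starts with maskable interrupts disabled, while one reached through a trap gate can itself be interrupted right away.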

Floating point exception (core dumped)  

For historical reasons, the Linux kernel responds to the Division By Zero exception by sending the SIGFPE (SIGnal Floating Point Exception) signal to the process that attempted to divide by zero. Yes, not SIGDBZ (SIGnal Division By Zero). I know this is confusing. The reason is that Linux mimics the original UNIX behavior (I think this behavior was frozen in POSIX), and the original UNIX for some reason treated the "Division By Zero" exception as a "Floating Point Exception". I don't know why.
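A process can observe this signal itself instead of being killed. The sketch below is a user-space illustration assuming Linux on x86, where an integer division by zero traps; try_divide and on_sigfpe are hypothetical names. Integer division by zero is undefined behavior in C, and some architectures do not trap at all, so treat this as a demonstration rather than a portable technique.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover_point;
static volatile sig_atomic_t last_si_code;

/* SIGFPE handler: remember why the signal was sent, then skip the faulting division. */
static void on_sigfpe(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    last_si_code = info->si_code;
    siglongjmp(recover_point, 1);
}

/*
 * Returns 0 and stores the quotient on success. If the division raises
 * SIGFPE, returns the si_code delivered with the signal instead.
 */
static int try_divide(int dividend, int divisor, int *quotient)
{
    struct sigaction sa, old;
    volatile int n = dividend, d = divisor;  /* keep the compiler from folding n / d */
    int result = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigfpe;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGFPE, &sa, &old);

    if (sigsetjmp(recover_point, 1) == 0)
        *quotient = n / d;        /* may trap when d == 0 */
    else
        result = last_si_code;    /* reached via siglongjmp from the handler */

    sigaction(SIGFPE, &old, NULL);  /* restore the previous disposition */
    return result;
}
```

On x86 Linux, calling try_divide(1, 0, &q) is expected to return FPE_INTDIV, the si_code that marks an integer divide by zero, even though the signal name talks about floating point.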

ZarathustrA answered Oct 10 '22 19:10


The DPL field in an IDT entry is checked only when a software interrupt is invoked with the int instruction. Division by zero is an exception raised by the CPU itself, not by an int instruction, so DPL has no effect in this case.

Alex Kreimer answered Oct 10 '22 18:10