I'm reading the Linux source code (2.6.11).
In the exception handler, at error_code in entry.S:
movl $(__USER_DS), %ecx
movl %ecx, %ds
movl %ecx, %es
I don't understand why the user data segment is loaded here. This is the entry to exception handler code that runs in kernel mode, so I would expect the selector to be __KERNEL_DS.
I checked other versions of the code, and they do the same thing at this point.
If the exception handler is entered with ds and es already set to the data segment, it makes no difference except for maybe a microsecond of delay. Exception handlers don't usually need to be fast.
But what might cause entry into the exception handler? Could it have been because a bad value was loaded into a segment register and then referenced? In such cases it is important for the code to establish a safe environment. cs is set by the exception invocation. To be bulletproof, ss and esp should be set up too.
Followup:
Looking at the 2.6.22.18 kernel for i386, I don't see exactly that:
error_code: /* the function address is in %fs's slot on the stack */
pushl %es
... pushes %ds, %eax, %ebp, %edi, %esi, %edx, %ecx, %ebx, %fs
... along with pseudo-ops to manage stack frame layout
movl $(__KERNEL_PERCPU), %ecx
movl %ecx, %fs
popl %ecx // retrieves saved %fs
... sets up registers for the exception function
The symbol __KERNEL_PERCPU is a macro defined (in include/asm-i386/segment.h) as 0 for non-SMP machines and (GDT_ENTRY_PERCPU * 8) for SMPs. The 8 is for the GDT entry size (I think) and GDT_ENTRY_PERCPU relates to the entries in the per-CPU GDT. Its value is <base> + 15, which the comments indicate is "default user DS", so it is, in fact, the same thing.
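The "* 8" arithmetic can be illustrated with a small C sketch. An x86 selector keeps the descriptor index in bits 3 to 15, the table indicator in bit 2, and the requested privilege level in bits 0 to 1, so multiplying the index by 8 shifts it past the low three bits (and also equals the byte offset of an 8-byte descriptor in the GDT). The helper names below are hypothetical, not kernel identifiers, and the index 15 used in the usage note is an assumption taken from the "default user DS" comment mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural layout of an x86 segment selector. */
#define SEL_RPL_MASK 0x3u  /* bits 0-1: requested privilege level */
#define SEL_TI_MASK  0x4u  /* bit 2: 0 = GDT, 1 = LDT */

/* Hypothetical helper: build a GDT selector from an index and an RPL. */
static inline uint16_t make_gdt_selector(unsigned index, unsigned rpl)
{
    assert(rpl <= 3 && index < 8192);       /* 13-bit index, 2-bit RPL */
    return (uint16_t)((index << 3) | rpl);  /* index * 8, plus RPL, TI = 0 */
}

/* Hypothetical helpers: split a selector back into its fields. */
static inline unsigned selector_index(uint16_t sel)
{
    return sel >> 3;
}

static inline unsigned selector_rpl(uint16_t sel)
{
    return sel & SEL_RPL_MASK;
}

static inline int selector_is_ldt(uint16_t sel)
{
    return (sel & SEL_TI_MASK) != 0;
}
```

Under that assumption, make_gdt_selector(15, 3) yields 0x7B, a ring-3 selector for GDT entry 15, and selector_index(0x7B) gives back 15.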
The kernel data segment is accessed through fs and ss. Much kernel data access is on the stack. By keeping the user mode descriptors accessed through ds, very little loading of segment registers is needed.
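As a side note on why ring-0 code is able to keep a user-mode data selector in ds at all: the architectural check for loading ds, es, fs, or gs with a data segment is that the descriptor's DPL must be at least max(CPL, RPL). The sketch below models only that one rule in C; the function name is hypothetical, and it ignores the other checks the CPU performs (present bit, segment type, GDT limit).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the privilege check for loading a data segment
 * into DS/ES/FS/GS: allowed when DPL >= max(CPL, RPL).
 * Privilege levels are 0 (most privileged) to 3 (least privileged).
 */
static inline bool data_segment_load_allowed(unsigned cpl, unsigned rpl,
                                             unsigned dpl)
{
    assert(cpl <= 3 && rpl <= 3 && dpl <= 3);
    unsigned effective = (cpl > rpl) ? cpl : rpl;  /* weaker of CPL and RPL */
    return dpl >= effective;
}
```

With this rule, kernel code (CPL 0) loading a selector with RPL 3 that points at a DPL 3 descriptor passes the check, while user code (CPL 3) loading a DPL 0 descriptor does not.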
In entry.S:
#define RESTORE_ALL \
RESTORE_REGS \
addl $4, %esp; \
1: iret; \
.section .fixup,"ax"; \
2: sti; \
movl $(__USER_DS), %edx; \
movl %edx, %ds; \
movl %edx, %es; \
movl $11,%eax; \
call do_exit; \
.previous; \
.section __ex_table,"a"; \
.align 4; \
.long 1b,2b; \
.previous
This macro is used at the end of exception, interrupt, and syscall handling. The fixup code sets ds and es to __USER_DS, which suggests that iret itself raises an exception if the DPL of ds and es is not 3 (user privilege).
So Linux sets ds and es to __USER_DS at the very beginning of exception, interrupt, and syscall handling to avoid this exception.
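The ".long 1b,2b" pair in the macro records one exception-table entry: the address of the instruction that may fault (label 1, the iret) and the address of the fixup code to jump to (label 2). A rough C sketch of how such a table can be consulted follows. The struct and function names are hypothetical, the addresses in the usage note are made up, and a linear scan is used for brevity; this is not the kernel's actual lookup code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of one __ex_table entry: ".long 1b,2b". */
struct ex_entry {
    unsigned long insn;   /* address of the instruction that may fault */
    unsigned long fixup;  /* address to resume at if it does fault */
};

/*
 * Return the fixup address registered for fault_ip, or 0 if the faulting
 * instruction has no entry (meaning the fault is not an expected one).
 */
static inline unsigned long find_fixup(const struct ex_entry *table,
                                       size_t n, unsigned long fault_ip)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (table[i].insn == fault_ip)
            return table[i].fixup;
    }
    return 0;
}
```

For example, with a table containing {0x1000, 0x2000} and {0x1010, 0x2040}, a fault at 0x1010 resolves to the fixup at 0x2040, and a fault at an unlisted address returns 0.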