
Transition from real to protected mode in the Linux kernel

I am currently studying the low-level organization of operating systems. To that end, I am trying to understand how the Linux kernel is loaded.

A thing that I cannot comprehend is the transition from 16-bit (real mode) to 32-bit (protected mode). It happens in this file.

The protected_mode_jump function performs various auxiliary calculations for 32-bit code that is executed later, then sets the PE bit in the CR0 register

    movl    %cr0, %edx
    orb $X86_CR0_PE, %dl    # Protected mode
    movl    %edx, %cr0

and after that performs a long jump to 32-bit code:

    # Transition to 32-bit mode
    .byte   0x66, 0xea      # ljmpl opcode
2:  .long   in_pm32         # offset
    .word   __BOOT_CS       # segment

As far as I understand, in_pm32 is the address of the 32-bit function which is declared right below protected_mode_jump:

    .code32
    .section ".text32","ax"
GLOBAL(in_pm32)
    # some code
    # ...
    # some code
ENDPROC(in_pm32)

The __BOOT_CS segment base is 0 (the GDT is set beforehand here), so that means the offset should essentially be the absolute address of the in_pm32 function.

That's the issue. When the machine code is generated, the assembler/linker cannot know the absolute address of the in_pm32 function, because it does not know where the real mode kernel will be loaded in memory (various bootloaders can occupy various amounts of space, and the real mode kernel is loaded just after a bootloader).

Moreover, the linker script (setup.ld in the same folder) sets the origin of the code to 0, so it seems that the in_pm32 address will be an offset from the beginning of the real mode kernel. That should work just fine with 16-bit code because the CS register is set properly, but when the long jump happens the CPU is already in protected mode, so a relative offset should not work.

So my question: Why does the long jump in protected mode (.byte 0x66, 0xea) land at the proper code position if the offset (.long in_pm32) is relative?

Seems like I am missing something really important.

Alexander asked Jan 21 '17 10:01



1 Answer

It appears that your question really is about how the offset stored at the following line can possibly work since it is relative to the start of the segment, not necessarily the start of memory:

2:  .long   in_pm32         # offset

It is true that in_pm32 is resolved relative to the origin set in the linker script. In particular, the linker script has:

. = 0;
.bstext     : { *(.bstext) }
.bsdata     : { *(.bsdata) }

. = 495;
.header     : { *(.header) }
.entrytext  : { *(.entrytext) }
.inittext   : { *(.inittext) }
.initdata   : { *(.initdata) }
__end_init = .;

.text       : { *(.text) }
.text32     : { *(.text32) } 

The Virtual Memory Address is set to zero (and subsequently 495), so one would think that anything in the .text32 section will have to be fixed in low memory. This would be a correct observation had it not been for these instructions in protected_mode_jump:

    xorl    %ebx, %ebx
    movw    %cs, %bx
    shll    $4, %ebx
    addl    %ebx, 2f

[snip]

    # Transition to 32-bit mode
    .byte   0x66, 0xea      # ljmpl opcode
2:  .long   in_pm32         # offset
    .word   __BOOT_CS       # segment

There is a manually encoded FAR JMP at the end that is used to set the CS selector to a 32-bit code descriptor to finalize the transition to 32-bit protected mode. But the key thing to observe is in these lines specifically:

    xorl    %ebx, %ebx
    movw    %cs, %bx
    shll    $4, %ebx
    addl    %ebx, 2f

This takes the value in CS and shifts it left by 4 bits (multiply by 16) and then adds it to the value stored at label 2f. This is the way you take a real mode segment:offset pair and convert it into a linear address (which is the same as a physical address in this case). Label 2f is in fact the offset in_pm32 in this line:

2:  .long   in_pm32         # offset

When those instructions have executed, the long word value in_pm32 in the FAR JMP will have been adjusted (at run time) by adding the linear address of the current real mode code segment to the value in_pm32. This .long (DWORD) value will be replaced with (CS<<4)+in_pm32.

This code was designed to be relocatable to any real mode segment. The final linear address is computed at run time before the FAR JMP. This is in effect self-modifying code.

Michael Petch answered Sep 27 '22 18:09