I am currently studying the low-level organization of operating systems. To that end, I am trying to understand how the Linux kernel is loaded.
A thing that I cannot comprehend is the transition from 16-bit (real mode) to 32-bit (protected mode). It happens in this file.
The protected_mode_jump function performs various auxiliary calculations for the 32-bit code that is executed later, then sets the PE bit in the CR0 register:
movl %cr0, %edx
orb $X86_CR0_PE, %dl # Protected mode
movl %edx, %cr0
and after that performs a long jump to the 32-bit code:
# Transition to 32-bit mode
.byte 0x66, 0xea # ljmpl opcode
2: .long in_pm32 # offset
.word __BOOT_CS # segment
As far as I understand, in_pm32 is the address of the 32-bit function that is declared right below protected_mode_jump:
.code32
.section ".text32","ax"
GLOBAL(in_pm32)
# some code
# ...
# some code
ENDPROC(in_pm32)
The base of the __BOOT_CS segment is 0 (the GDT is set up beforehand here), so the offset should basically be the absolute address of the in_pm32 function.
That's the issue. During machine code generation, the assembler/linker cannot know the absolute address of the in_pm32 function, because it does not know where the code will be loaded in memory in real mode (different bootloaders can occupy different amounts of space, and the real mode kernel is loaded just after the bootloader).
Moreover, the linker script (setup.ld in the same folder) sets the origin of the code to 0, so it seems that the in_pm32 address will be an offset from the beginning of the real mode kernel. That should work just fine with 16-bit code because the CS register is set properly, but when the long jump happens the CPU is already in protected mode, so a relative offset should not work.
So my question:
Why does the long jump in protected mode (.byte 0x66, 0xea) set the proper code position if the offset (.long in_pm32) is relative?
Seems like I am missing something really important.
GRUB boots us into protected mode, i.e. 32-bit mode. Much like the xv6 bootloader, which starts in 16-bit real mode, GRUB is loaded by the BIOS and switches into 32-bit protected mode for us.
Kernel space is protected so that user applications cannot access it directly, while code running in kernel mode can access user space directly.
The kernel image is split into two pieces: the real-mode kernel code, which is small and is loaded within the first 640 kB of memory, and the rest of the kernel, which runs in protected mode and is loaded above the first megabyte of memory.
It appears that your question really is about how the offset stored at the following line can possibly work since it is relative to the start of the segment, not necessarily the start of memory:
2: .long in_pm32 # offset
It is true that in_pm32 is relative to the origin the linker script uses. In particular, the linker script has:
. = 0;
.bstext : { *(.bstext) }
.bsdata : { *(.bsdata) }
. = 495;
.header : { *(.header) }
.entrytext : { *(.entrytext) }
.inittext : { *(.inittext) }
.initdata : { *(.initdata) }
__end_init = .;
.text : { *(.text) }
.text32 : { *(.text32) }
The Virtual Memory Address is set to zero (and subsequently 495), so one would think that anything in the .text32 section will have to be fixed in low memory. This would be a correct observation had it not been for these instructions in protected_mode_jump:
xorl %ebx, %ebx
movw %cs, %bx
shll $4, %ebx
addl %ebx, 2f
[snip]
# Transition to 32-bit mode
.byte 0x66, 0xea # ljmpl opcode
2: .long in_pm32 # offset
.word __BOOT_CS # segment
There is a manually encoded FAR JMP at the end that sets the CS selector to a 32-bit code descriptor to finalize the transition to 32-bit protected mode. But the key thing to observe is in these lines specifically:
xorl %ebx, %ebx
movw %cs, %bx
shll $4, %ebx
addl %ebx, 2f
This takes the value in CS, shifts it left by 4 bits (multiplying it by 16), and then adds it to the value stored at label 2f. This is the way you take a real mode segment:offset pair and convert it into a linear address (which is the same as a physical address in this case). Label 2f is in fact the offset in_pm32 in this line:
2: .long in_pm32 # offset
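As a rough illustration of that segment:offset conversion, here is a small Python sketch that mirrors the four instructions step by step. The segment value 0x9000 and the link-time offset 0x1234 are made-up numbers chosen for the example, not values taken from the kernel, and real_mode_linear is a hypothetical helper name.

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Linear address of a real-mode segment:offset pair."""
    ebx = 0                                # xorl %ebx, %ebx
    ebx = segment & 0xFFFF                 # movw %cs, %bx
    ebx = (ebx << 4) & 0xFFFFFFFF          # shll $4, %ebx
    return (ebx + offset) & 0xFFFFFFFF     # addl %ebx, 2f


# Hypothetical values: the setup code happens to run with CS = 0x9000,
# and the linker placed in_pm32 at offset 0x1234 from origin 0.
cs = 0x9000
in_pm32 = 0x1234
target = real_mode_linear(cs, in_pm32)
print(hex(target))  # 0x91234
```

With a different CS the same code yields a different linear address, which is why the value cannot be fixed at link time.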
When those instructions have executed, the long word value in_pm32 in the FAR JMP will have been adjusted (at run time) by adding the linear address of the current real mode code segment to it. This .long (DWORD) value will be replaced with (CS<<4)+in_pm32.
This code was designed to be relocatable to any real mode segment. The final linear address is computed at run time before the FAR JMP. This is in effect self-modifying code.
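To make the self-modifying step concrete, the following Python sketch models the encoded FAR JMP as a byte buffer and patches its 32-bit offset field in place, the way addl %ebx, 2f does. It is only a simulation under stated assumptions: the link-time offset, the real mode segment, and the selector value 0x10 used for __BOOT_CS are placeholders for illustration, and patch_far_jump and jump_target are hypothetical helper names.

```python
import struct

# Hypothetical values for illustration only.
IN_PM32_LINK_OFFSET = 0x1234   # offset assigned by the linker (origin 0)
BOOT_CS_SELECTOR = 0x10        # placeholder standing in for __BOOT_CS
REAL_MODE_CS = 0x9000          # segment the image happened to be loaded at

# Encoded far jump: 0x66 prefix, 0xEA opcode, 32-bit offset, 16-bit selector.
ljmp = bytearray(
    struct.pack("<BBIH", 0x66, 0xEA, IN_PM32_LINK_OFFSET, BOOT_CS_SELECTOR)
)
OFFSET_FIELD = 2  # byte position of label "2:" inside the instruction


def patch_far_jump(code: bytearray, real_mode_cs: int) -> None:
    """Add (CS << 4) to the offset field in place, like `addl %ebx, 2f`."""
    (offset,) = struct.unpack_from("<I", code, OFFSET_FIELD)
    offset = (offset + (real_mode_cs << 4)) & 0xFFFFFFFF
    struct.pack_into("<I", code, OFFSET_FIELD, offset)


def jump_target(code: bytearray, segment_base: int = 0) -> int:
    """Linear address reached through a code segment with the given base."""
    (offset,) = struct.unpack_from("<I", code, OFFSET_FIELD)
    return (segment_base + offset) & 0xFFFFFFFF


unpatched = jump_target(ljmp)       # only correct if loaded at address 0
patch_far_jump(ljmp, REAL_MODE_CS)
patched = jump_target(ljmp)         # now includes the load address
print(hex(unpatched), hex(patched))  # 0x1234 0x91234
```

Because the target segment has base 0, the patched offset has to be the full linear address, and only the run-time value of CS makes that possible.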