I'm new to operating system development and I'm curious about a problem I've run into while developing my own bootloader. My operating system is going to be written in assembly and will run in 16-bit real mode.
I know what the stack is and I have the impression that it grows downwards into the memory. Correct me if I'm mistaken. I know how to load a basic kernel into the memory from a floppy disc and I don't believe that is the problem.
The problem I'm running into is that I'm unsure where to place the stack and load my kernel into memory. I've tried creating my stack like this and I'm running into problems:
mov ax, 0x0000
mov ss, ax
mov sp, 0xFFFF
I'm loading my kernel at 0x1000:0x0000. When I PUSH and later POP the volatile registers in my print function, my kernel just hangs the second time I call print. This is my print function:
print:
push ax
push bx
push cx
mov al, [si]
cmp al, 0
je p_Done
cmp al, 9
je p_Tab
mov ah, 0xE
int 0x10
cmp al, 10
je p_NewLine
p_Return:
inc si
jmp print
p_Tab:
xor cx, cx
p_Tab_Repeat:
cmp cx, 8
je p_Return
mov ah, 0xE
mov al, " "
int 0x10
inc cx
jmp p_Tab_Repeat
p_NewLine:
xor bx, bx
mov ah, 0x3
int 0x10
mov dl, 0x00
mov ah, 0x2
int 0x10
jmp p_Return
p_Done:
pop cx
pop bx
pop ax
ret
These are the lines I want to display:
db "Kernel successfully loaded!", 10, 0
db 9, "Lmao, just a tab test!", 10, 0
This is the output I get when my kernel runs (the _ is the cursor):
Kernel successfully loaded!
_
It successfully prints the first line, but hangs while printing the second. If I remove the PUSH and POP statements it works just fine. Why does my kernel hang when I attempt to save and restore registers in my print function? Where should I place my stack and where should I load my kernel?
The kernel stack is part of the kernel space. Hence, it is not directly accessible from a user process. Whenever a user process uses a syscall, the CPU mode switches to kernel mode. During the syscall, the kernel stack of the running process is used. The size of the kernel stack is configured during compilation and remains fixed.
Problems in the user stack can’t cause a crash in the kernel. This isolation makes the kernel more secure because it only trusts the stack area that is under its control. However, since the stack grows with deeply nested calls, we need to be cautious about the space complexity of algorithms.
The hardware pushes the PC, SP, and PS on to the stack, loads the SP with the address of the kernel mode stack and the PC from the interrupt handler (from the processor's dispatch table).
With separate user and kernel stacks for each process or thread, we have better isolation.
It doesn't help that this isn't a Minimal Complete Verifiable Example, but your question suggests possible things to look for. If code starts working once the PUSHes and POPs in a function's prologue and epilogue are removed, it usually means the stack was becoming unbalanced while the body of the function executed. An unbalanced stack will cause the RET instruction to return to whatever semi-random location is on the top of the stack. This will likely lead to apparent hangs and/or reboots. The behaviour will be undefined.
I haven't followed the logic in your code, but this stands out:
print:
push ax
push bx
push cx
... snip out code for brevity
jmp print
Each time the code reaches jmp print, your print function restarts at a point before all the pushes. This puts more PUSHes onto the stack without corresponding POPs at the end, so the final RET does not find the real return address. I think you might have been trying to get behavior like this:
print:
push ax
push bx
push cx
.prloop:
... snip out code for brevity
jmp .prloop
The .prloop label appears at the top of the function but after the pushes. This prevents excess values being placed on the stack. .prloop can be any valid label of your choice.
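For reference, here is a sketch of how the whole function might look with that change applied. It is not a drop-in fix, just one way to arrange it, assuming NASM syntax, DS:SI pointing at a NUL-terminated string, and video page 0 in a text mode. Saving DX and zeroing BX before the teletype call are my additions (the cursor-position call returns values in DX), and like your original it assumes the BIOS preserves CX across the teletype call.

```nasm
; Sketch: print the NUL-terminated string at DS:SI using BIOS teletype output.
; SI is advanced to the terminator; AX, BX, CX and DX are preserved.
print:
    push ax
    push bx
    push cx
    push dx                 ; added: int 0x10, AH=0x03 returns the cursor in DX
.prloop:                    ; loop target sits AFTER the pushes
    mov al, [si]
    cmp al, 0
    je .done
    cmp al, 9
    je .tab
    xor bx, bx              ; BH = page 0 (assumption)
    mov ah, 0x0E            ; teletype output
    int 0x10
    cmp byte [si], 10       ; re-read the character rather than trusting AL
    je .newline
.next:
    inc si
    jmp .prloop
.tab:
    mov cx, 8               ; a tab is emitted as 8 spaces, as in the original
.tab_repeat:
    mov ax, 0x0E20          ; AH = 0x0E, AL = ' '
    xor bx, bx
    int 0x10
    loop .tab_repeat        ; decrement CX, repeat until it reaches 0
    jmp .next
.newline:
    xor bx, bx              ; BH = page 0
    mov ah, 0x03            ; read cursor position: DH = row, DL = column
    int 0x10
    mov dl, 0               ; move to column 0 on the current row
    mov ah, 0x02            ; set cursor position
    int 0x10
    jmp .next
.done:
    pop dx                  ; pops mirror the pushes in reverse order
    pop cx
    pop bx
    pop ax
    ret
```

Every path through the body now ends at .done with exactly four words pushed, so the four POPs leave the return address on top of the stack for RET.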
The stack can be placed anywhere in memory that isn't being used by the system and doesn't interfere with your bootloader and/or kernel code. As @RossRidge points out, using an SP of 0xFFFF misaligns the stack because it is an odd address (0xFFFF=-1). The x86 won't complain (absent the Alignment Check flag) but it can hurt stack performance on some x86 architectures.
Note: setting SS:SP to 0x1000:0x0000 will cause the stack to run from 0x1000:0xFFFF down to 0x1000:0x0000. The first 16-bit value pushed will be at 0x1000:0xFFFE.
Your kernel and stack are generally safe anywhere between physical address 0x00520 and 0x90000 as long as they don't conflict with one another. On some systems the upper part of the memory region between 0x90000 and 0xA0000 may not be available. If you want to use this memory area I would avoid the area between 0x9C000 and 0xA0000. This area can be used by the BIOS as part of the Extended BIOS Data Area (EBDA).
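As an illustration of one placement inside that range, the sketch below puts the stack just under the boot sector, which the BIOS loads at physical address 0x07C00. The choice of 0x0000:0x7C00 is my assumption rather than a requirement; it is a common convention, it keeps SP even, and it stays clear of a kernel loaded at 0x1000:0x0000 (physical 0x10000).

```nasm
; Sketch: stack grows down from physical 0x07C00, below the boot sector.
    xor ax, ax
    mov ss, ax              ; loading SS holds off interrupts for one instruction
    mov sp, 0x7C00          ; even SP; the first push lands at 0x0000:0x7BFE
```

Setting SP in the instruction immediately after SS matters: it guarantees no interrupt can arrive while SS:SP is half updated.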
The exact amount of usable Low Memory Area (LMA) space can be learned by calling the ROM-BIOS's interrupt 12h service, or directly reading the word at 0x00413. In either case, the result is the amount of KiB of usable memory. If there is less than 640 KiB of actual memory, and/or some of the memory at the LMA's top is used by the EBDA or other software, then the result will be lower than 640 (that is, 0x0280). Technically the result can be higher than 640 too. By multiplying or left-shifting the amount in KiB, the equivalent amount in paragraphs or bytes can be calculated.
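The conversion described above might look like the following sketch. It uses a shift count in CL so that it also assembles for a plain 8086; what you then do with the resulting segment value is up to you.

```nasm
; Sketch: ask the BIOS for usable low memory and convert KiB to paragraphs.
    int 0x12                ; AX = usable low memory in KiB (e.g. 640 = 0x0280)
    mov cl, 6
    shl ax, cl              ; 1 KiB = 64 paragraphs of 16 bytes, so multiply by 64
    ; AX is now the segment just past the top of usable low memory,
    ; e.g. 0xA000 when the BIOS reports 640 KiB.
```

Anything you place in low memory, including the stack, should end below the segment this yields.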
The region between 0x00000 and 0x00520 should not be used as it contains the real mode interrupt vector table, the BIOS Data Area (BDA) and 32 bytes of memory that is considered to be reserved.