I'm currently reading the boot.s file in the source for the first ever Linux kernel (assuming that 0.01 is indeed the first public release).
I know C and ASM, the latter considerably less than the former. Even so, I seem to be able to understand and essentially grasp the code in the source files.
This file has me confused though. I now realise that's because it's in real mode, not protected mode. Needless to say, I've never seen ASM code written in real mode before. Protected mode was the de facto mode x86 OSes ran in before I was even born, so that's to be expected.
Here's a routine I want to comprehend better:
/*
 * This procedure turns off the floppy drive motor, so
 * that we enter the kernel in a known state, and
 * don't have to worry about it later.
 */
kill_motor:
	push dx
	mov dx,#0x3f2
	mov al,#0
	outb
	pop dx
	ret
Looking up outb, I find it's used to pass bytes to ports on the computer. I'll hazard a guess based on C documentation that this scenario passes the 'stop motor' byte as the first argument, and the floppy drive port number as the second.
Is this interface provided by the BIOS? Or by the floppy drive directly? I'm assuming the BIOS has frugal 'drivers' for very basic operation of all the fundamental devices.
Here's where I'm stumped: it seems that numbers like #0x3f2 are being pulled out of thin air. They're clearly hardware port numbers or something. This file is sprinkled with such numbers, with no explanation of what they refer to. Where can I find a comprehensive reference that shows all the hardware ports and the control values they accept in real mode? Also, it seems the file moves the kernel around in memory throughout the boot process, using hard-coded memory addresses. Where can I find a guide to which memory address ranges are safe to overwrite in real mode?
I also read a comment by Linus about reprogramming interrupts to avoid a collision between the BIOS and internal hardware interrupts. I'm not going to lie, that went right over my head.
Help would be great; Google seems sparse on the topic, in case you're wondering.
Real mode is characterized by a 20-bit segmented memory address space (giving 1 MB of addressable memory) and unlimited direct software access to all addressable memory, I/O addresses and peripheral hardware. Real mode provides no support for memory protection, multitasking, or code privilege levels.
Real mode is program operation in which an instruction can address any space within the 1 megabyte of RAM. Typically, a program running in real mode is one that needs to get to and use or update system data and can be trusted to know how to do this.
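The 20-bit address comes from combining two 16-bit values: the segment is shifted left by four bits and the offset is added to it. A minimal C sketch of that arithmetic follows; the function name is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: translate a real-mode segment:offset pair into
 * the physical address the CPU places on the bus. */
static uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    /* A segment counts 16-byte "paragraphs", hence the shift by 4. */
    return ((uint32_t)segment << 4) + offset;
}
```

For example, real_mode_address(0x07C0, 0x0000) and real_mode_address(0x0000, 0x7C00) both give 0x7C00, the address at which the BIOS loads a boot sector, so many different segment:offset pairs name the same byte.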
Entering protected mode: create a valid GDT (Global Descriptor Table), then create a 6-byte pseudo-descriptor to point to the GDT. If paging is going to be used, load CR3 with a valid page table, PDBR, or PML4. If PAE (Physical Address Extension) is going to be used, set CR4.
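To make the GDT and its 6-byte pseudo-descriptor a little more concrete, here is a rough C sketch of how the two are laid out. The names gdt_entry and gdt_pseudo_descriptor are invented for this example, and the packed attribute assumes GCC or Clang; real boot code normally writes these tables directly in assembly.

```c
#include <stdint.h>

/* Pack one 8-byte segment descriptor. The base and limit are scattered
 * across the descriptor for historical (80286) reasons. */
static uint64_t gdt_entry(uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);              /* limit bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;       /* base bits 0-23   */
    d |= (uint64_t)access << 40;                  /* access byte      */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;   /* limit bits 16-19 */
    d |= (uint64_t)(flags & 0xF) << 52;           /* granularity etc. */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;   /* base bits 24-31  */
    return d;
}

/* The 6-byte operand that the lgdt instruction reads. */
struct gdt_pseudo_descriptor {
    uint16_t limit;  /* size of the table in bytes, minus one */
    uint32_t base;   /* linear address of the first descriptor */
} __attribute__((packed));
```

With base 0, limit 0xFFFFF, access byte 0x9A and flags 0xC, gdt_entry produces 0x00CF9A000000FFFF, the flat 4 GB code segment seen in many tutorials.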
Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs.
These addresses were cast in stone 30 years ago when IBM released the first IBM PC. 0x3f0 is the first address for the primary floppy disk controller registers. A list of addresses is available here.
One uncharacteristic move by the IBM design team was that they put the machine together from standard off-the-shelf parts. Most chips came from Intel; the floppy disk controller was an NEC design. This unintentionally ensured that anybody could build a clone. Those clones used the same addresses to ensure software compatibility, turning the IBM choice into an industry standard that could be hard-coded.
Firstly, welcome to the world of realmode assembler! You've probably already realised that the actual assembler is much the same between realmode and protected mode - the primary differences being operand sizes and memory layout/management.
There are some resources for realmode out there on the internet - you've just got to hunt them down! One very important resource is Ralf Brown's Interrupt List (known as RBIL) - it provides a lot of information about the various interrupts used in realmode programming. Another is BiosCentral's CMOS memory map which describes what information the BIOS stores (or should store) in various memory locations.
To answer some of your questions on the Linux code you posted:
outb is the instruction to write the byte in al to port dx - 0x3f2 is the floppy controller port. Wikipedia can help you out with the basic list of x86 port numbers, but you'll have to dig around for detailed information about the actual format of the al bits.
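As an illustration of what the al bits mean for this particular port, here is a small C sketch that decodes a value written to 0x3f2. It assumes the digital output register layout documented for the standard PC floppy controller; the macro and function names are made up for this example.

```c
#include <stdint.h>

#define FDC_DOR_PORT   0x3f2u  /* digital output register, primary controller */

#define DOR_DRIVE_MASK 0x03u   /* bits 0-1: currently selected drive */
#define DOR_NOT_RESET  0x04u   /* bit 2: 0 holds the controller in reset */
#define DOR_DMA_IRQ    0x08u   /* bit 3: enable DMA and interrupt requests */

/* Nonzero if the motor of `drive` (0-3) is switched on in `dor`.
 * Bits 4-7 are the motor enables for drives 0-3. */
static int dor_motor_on(uint8_t dor, unsigned drive)
{
    return (dor >> (4 + drive)) & 1u;
}

/* Nonzero if any of the four motor bits is set. */
static int dor_any_motor_on(uint8_t dor)
{
    return (dor & 0xF0u) != 0;
}
```

Under that layout the 0 that kill_motor writes clears all four motor bits, which is why the drive stops; as a side effect it also clears the reset and DMA bits. A value such as 0x1C, by contrast, would select drive 0 with its motor running.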
what memory address ranges are available to write over during realmode?
You should do some research into INT 15h, AX=E820h - it returns a memory map describing which memory areas can be used and which are reserved. Take note though: when looking at interrupts, it's important to see how "new" they are because older BIOSes may not support them.
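To give a feel for what that call returns, here is a hedged C sketch of one map entry and a loop that totals the usable ranges. The field layout follows the commonly documented 20-byte entry (64-bit base, 64-bit length, 32-bit type); the sample map is invented for illustration and does not come from any real machine.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of the map. The BIOS writes these fields back to back as
 * 20 bytes; this struct is for illustration and may carry padding, so
 * it should not be overlaid on the raw buffer as-is. */
struct e820_entry {
    uint64_t base;    /* start of the range */
    uint64_t length;  /* size of the range in bytes */
    uint32_t type;    /* 1 = usable RAM, 2 = reserved */
};

enum { E820_USABLE = 1, E820_RESERVED = 2 };

/* Add up the bytes in every range marked usable. */
static uint64_t usable_bytes(const struct e820_entry *map, size_t count)
{
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (map[i].type == E820_USABLE)
            total += map[i].length;
    }
    return total;
}

/* Made-up sample: low memory, two reserved holes, 15 MB above 1 MB. */
static const struct e820_entry sample_map[] = {
    { 0x00000000, 0x0009FC00, E820_USABLE   },
    { 0x0009FC00, 0x00000400, E820_RESERVED },
    { 0x000F0000, 0x00010000, E820_RESERVED },
    { 0x00100000, 0x00F00000, E820_USABLE   },
};
```

A boot loader would walk a map like this to decide where it can safely copy things. Code as old as Linux 0.01 predates this interface, which is one reason it relies on fixed addresses instead.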
...reprogramming interrupts to avoid a collision between the BIOS and internal hardware interrupts
Many hardware devices have programmable interrupts (which are used to service the hardware when it needs attention). Typically the BIOS will sort out an initial assignment during its startup routines, but it's not unusual for OSes to rejig the hardware interrupts for their own purposes or to prevent known incompatibilities.
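The collision in question can be sketched in a few lines of C. On a standard PC the BIOS leaves the two 8259 interrupt controllers delivering IRQ 0-7 on vectors 0x08-0x0F and IRQ 8-15 on 0x70-0x77, while Intel reserves vectors 0x00-0x1F for CPU exceptions. The helper names below are invented for this example, and 0x20/0x28 is simply a common choice of new base vectors.

```c
#include <stdint.h>

/* Vectors 0x00-0x1F are reserved by Intel for CPU exceptions. */
#define CPU_EXCEPTION_LAST 0x1Fu

/* Vector on which a hardware IRQ (0-15) arrives, given the base vector
 * programmed into the master and slave interrupt controllers. */
static uint8_t irq_to_vector(unsigned irq, uint8_t master_base,
                             uint8_t slave_base)
{
    return irq < 8 ? (uint8_t)(master_base + irq)
                   : (uint8_t)(slave_base + (irq - 8));
}

/* Nonzero if a vector falls in the range reserved for CPU exceptions. */
static int collides_with_cpu_exception(uint8_t vector)
{
    return vector <= CPU_EXCEPTION_LAST;
}
```

With the BIOS defaults, the timer (IRQ 0) arrives on vector 0x08, which in protected mode is also the double fault exception, so a handler could not tell the two apart. After remapping to bases 0x20 and 0x28, all sixteen IRQs land on 0x20-0x2F, clear of the reserved range.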
One last note: "it seems that numbers like #0x3f2 are being pulled out of thin air". The answer is yes. A lot of the Linux boot source is horrific (yes, it's just my opinion) and does appear to sprinkle seemingly random addresses, port numbers and other bits without any meaningful explanation. Stick with it, look up other realmode resources and eventually it will make sense. Oh, and if you come across a comprehensive reference - tell EVERYONE (because it doesn't currently exist).