
What is the correct calling convention to use within a bootloader?

I am trying to write a bootloader and an extremely primitive kernel to learn about bare-metal coding practices and techniques. I am writing the bootloader using NASM. My code works, but I have a question about which calling convention to use.

I am compiling my bootloader simply by running NASM: nasm bootloader.asm -o bootloader.

In my assembly code, I have written functions like BlDisplayString to display strings via BIOS interrupt int 0x10 with AH = 0x13. I am trying to emulate the __fastcall calling convention by passing the parameters in CX, DX, and then on the stack. Is this the correct calling convention to use in 16-bit code? The CPU is still in real mode, not protected mode, when I call these functions.

Arush Agarampur asked Jul 05 '20 05:07



2 Answers

The CPU doesn't care; do whatever is convenient and maintainable. If you're not trying to link to any code generated by a C compiler, the only judge of "correctness" is you.

But yes, register args are usually a good idea, with call-clobbered AX, CX, DX. Letting ES be call-clobbered might be convenient to avoid having functions save/restore it, if you're willing to set it before every rep-string function.
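A minimal sketch of such a register-argument convention in NASM syntax. The function name sum_bytes, the label table, and the exact split between clobbered and preserved registers are assumptions made for illustration, not something taken from the question.

```nasm
bits 16

; Hypothetical convention for this sketch:
;   args:      CX, DX (anything further would go on the stack)
;   return:    AX
;   clobbered: AX, CX, DX (caller saves them if it still needs them)
;   preserved: BX, SI, DI, BP, DS, ES

start:
        mov     cx, 4           ; arg 1: byte count
        mov     dx, table       ; arg 2: offset of buffer (assumes DS and org
                                ; are set up so that table is addressable)
        call    sum_bytes       ; AX = sum of the bytes; CX, DX are now garbage
        jmp     $               ; hang here (sketch only)

; sum_bytes
; INP:  CX = byte count, DS:DX -> buffer
; OUT:  AX = sum of the bytes, modulo 65536
; CHG:  CX, DX, flags
sum_bytes:
        push    si              ; SI is call-preserved in this convention
        cld                     ; make lodsb count upwards
        mov     si, dx
        xor     dx, dx          ; DX = running sum
        xor     ax, ax          ; AH = 0, so AX is the zero-extended byte
        jcxz    .done           ; nothing to do for an empty buffer
.next:
        lodsb                   ; AL = [DS:SI], SI += 1
        add     dx, ax
        loop    .next
.done:
        mov     ax, dx          ; return value goes in AX
        pop     si
        ret

table:  db      1, 2, 3, 4
```

Because AX, CX and DX are call-clobbered, the function only has to save SI; a caller that needs CX across the call saves it itself.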

Passing args in registers that line up with where BIOS int calls want them can maybe save some instructions in wrapper code.
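As a sketch of that idea, here is a wrapper in the spirit of the question's BlDisplayString whose inputs already sit where int 0x10 / AH = 0x13 expects them (CX = length, DH/DL = row/column, ES:BP = string). The name display_string, the labels msg and msg_len, and the fixed page and attribute values are assumptions for illustration.

```nasm
bits 16

start:
        mov     bp, msg         ; ES:BP -> string (assumes ES is already set)
        mov     cx, msg_len     ; CX = number of characters
        mov     dx, 0x0500      ; DH = row 5, DL = column 0
        call    display_string
        jmp     $               ; hang here (sketch only)

; display_string
; INP:  ES:BP -> string, CX = length, DH = row, DL = column
; OUT:  -
; CHG:  AX, BX (a given BIOS may clobber more; save what you care about)
display_string:
        mov     ax, 0x1301      ; AH = 0x13 write string, AL = 1: move cursor
        mov     bx, 0x0007      ; BH = page 0, BL = attribute (grey on black)
        int     0x10
        ret

msg:     db     'Hello from the bootloader'
msg_len  equ    $ - msg
```

Since the arguments arrive in the registers the BIOS call reads, the wrapper needs no register shuffling, only the two constant loads.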

You can even use a custom calling convention on a per-function basis, but that's harder to remember / document. Useful for local helper functions that are only called from one function (but multiple places in that function), or from a couple similar functions in one file. In comments, document which registers for input, output, and clobbered (used as scratch space without save/restore).

Having a couple different calling conventions for different kinds of functions is the middle ground between 1 fixed convention vs. a different one for every function.

Returning boolean conditions in FLAGS is convenient for asm, especially if you expect your caller to branch on it. Or for a function like memcmp, ending with cmp al, dl or whatever lets your caller branch on equality, or on greater / less, whichever FLAGS it wants to read. All of this without the cost of actually generating a + / 0 / - return value like the C function.
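A minimal sketch of a memcmp-style helper that returns its result only in FLAGS. The name mem_compare and the labels in the caller are hypothetical; the comparison is unsigned, byte by byte.

```nasm
bits 16

; Caller: branch directly on the flags the helper left behind.
; Assumes DS and ES are already set up so both buffers are addressable.
start:
        mov     si, buf_a       ; DS:SI -> first buffer
        mov     di, buf_b       ; ES:DI -> second buffer
        mov     cx, 4           ; length in bytes
        call    mem_compare
        je      .same           ; ZF = 1: buffers are equal
        jb      .a_below        ; CF = 1: first differing byte of A < B
        ; fall through: A > B
.a_below:
.same:
        jmp     $               ; hang here (sketch only)

; mem_compare
; INP:  DS:SI -> buffer A, ES:DI -> buffer B, CX = length
; OUT:  FLAGS as from cmp (byte of A), (byte of B) at the first difference;
;       ZF = 1 if all CX bytes match (or CX = 0)
; CHG:  SI, DI, CX
mem_compare:
        cld                     ; compare upwards
        cmp     cx, cx          ; preset ZF = 1 in case CX = 0
        repe cmpsb              ; stop at first mismatch or when CX hits 0
        ret

buf_a:  db      1, 2, 3, 4
buf_b:  db      1, 2, 9, 4
```

No integer return value is ever materialized; the caller picks je, jb or ja depending on which condition it cares about.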

An answer on the CodeGolf.SE question "Tips for golfing in x86/x64 machine code" goes into more detail about what you might do if you're going all out for small code without caring at all about maintainability or consistency between functions.

If you want to fit more code into a 512-byte first-stage bootloader, or into fewer extra sectors, you can often save some bytes without hurting readability. Fewer instructions is generally easier to read. (That's not always the same thing as smaller machine-code size, though.)
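A few illustrative size comparisons in 16-bit code, as a sketch rather than an exhaustive list. Each pair has the same effect on the destination registers, although the shorter forms can differ in side effects (xor writes FLAGS; lodsb depends on DF). The byte counts are for the usual encodings NASM emits with bits 16.

```nasm
bits 16

        ; Load AH and AL together instead of separately.
        mov     ah, 0x13        ; 2 bytes
        mov     al, 0x01        ; 2 bytes
        mov     ax, 0x1301      ; 3 bytes, one instruction, same AX

        ; Zero a register.
        mov     ax, 0           ; 3 bytes
        xor     ax, ax          ; 2 bytes (also modifies FLAGS)

        ; Fetch a byte and advance the pointer.
        mov     al, [si]        ; 2 bytes
        inc     si              ; 1 byte
        lodsb                   ; 1 byte (assumes DF = 0)

        ; Copy CS into DS.
        mov     ax, cs          ; 2 bytes
        mov     ds, ax          ; 2 bytes (and clobbers AX)
        push    cs              ; 1 byte
        pop     ds              ; 1 byte (leaves AX alone)
```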

Peter Cordes answered Oct 05 '22 11:10



In my boot loaders the calling conventions are all made up of individual protocols, each specifically tailored to the function in question. This is needed to save as many bytes as possible, and to cram in a lot of features. Each function has a protocol comment which specifies inputs, outputs, and changed registers.

I will look at the FAT32 loader primarily, because it actually has several functions in the conventional sense. These are read_sector, clust_to_first_sector, clust_next, and check_clust. I will quote two of these comments next.

Here's read_sector:

                ; Read a sector using Int13.02 or Int13.42
                ;
                ; INP:  dx:ax = sector number within partition
                ;       bx => buffer
                ;       (_LBA) ds = ss
                ; OUT:  If unable to read,
                ;        ! jumps to error instead of returning
                ;       If sector has been read,
                ;        dx:ax = next sector number (has been incremented)
                ;        bx => next buffer (bx = es+word[para_per_sector])
                ;        es = input bx
                ; CHG:  -
read_sector:

And this is clust_to_first_sector:

                ; INP:  dx:ax = cluster - 2 (0-based cluster)
                ; OUT:  cx:bx = input dx:ax
                ;       dx:ax = first sector of that cluster
                ; CHG:  -
clust_to_first_sector:

There's another protocol comment for the FSIBOOT entrypoint, but this is not exactly a function. There's also the simpler FAT12/FAT16 loader, but this one has only one actual function, read_sector, very much like the one I quoted, and one part where a function call was completely inlined into a loop.

ecm answered Oct 05 '22 13:10
