I'm trying to write a bootloader. I would like to compile some C code so the bootloader can load it into memory and jump there.
I have two questions:
You can create plain binaries with the GNU linker (ld) by using a linker script. The key is the OUTPUT_FORMAT(binary) directive:
//========================================
FILE: linker.ld
//========================================
OUTPUT_FORMAT(binary)
SECTIONS {
.text : { *(.text) }
.data : { *(.data) }
.bss : { *(.bss) }
}
//========================================
I invoked the linker in the makefile as follows (where linker.ld is the linker script file):
//========================================
ld -T linker.ld loaderEntry.o loaderMain.o -o EOSLOAD.BIN -L$(lib) -lsys16
//========================================
I have compiled the object code with
//========================================
gcc -nostdinc -nostdlib -ffreestanding -c <code files> -o theObjectCode.o
//========================================
in order to exclude the standard headers and libraries, which do not work in 16-bit mode.
For the handshake between the MBR loader and the bootloader I used the following loaderEntry.S GNU assembler code (loaderEntry.o has to be the first file passed to the linker so that it is located at offset 0x0000, as you can see above). I used the .code16gcc directive in order to generate 16-bit code. However, the generated code will not work on CPUs older than the 80386, because it uses 32-bit registers (%esp, %ebp) and instructions such as leave that the earliest x86 processors lack.
//========================================
FILE: loaderEntry.S
//========================================
.text
.code16gcc
// the entry point at 0x9000:0x0000; the MBR does a far call to this address
.globl loaderMain // declares the name of the C entry function
push %cs // initialize data segments with same value as code segment
pop %ax // (I managed only tiny model so far ...)
movw %ax, %ds
movw %ax, %es
movw %ax, %fs
movw %ax, %gs
movw %ax, %ss // initialize stack segment with same value as code segment
movl $0xffff, %esp // initialize the stack pointer with 0xffff (using extended (dword) offsets does not work, so we're stuck in tiny model)
movl %esp, %ebp
call loaderMain // call C entry function
cli // halt the machine for the case the main function dares to return
hlt
//========================================
The assembly code calls a symbol which is defined in the C file loaderMain.c. In order to generate code that is compatible with 16-bit mode, you have to declare the use of the 16-bit instruction set before the first line of code in every C file you use. AFAIK this can only be done with an inline assembly statement:
asm(".code16gcc\n"); // use 16bit real mode code set
/* ... some C code .. */
// ... and here is the C entry code ... //
void loaderMain() {
uint cmdlen = 0;
bool terminate = false;
print(NL);
print(NL);
print("*** EOS LOADER has taken over control. ***\r\n\r\n");
print("Enter commands on the command line below.\r\n");
print("Commands are executed by pressing the <ENTER> key.\r\n");
print("The command \'help\' shows a list of all EOS LOADER commands.\r\n");
print("HAVE FUN!\r\n");
print(NL);
while (!terminate) {
print("EOS:>");
cmdlen = readLine();
buffer[cmdlen] = '\0';
print(NL);
terminate = command();
}
shutdown();
}
Up until now I have only managed to write plain C code. I haven't succeeded with C++ so far, and I have only managed to produce the tiny memory model (meaning CS, SS, DS and ES are all the same). gcc uses only offsets as pointer addresses, so it seems hard to get beyond the tiny memory model without additional assembler code. (Although I have heard that some people have solved that problem in gcc.)
The calling convention is that the last argument is pushed onto the stack first, and it seems that all values are dword-aligned. An example of assembly code which can be called from .code16gcc C code is posted below:
//======================
.text
.code16gcc
.globl kbdread // declares a global symbol so that the function can be called from C
.type kbdread, @function // declares the symbol as a function
kbdread: // the entry point label, which has to be the same as the symbol
// this is the conventional stack frame for function entry
pushl %ebp
movl %esp, %ebp
// memory space for local variables would be allocated by decrementing the stack pointer accordingly
// the arguments are addressed via the base pointer, which keeps pointing to the same address for the whole function
pushw %ds // I'm paranoid, I know...
pushw %es
pushw %fs
pushl %eax
pushl %ebx
pushl %ecx
pushl %edx
pushl %esi
pushl %edi
xorl %eax, %eax // calls the keyboard interrupt in order to read char code and scan code
int $0x16
xorl %edi, %edi
movl 8(%ebp), %edi // moves the pointer to the memory location in which the char code will be stored into EDI
movb %al, (%edi) // moves the char code from AL to the memory location to which EDI points
xorl %edi, %edi // paranoid again (but who knows how well the bios handles extended registers??)..
movl 12(%ebp), %edi // moves the pointer to the memory location in which the scan code will be stored into EDI
movb %ah, (%edi) // moves the scan code from AH to the memory location to which EDI points
popl %edi // restoring the values from stack..
popl %esi
popl %edx
popl %ecx
popl %ebx
popl %eax
popw %fs
popw %es
popw %ds
leave // .. and the conventional end frame for functions.
ret // be aware that you are responsible for restoring the stack pointer if you have allocated local variables on the stack.
// the leave instruction is a convenient way to do that, but it is not part of the earliest x86 instruction set (nor are the extended registers),
// so be careful which instructions you actually use if you have to stay compatible with older computer models.
//=====================
btw the C header declaration of the function looks like:
//=====================
void kbdread(char *pc, unsigned char *psc);
//=====================
Hope this was helpful somehow. Cheers.