
Calling C code from a bootloader

I'm trying to write a bootloader. I would like to compile some C code so the bootloader can load it into memory and jump there.

I have two questions:

  • Is the calling convention the same as on x86? Namely, arguments on stack, left to right.
  • How do I produce raw binary with gcc?

Guido asked Dec 13 '22 04:12


1 Answer

You can create plain binaries with the GNU linker (ld) by using a linker script. The key is the OUTPUT_FORMAT(binary) directive:

//========================================
FILE: linker.ld
//========================================
OUTPUT_FORMAT(binary)
SECTIONS {
    .text : { *(.text) }
    .data : { *(.data) }
    .bss  : { *(.bss)  }
}
//========================================

I invoked the linker in the makefile as follows (where linker.ld is the linker script file):

//========================================
ld -T linker.ld loaderEntry.o loaderMain.o -o EOSLOAD.BIN -L$(lib) -lsys16
//========================================

I have compiled the object code with

//========================================
gcc -nostdinc -nostdlib -ffreestanding -c <code files> -o theObjectCode.o
//========================================

in order to exclude the standard headers and libraries, which do not work in 16-bit mode.
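Since -nostdlib and -nostdinc leave the loader without a C library, even small helpers have to be written by hand. A minimal sketch of what such helpers might look like follows. The names loader_strlen and loader_memcpy are hypothetical and are not taken from the sys16 library mentioned above. The code deliberately uses no headers, so it should compile both hosted and freestanding.

```c
/* Hypothetical helpers; the names are illustrative only. */

/* Length of a NUL-terminated string, without <string.h>. */
unsigned int loader_strlen(const char *s)
{
    unsigned int n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Byte-wise copy, without <string.h>. The regions must not overlap. */
void *loader_memcpy(void *dst, const void *src, unsigned int len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (len--)
        *d++ = *s++;
    return dst;
}
```

In a real loader each C file containing such helpers would still need the .code16gcc setup described further down.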

For the handshake between the MBR loader and the bootloader I used the following gcc assembly code in loaderEntry.S (loaderEntry.o has to be the first file passed to the linker so that it is located at offset 0x0000, as you can see in the linker command above). I used the .code16gcc directive in order to generate 16-bit code. However, the generated code will probably not work on old x86 machines, as it uses registers and instructions (%esp, %ebp, leave, etc.) which are only available on newer processors.

//========================================
FILE: loaderEntry.S
//========================================
  .text
  .code16gcc

  // the entry point at 0x9000:0x0000; the MBR does a far call to this address
  .globl loaderMain   // loader C entry function name declaration
  push  %cs           // initialize data segments with same value as code segment
  pop   %ax           // (I managed only tiny model so far ...)
  movw  %ax, %ds
  movw  %ax, %es
  movw  %ax, %fs
  movw  %ax, %gs
  movw  %ax, %ss      // initialize stack segment with same value as code segment
  movl  $0xffff, %esp // initialize stack pointer with 0xffff (usage of extended (dword) offsets does not work, so we're stuck in tiny model)
  movl  %esp, %ebp
  call  loaderMain    // call C entry function

  cli // halt the machine in case the main function dares to return
  hlt
//========================================

The assembly code calls a symbol which is defined in the C file loaderMain.c. In order to generate code that is compatible with 16-bit mode, you have to declare the use of the 16-bit instruction set before the first line of code in every C file you use. AFAIK this can only be done with an inline assembly statement:

  asm(".code16gcc\n"); // use 16bit real mode code set

  /*  ... some C code .. */


  // ... and here is the C entry code ... //
  void loaderMain(void) {
    uint cmdlen = 0;
    bool terminate = false;
    print(NL);
    print(NL);
    print("*** EOS LOADER has taken over control. ***\r\n\r\n");
    print("Enter commands on the command line below.\r\n");
    print("Commands are executed by pressing the <ENTER> key.\r\n");
    print("The command 'help' shows a list of all EOS LOADER commands.\r\n");
    print("HAVE FUN!\r\n");
    print(NL);
    while (!terminate) {
        print("EOS:>");
        cmdlen = readLine();
        buffer[cmdlen] = '\0';
        print(NL);
        terminate = command();
    }
    shutdown();
  }

Up until now I have only managed to write plain C code. I haven't succeeded with C++ so far, and I have only managed to produce the tiny memory model (meaning CS, SS, DS and ES are all the same). gcc uses only offsets as pointer addresses, so it seems to be hard to get beyond the tiny memory model without additional assembler code. (Although I have heard that some people have managed to solve that problem in gcc.)
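To illustrate why plain offsets tie the code to one segment: in real mode the CPU forms a linear address as segment * 16 + offset, so with all segment registers equal a 16-bit offset can only reach a single 64 KiB window. The following is a hosted C model of that arithmetic, not 16-bit loader code. The function names are hypothetical, and 0x9000 is simply the load segment from the entry point comment above.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode address translation: linear = segment * 16 + offset. */
static uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Offset of a linear address relative to a segment base.
 * Only valid inside that segment's 64 KiB window, which is exactly
 * the limit the tiny memory model imposes. */
static uint16_t offset_in_segment(uint32_t linear, uint16_t segment)
{
    uint32_t base = (uint32_t)segment << 4;
    assert(linear >= base && linear - base <= 0xFFFFu);
    return (uint16_t)(linear - base);
}
```

With the loader at 0x9000:0x0000, linear_address(0x9000, 0x0000) gives 0x90000, and the highest address reachable through a plain 16-bit offset is 0x9FFFF.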

The calling convention is that the last argument is pushed first onto the stack (that is, arguments are pushed right to left), and it seems that all values are dword aligned. An example of assembly code which can be called from .code16gcc C code is posted below:

//======================
  .text
  .code16gcc

  .globl kbdread             // declares a global symbol so that the function can be called from C
  .type  kbdread, @function  // declares the symbol as a function
kbdread:                     // the entry point label, which has to be the same as the symbol

  // this is the conventional stack frame for function entry
  pushl %ebp
  movl  %esp, %ebp

  // memory space for local variables would be allocated by decrementing the stack pointer accordingly
  // the arguments are addressed through the base pointer, which keeps pointing to the same address while inside the function

  pushw %ds  // I'm paranoid, I know...
  pushw %es
  pushw %fs
  pushl %eax
  pushl %ebx
  pushl %ecx
  pushl %edx
  pushl %esi
  pushl %edi

  xorl %eax, %eax  // calls the keyboard interrupt in order to read char code and scan code
  int  $0x16

  xorl %edi, %edi
  movl 8(%ebp), %edi // moves the pointer to the memory location in which the char code will be stored into EDI             
  movb %al, (%edi)   // moves the char code from AL to the memory location to which EDI points

  xorl %edi, %edi // paranoid again (but who knows how well the bios handles extended registers??)..

  movl 12(%ebp), %edi // moves the pointer to the memory location in which the scan code will be stored into EDI
  movb %ah, (%edi)    // moves the scan code from AH to the memory location to which EDI points

  popl %edi // restoring the values from stack..
  popl %esi
  popl %edx
  popl %ecx
  popl %ebx
  popl %eax
  popw %fs
  popw %es
  popw %ds

  leave  // ... and the conventional end of the frame for functions.
  ret    // be aware that you are responsible for restoring the stack pointer if you have allocated local variables on the stack.
         // the leave instruction is a convenient way to do that, but it is not part of the early x86 instruction set (nor are the extended registers),
         // so be careful which instructions you actually use if you have to stay compatible with older computer models.
//=====================

btw the C header declaration of the function looks like:

//=====================
void kbdread(char *pc, unsigned char *psc);
//=====================
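To connect this prototype with the 8(%ebp) and 12(%ebp) offsets used in kbdread, here is a small hosted C model of the stack at function entry. It is only a sketch: ToyStack, push32, load32 and enter_kbdread are hypothetical names, and it assumes (as those offsets imply) that the return address and the saved %ebp each occupy four bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SLOT = 4 };  /* every push in this model is one dword */

typedef struct {
    uint8_t  mem[64];  /* toy stack memory */
    uint32_t esp;      /* index into mem; the stack grows downward */
    uint32_t ebp;
} ToyStack;

static void push32(ToyStack *s, uint32_t value)
{
    assert(s->esp >= SLOT);
    s->esp -= SLOT;
    memcpy(&s->mem[s->esp], &value, SLOT);
}

static uint32_t load32(const ToyStack *s, uint32_t addr)
{
    uint32_t value;
    assert(addr + SLOT <= sizeof s->mem);
    memcpy(&value, &s->mem[addr], SLOT);
    return value;
}

/* Replays what happens up to the end of kbdread's prologue. */
static ToyStack enter_kbdread(uint32_t pc, uint32_t psc)
{
    ToyStack s;
    memset(&s, 0, sizeof s);
    s.esp = sizeof s.mem;
    push32(&s, psc);          /* caller: last argument is pushed first */
    push32(&s, pc);           /* caller: first argument is pushed last */
    push32(&s, 0xCAFEF00Du);  /* call: return address (placeholder value) */
    push32(&s, 0);            /* callee: pushl %ebp */
    s.ebp = s.esp;            /* callee: movl %esp, %ebp */
    return s;
}
```

In this model load32(&s, s.ebp + 8) yields the first argument (pc) and load32(&s, s.ebp + 12) the second (psc), matching the loads in the assembly above.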

Hope this was helpful somehow. Cheers.

phil answered Dec 23 '22 09:12