
How can I JMP to relocated code in my MBR?

I'm trying to write an extremely simple MBR to start learning how to write an MBR/kernel. This is what I have so far (created from pieces of other MBRs). The binary I get from assembling with nasm and then linking with ld is a bit different from the one I get using nasm alone, but that doesn't appear to be the problem.

I first started with jmp 0:continue, but that appears to jump to 0000:7c22 (or 001d with nasm alone; I believe I didn't specify that the code starts at 7c00). I'm looking to jump to :7a22 or :7a1d, the address of the relocated code. I tried using just jmp continue and then, as seen uncommented below, adding the stack pointer to the continue pointer, pushing it, and using ret. All I get is a blinking cursor when the image is dd'ed to my first sector. Any help is appreciated.

                            ; nasm+ld       nasm            comment
global _start
_start:
    xor    cx, cx           ; 6631c9        31c9            Set segment registers to zero
    mov    es, cx           ; 8ec1          8ec1
    mov    ds, cx           ; 8ed9          8ed9
    mov    ss, cx           ; 8ed1          8ed1
    mov    sp, 0x7A00       ; 66bc007a      bc007a          Stack
    mov    di, sp           ; 6689e7        89e7            Bottom of relocation point
    mov    esi, _start      ; be007c0000    66be00000000
    cld                     ; fc            fc
    mov    ch, 1            ; b501          b501            cx = 256
    rep movsw               ; f366a5        f3a5            Copy self to 0:7A00

;----------------------------------------------------------------------------------------------------------------------
    xor    eax,eax
    mov    ax, sp
    add    ax, continue

    ;jmp    0:continue      ; ea227c00000000    ea1d000000      near JMP to copy of self
                            ; or
    ;jmp    continue        ; (eb00)
    push eax
    ret
;----------------------------------------------------------------------------------------------------------------------

continue:
    sti                     ; fb            fb

ERROR:
    mov esi, errormsg       ; be3b7c0000 (be36) 66be36000000        Error Message loc
    mov ah, 0x0E            ; b40e          b40e
    mov bx, 7               ; 66bb          bb0700
disp:
    lodsb                   ; ac            ac
    cmp ah, 0x00            ; 80fc00        80fc00
    je end                  ; 7404          7404
    int 10h                 ; cd10          cd10
    jmp disp                ; ebf6          ebf6

end:
    nop                     ; 90            90
    jmp end                 ; ebfd          ebfd            infinite loop

errormsg db 10,'YOU MESSED UP.',13,0

times (0x1b8 - ($-$$)) nop  ; 90            90          Padding

UID db 0xf5,0xbf,0x0f,0x18                                         ;Unique Disk ID

BLANK times 2 db 0

PT1 db 0x80,0x20,0x21,0x00,0x0C,0x50,0x7F,0x01,0x00,0x08,0x00,0x00,0xb0,0x43,0xF9,0x0D ;First Partition Entry
PT2 times 16 db 0                                      ;Second Partition Entry
PT3 times 16 db 0                                      ;Third Partition Entry
PT4 times 16 db 0                                      ;Fourth Partition Entry

BOOTSIG dw 0xAA55                                      ;Boot Signature
Wyllow Wulf asked Dec 10 '22 16:12


2 Answers

As you have discovered, you can set the origin point to ORG 0x7A00 for your entire bootloader. That works perfectly well. The code that copies the boot sector to 0x7A00 doesn't rely on any absolute labels, just relative ones. This answer is more of a thought experiment and a different way of approaching it.
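To make the absolute-versus-relative distinction concrete, here is a small Python sketch (not part of the bootloader). It assumes the offsets visible in the ORG 0x7A00 listing and hexdump elsewhere on this page: continue at offset 0x1d, and the short jmp disp located at offset 0x30 targeting offset 0x29. The helper names are made up for illustration.

```python
def absolute_address(origin: int, offset: int) -> int:
    """Address the assembler bakes into an absolute reference such as jmp 0:label."""
    return origin + offset


def short_jump_displacement(jmp_offset: int, target_offset: int) -> int:
    """Displacement byte of a 2-byte short JMP (opcode EB), as an unsigned byte.

    The displacement is measured from the instruction that follows the JMP,
    so it never involves the origin at all.
    """
    next_instruction = jmp_offset + 2
    return (target_offset - next_instruction) & 0xFF


CONTINUE_OFFSET = 0x1D  # offset of `continue` in the ORG 0x7A00 listing

# The absolute target changes with the origin the code was assembled for...
print(hex(absolute_address(0x7C00, CONTINUE_OFFSET)))
print(hex(absolute_address(0x7A00, CONTINUE_OFFSET)))

# ...while the relative displacement is the same wherever the code runs.
print(hex(short_jump_displacement(0x30, 0x29)))
```

This is why the copy loop and the short jumps work from either location, while a far jump or a mov si, label only works when the origin matches the address the code actually runs from.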

As an example, what would happen if we wanted to display a string before the copy? What are some possible options?

  1. NASM allows sections in the BIN format (-f bin) to take on a virtual starting point (the origin point, vstart) and a physical address (start). This method is too restrictive for how a bootloader is laid out.
  2. Use an LD linker script to define the layout of the bootloader.
  3. Reorganize code to use an ORG (origin point) of 0x0000 and set the segment registers accordingly. See my other answer to this question.

This answer focuses on option 2. Explaining how LD linker scripts work is too broad for Stack Overflow. The LD manual is the best source of information, and it does have examples. The idea is that we lay out the bootloader inside the linker script. We can set the LMA (Load Memory Address) to specify the address where a section will be loaded into memory. The VMA (Virtual Memory Address) is the origin point for a section. All labels and addresses within a section are resolved relative to its VMA.

Conveniently we can use a section with a specific LMA to place the boot signature directly into the output file, rather than specify it in the assembly code. We can also define symbols from the linker script that can be accessed from the assembly code using the NASM extern directive.

One advantage to all this is that you can define sections in your assembly code in any order you want and the linker script will reorder things. You can also link together multiple object files. The object file containing the boot code you want to appear first should be listed first.

The layout of this linker script roughly looks like this:

Non-relocatable portion of boot code (boot.text) - Relative to an origin of 0x7c00
Non-relocatable portion of boot data (boot.data)
--------------------------------------- Word aligned
Relocatable portion of boot code (rel.text) - Relative to an origin of 0x7a00
Relocatable portion of boot data (rel.data)
Relocatable portion of partition data at offset 0x1b8 (partition.data)
---------------------------------------
Boot signature at offset 0x1fe

A linker script that would lay out this boot loader could look something like:

ENTRY(_start);
OUTPUT_FORMAT("elf32-i386");

SECTIONS
{
    /* Set the base of the main bootloader offsets */
    _bootbase = 0x7c00; /* Where bootloader initially is loaded in memory */
    _relbase  = 0x7a00; /* Address entire bootsector will be copied to
                           This linker script expects it to be word aligned */
    _partoffset = 0x1b8; /* Offset of UID and Partition data */
    _sigoffset  = 0x1fe; /* Offset of the boot signature word */


    /* SUBALIGN(n) in an output section will override the alignment
     * of any input section that is encountered */

    /* This is the boot loader code and data that is expected to run from 0x7c00 */
    .bootinit _bootbase : SUBALIGN(2)
    {
        *(boot.text);
        *(boot.data);
    }

    /* Note that referencing any data in the partition table will
     * only be usable from the code that is in the .bootrel section */

    /* Partition data */
    .partdata _relbase + _partoffset :
        AT(_bootbase + _partoffset) SUBALIGN(0)
    {
        *(partition.data);
    }

    /* Boot signature */
    .bootsig :
        AT(_bootbase + _sigoffset) SUBALIGN(0)
    {
        SHORT(0xaa55);
    }
    /* Length of region to copy in 16-bit words */
    _rel_length = 256;
    /* Address to copy to */
    _rel_start = _relbase; /* Word aligned start address */

    /* Code and data that expect to run once relocated
     * are placed in this section. Aligned to word boundary.
     * This relocatable code and data will be placed right
     * after the .bootinit section in the output file */
    .bootrel _relbase + SIZEOF(.bootinit) :
        AT(_bootbase + SIZEOF(.bootinit)) SUBALIGN(2)
    {
        *(rel.text);
        *(rel.data);
    }
}
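As a sanity check on the constants in this script, the following Python sketch works through the arithmetic. It assumes the partition.data section holds exactly what the assembly code in this answer puts there: a 4-byte disk ID, 2 blank bytes, and four 16-byte partition entries. The Python names are illustrative only.

```python
SECTOR_SIZE = 512
PART_OFFSET = 0x1B8  # _partoffset in the linker script
SIG_OFFSET = 0x1FE   # _sigoffset in the linker script

DISK_ID_BYTES = 4          # UID
BLANK_BYTES = 2            # BLANK
PARTITION_ENTRIES = 4      # PT1..PT4
PARTITION_ENTRY_BYTES = 16
SIGNATURE_BYTES = 2        # SHORT(0xaa55)


def end_of_partition_data() -> int:
    """Offset of the first byte after the partition data."""
    table_bytes = PARTITION_ENTRIES * PARTITION_ENTRY_BYTES
    return PART_OFFSET + DISK_ID_BYTES + BLANK_BYTES + table_bytes


def words_in_sector() -> int:
    """Number of 16-bit words in one sector, which is what _rel_length holds."""
    return SECTOR_SIZE // 2


# The partition data should end exactly where the signature begins,
# and the signature should end exactly at the end of the sector.
print(hex(end_of_partition_data()))
print(SIG_OFFSET + SIGNATURE_BYTES)
print(words_in_sector())
```

If the partition data ever grew past 0x1fe, its load address would collide with the boot signature, so it is worth rechecking these numbers whenever the data in partition.data changes.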

A revised copy of your code using this linker script and the symbols defined in it could look like:

BITS 16

extern _bootbase
extern _relbase
extern _rel_length
extern _rel_start

section boot.text
                            ; comment
global _start
_start:
    xor    cx, cx           ; Set segment registers to zero
    mov    es, cx
    mov    ds, cx
    mov    ss, cx
    mov    sp, 0x7A00       ; Stack
    cld

.copymsg:
    mov si, copymsg         ; Copy message
    mov ah, 0x0E            ; 0E TTY Output
    mov bx, 7               ; Page number
.dispcopy:
    lodsb                   ; Load next char
    test al, al             ; Compare to zero
    jz .end                 ; If so, end
    int 10h                 ; Display char
    jmp .dispcopy           ; Loop
.end:
    mov    di, _rel_start   ; Beginning of relocation point
    mov    si, _bootbase    ; Original location to copy from
    mov    cx, _rel_length  ; CX = words to copy
    rep movsw               ; Copy self to destination

    jmp    0:rel_entry      ; far JMP to copy of self

section rel.text
rel_entry:
    sti                     ; Enable interrupts

    mov si, successmsg      ; Success message location
    mov ah, 0x0E            ; 0E TTY Output
    mov bx, 7               ; Page number
.disp:
    lodsb                   ; Load next char
    test al, al             ; Compare to zero
    je .end                 ; If so, end
    int 10h                 ; Display char
    jmp .disp               ; Loop

    cli                     ; Disable interrupts
.end:
    hlt                     ; CPU hlt
    jmp .end                ; infinite loop

section rel.data
successmsg db 10,'Success!',13,0

section boot.data
copymsg db 10,'Before copy!',13,0

section partition.data
UID db 0xf5,0xbf,0x0f,0x18  ;Unique Disk ID

BLANK times 2 db 0

PT1 db 0x80,0x20,0x21,0x00,0x0C,0x50,0x7F,0x01
    db 0x00,0x08,0x00,0x00,0xb0,0x43,0xF9,0x0D
PT2 times 16 db 0
PT3 times 16 db 0
PT4 times 16 db 0

As an experiment to make sure that the code in the boot.text section can access the data in the boot.data section, I display a string before the copy. I then do a FAR JMP to the relocated code. The relocated code displays a success string.

I modified the code to not use the 32-bit registers like ESI since you will be executing this code in real mode. I also amended your infinite loop to use the HLT instruction.

The code and linker script could be modified to copy only from the start of the relocated data up to the 512th byte, but that is beyond the scope of this answer.
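Purely as an illustration of the arithmetic such a change would involve (not a tested modification of the bootloader), this Python sketch computes the source address, destination address, and word count for a partial copy. It assumes .bootinit occupies 0x3d bytes, which is the value implied by the disassembly in this answer, and that the copy starts at the next word boundary. The function name is hypothetical.

```python
SECTOR_SIZE = 512
BOOT_BASE = 0x7C00  # _bootbase in the linker script
REL_BASE = 0x7A00   # _relbase in the linker script


def partial_copy_plan(bootinit_size: int) -> tuple:
    """Return (source, destination, word_count) for copying only the
    relocatable part of the sector, starting at a word boundary."""
    start_offset = (bootinit_size + 1) & ~1  # round up to an even offset
    word_count = (SECTOR_SIZE - start_offset) // 2
    return BOOT_BASE + start_offset, REL_BASE + start_offset, word_count


# 0x3d is an assumption taken from the disassembly, not a fixed constant.
source, destination, words = partial_copy_plan(0x3D)
print(hex(source), hex(destination), words)
```

In a real implementation these values would more naturally be exported as symbols from the linker script, in the same way _rel_start and _rel_length are.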


A Look at the Disassembly

The .bootinit section that has an origin point of 0x7c00 is provided below. This is an OBJDUMP snippet of that section (without the data for brevity):

Disassembly of section .bootinit:

00007c00 <_start>:
    7c00:       31 c9                   xor    cx,cx
    7c02:       8e c1                   mov    es,cx
    7c04:       8e d9                   mov    ds,cx
    7c06:       8e d1                   mov    ss,cx
    7c08:       bc 00 7a                mov    sp,0x7a00
    7c0b:       fc                      cld

00007c0c <_start.copymsg>:
    7c0c:       be 2e 7c                mov    si,0x7c2e
    7c0f:       b4 0e                   mov    ah,0xe
    7c11:       bb 07 00                mov    bx,0x7

00007c14 <_start.dispcopy>:
    7c14:       ac                      lods   al,BYTE PTR ds:[si]
    7c15:       84 c0                   test   al,al
    7c17:       74 04                   je     7c1d <_start.end>
    7c19:       cd 10                   int    0x10
    7c1b:       eb f7                   jmp    7c14 <_start.dispcopy>

00007c1d <_start.end>:
    7c1d:       bf 00 7a                mov    di,0x7a00
    7c20:       be 00 7c                mov    si,0x7c00
    7c23:       b9 00 01                mov    cx,0x100
    7c26:       f3 a5                   rep movs WORD PTR es:[di],WORD PTR ds:[si]
    7c28:       ea 3e 7a 00 00          jmp    0x0:0x7a3e

All the VMA addresses in the left column appear to be properly set relative to the origin point 0x7c00. The FAR JMP (jmp 0x0:0x7a3e) also targets the location where everything was relocated (copied). A similar abbreviated dump of the .bootrel section appears as:

Disassembly of section .bootrel:

00007a3d <rel_entry-0x1>:
        ...

00007a3e <rel_entry>:
    7a3e:       fb                      sti
    7a3f:       be 54 7a                mov    si,0x7a54
    7a42:       b4 0e                   mov    ah,0xe
    7a44:       bb 07 00                mov    bx,0x7

00007a47 <rel_entry.disp>:
    7a47:       ac                      lods   al,BYTE PTR ds:[si]
    7a48:       3c 00                   cmp    al,0x0
    7a4a:       74 05                   je     7a51 <rel_entry.end>
    7a4c:       cd 10                   int    0x10
    7a4e:       eb f7                   jmp    7a47 <rel_entry.disp>
    7a50:       fa                      cli

00007a51 <rel_entry.end>:
    7a51:       f4                      hlt
    7a52:       eb fd                   jmp    7a51 <rel_entry.end>

The VMA in the left column is relative to an origin of 0x7A00, which is correct. The instruction mov si,0x7a54 uses an absolute near address, and it is properly encoded to reference the successmsg address (I snipped the data out for brevity so it doesn't appear).
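The relationship between these run-time addresses and where the bytes actually sit can be sketched in a few lines of Python. This assumes the layout from the linker script above, where the VMA and LMA of .bootrel both add SIZEOF(.bootinit) to their respective bases, and that the flat binary starts at the lowest LMA (0x7c00). The function names are made up for illustration.

```python
BOOT_BASE = 0x7C00  # _bootbase: where the BIOS loads the sector (LMA base)
REL_BASE = 0x7A00   # _relbase: where the sector is copied to (VMA base)


def rel_vma_to_lma(vma: int) -> int:
    """Load address of a .bootrel symbol, given its run-time address."""
    return vma - REL_BASE + BOOT_BASE


def lma_to_file_offset(lma: int) -> int:
    """Offset of a load address within the 512-byte boot image."""
    return lma - BOOT_BASE


# Run-time addresses taken from the disassembly above.
for name, vma in (("rel_entry", 0x7A3E), ("successmsg", 0x7A54)):
    lma = rel_vma_to_lma(vma)
    print(name, hex(vma), hex(lma), hex(lma_to_file_offset(lma)))
```

So rel_entry is loaded by the BIOS at 0x7c3e, sits at offset 0x3e in boot.bin, and only becomes reachable at 0x7a3e after the rep movsw has run.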

The entries:

00007a3d <rel_entry-0x1>:
        ...

These entries are the padding that aligns the contents of the .bootrel section to an even (word) boundary. With this linker script rel_entry will always have an even address.


Compiling and Linking this Bootloader

The easiest way is to use these commands:

nasm -f elf32 -o boot.o boot.asm
ld -melf_i386 -Tlinker.ld -o boot.bin --oformat=binary boot.o

It should be pointed out that we are using ELF32 format with NASM, not BIN. LD is then used to create the binary file boot.bin which should be a 512 byte image of the boot sector. linker.ld is the name of the linker script file.
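A quick way to confirm that the linker produced a plausible boot sector is to check its size and signature. The Python sketch below does that on a byte string. The function name is hypothetical, and the sample images are synthetic so the snippet runs without boot.bin being present.

```python
SECTOR_SIZE = 512
SIG_OFFSET = 0x1FE
BOOT_SIGNATURE = b"\x55\xaa"  # the word 0xAA55 stored little-endian


def check_boot_sector(image: bytes) -> list:
    """Return a list of problems found; an empty list means the image looks sane."""
    problems = []
    if len(image) != SECTOR_SIZE:
        problems.append("expected %d bytes, got %d" % (SECTOR_SIZE, len(image)))
    if image[SIG_OFFSET:SIG_OFFSET + 2] != BOOT_SIGNATURE:
        problems.append("missing 55 AA signature at offset 0x1fe")
    return problems


# Synthetic examples: one well-formed sector and one without a signature.
good = bytearray(SECTOR_SIZE)
good[SIG_OFFSET:SIG_OFFSET + 2] = BOOT_SIGNATURE
print(check_boot_sector(bytes(good)))
print(check_boot_sector(bytes(SECTOR_SIZE)))
```

To check a real image, pass the file contents instead, for example check_boot_sector(open("boot.bin", "rb").read()).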

If you want the convenience of being able to get an object dump then you can use these commands to assemble and link:

nasm -f elf32 -o boot.o boot.asm
ld -melf_i386 -Tlinker.ld -o boot.elf boot.o
objcopy -O binary boot.elf boot.bin

The difference from the first method is that we don't use the --oformat=binary option with LD. The result is that an ELF32 image is generated and placed in the output file boot.elf. We can't use boot.elf directly as our boot image, so we use OBJCOPY to convert the ELF32 file to a binary file called boot.bin. The usefulness of doing it this way can be seen if we use a command like this to dump the contents and disassembly of the ELF file:

objdump boot.elf -Mintel -mi8086 -Dx
  • -D disassembles all sections
  • -x outputs all the headers
  • -mi8086 disassembles as 16-bit 8086 code
  • -Mintel produces the disassembly in Intel syntax rather than the default AT&T syntax
Michael Petch answered Dec 28 '22 09:12


Assembled using: nasm -f bin -o mbr.bin mbr.asm

[BITS 16]
ORG 0x00007a00
                            ; opcodes       comment
global _start
_start:
    xor    cx, cx           ; 31c9          Set segment registers to zero
    mov    es, cx           ; 8ec1
    mov    ds, cx           ; 8ed9
    mov    ss, cx           ; 8ed1
    mov    sp, 0x7A00       ; bc007a        Stack
    mov    di, sp           ; 89e7          Bottom of relocation point
    mov    esi, 0x00007C00  ; 66be007c0000  Original location
    cld                     ; fc
    mov    ch, 1            ; b501          CX = 256
    rep movsw               ; f3a5          Copy self to 0:7A00
    jmp    0:continue       ; ea1d7a0000    far JMP to copy of self

continue:
    sti                     ; fb

ERROR:
    mov esi, errormsg       ; 66be357a0000  Error Message location
    mov ah, 0x0E            ; b40e          0E TTY Output
    mov bx, 7               ; bb0700        Page number
disp:
    lodsb                   ; ac            Load next char
    cmp al, 0x00            ; 3c00          Compare to zero
    je end                  ; 7404          If so, end
    int 10h                 ; cd10          Display char
    jmp disp                ; ebf7          Loop

end:
    nop                     ; 90            Do Nothing
    jmp end                 ; ebfd          infinite loop

errormsg db 10,'YOU MESSED UP!',13,0

times (0x1b8 - ($-$$)) nop  ; 90909090...   Padding

UID db 0xf5,0xbf,0x0f,0x18  ;Unique Disk ID

BLANK times 2 db 0

PT1 db 0x80,0x20,0x21,0x00,0x0C,0x50,0x7F,0x01
PT1more db 0x00,0x08,0x00,0x00,0xb0,0x43,0xF9,0x0D
PT2 times 16 db 0
PT3 times 16 db 0
PT4 times 16 db 0

BOOTSIG dw 0xAA55           ;Boot Signature

Output of hexdump -C mbr.bin:

00000000  31 c9 8e c1 8e d9 8e d1  bc 00 7a 89 e7 66 be 00  |1.........z..f..|
00000010  7c 00 00 fc b5 01 f3 a5  ea 1d 7a 00 00 fb 66 be  ||.........z...f.|
00000020  35 7a 00 00 b4 0e bb 07  00 ac 3c 00 74 04 cd 10  |5z........<.t...|
00000030  eb f7 90 eb fd 0a 59 4f  55 20 4d 45 53 53 45 44  |......YOU MESSED|
00000040  20 55 50 21 0d 00 90 90  90 90 90 90 90 90 90 90  | UP!............|
00000050  90 90 90 90 90 90 90 90  90 90 90 90 90 90 90 90  |................|
*
000001b0  90 90 90 90 90 90 90 90  f5 bf 0f 18 00 00 80 20  |............... |
000001c0  21 00 0c 50 7f 01 00 08  00 00 b0 43 f9 0d 00 00  |!..P.......C....|
000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200
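For readers who want to see what the 16 bytes of the first partition entry mean, here is a small Python sketch that unpacks them using the conventional MBR partition entry layout (status, CHS start, type, CHS end, starting LBA, sector count). The bytes are copied from PT1 in the listing above; the function name and dictionary keys are made up for illustration.

```python
import struct

# First partition entry, copied from PT1/PT1more in the listing above.
PT1 = bytes([0x80, 0x20, 0x21, 0x00, 0x0C, 0x50, 0x7F, 0x01,
             0x00, 0x08, 0x00, 0x00, 0xB0, 0x43, 0xF9, 0x0D])


def decode_partition_entry(entry: bytes) -> dict:
    """Split a 16-byte MBR partition entry into its fields."""
    # "<" means little-endian with no padding: 1 + 3 + 1 + 3 + 4 + 4 = 16 bytes.
    status, chs_first, ptype, chs_last, lba_first, sectors = struct.unpack(
        "<B3sB3sII", entry)
    return {
        "bootable": status == 0x80,   # 0x80 marks the active partition
        "type": ptype,                # 0x0C is commonly FAT32 (LBA)
        "chs_first": chs_first,       # raw CHS bytes, left undecoded here
        "chs_last": chs_last,
        "lba_first": lba_first,       # first sector of the partition
        "sector_count": sectors,      # length in sectors
    }


fields = decode_partition_entry(PT1)
print(fields["bootable"], hex(fields["type"]),
      fields["lba_first"], fields["sector_count"])
```

The same bytes appear in the hexdump starting at offset 0x1be, immediately after the 4-byte disk ID and the 2 blank bytes at 0x1b8.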
Wyllow Wulf answered Dec 28 '22 09:12