Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Near call/jump tables don't always work in a bootloader

General Problem

I've been developing a simple bootloader and have stumbled on a problem on some environments where instructions like these don't work:

mov si, call_tbl      ; SI=Call table pointer
call [call_tbl]       ; Call print_char using near indirect absolute call
                      ; via memory operand
call [ds:call_tbl]    ; Call print_char using near indirect absolute call
                      ; via memory operand w/segment override
call near [si]        ; Call print_char using near indirect absolute call
                      ; via register

Each one of these happen to involve indirect near CALL to absolute memory offsets. I have discovered that I have issues if I use similar JMP tables. Calls and Jumps that are relative don't seem to be affected. Code like this works:

call print_char 

I have taken the advice presented on Stackoverflow by posters discussing the dos and don'ts of writing a bootloader. In particular I saw this Stackoverflow answer with General Bootloader Tips. The first tip was:

  1. When the BIOS jumps to your code you can't rely on CS,DS,ES,SS,SP registers having valid or expected values. They should be set up appropriately when your bootloader starts. You can only be guaranteed that your bootloader will be loaded and run from physical address 0x07c00 and that the boot drive number is loaded into the DL register.

Taking all the advice, I didn't rely on CS, I set up a stack, and set DS to be appropriate for the ORG (Origin offset) I used. I have created a Minimal Complete Verifiable example that demonstrates the problem. I built this using NASM, but it doesn't seem to be a problem specific to NASM.


Minimal Example

The code to test is as follows:

[ORG 0x7c00]
[Bits 16]

section .text
main:
    xor ax, ax
    mov ds, ax            ; DS=0x0000 since OFFSET=0x7c00
    cli                   ; Turn off interrupts for potentially buggy 8088
    mov ss, ax
    mov sp, 0x7c00        ; SS:SP = Stack just below 0x7c00
    sti                   ; Turn interrupts back on

    mov si, call_tbl      ; SI=Call table pointer
    mov al, [char_arr]    ; First char to print 'B' (beginning)
    call print_char       ; Call print_char directly (relative jump)

    mov al, [char_arr+1]  ; Character to print 'M' (middle)
    call [call_tbl]       ; Call print_char using near indirect absolute call
                          ; via memory operand
    call [ds:call_tbl]    ; Call print_char using near indirect absolute call
                          ; via memory operand w/segment override
    call near [si]        ; Call print_char using near indirect absolute call
                          ; via register

    mov al, [char_arr+2]  ; Third char to print 'E' (end)
    call print_char       ; Call print_char directly (relative jump)

end:
    cli
.endloop:
    hlt                   ; Halt processor
    jmp .endloop

print_char:
    mov ah, 0x0e    ; Write CHAR/Attrib as TTY
    mov bx, 0x00    ; Page 0
    int 0x10
    retn

; Near call address table with one entry
call_tbl: dw print_char

; Simple array of characters
char_arr: db 'BME'

; Bootsector padding
times 510-($-$$) db 0
dw 0xAA55

I build both an ISO image and a 1.44MB floppy image for test purposes. I'm using a Debian Jessie environment but most Linux distros would be similar:

nasm -f bin boot.asm -o boot.bin
dd if=/dev/zero of=floppy.img bs=1024 count=1440
dd if=boot.bin of=floppy.img conv=notrunc

mkdir iso    
cp floppy.img iso/
genisoimage -quiet -V 'MYBOOT' -input-charset iso8859-1 -o myos.iso -b floppy.img -hide floppy.img iso

I end up with a floppy disk image called floppy.img and an ISO image called myos.iso.


Expectations vs Actual Results

Under most conditions this code works, but in a number of environments it doesn't. When it works it simply prints this on the display:

BMMME

I print out B using a typical CALL with relative offset it seems to work fine. In some environments when I run the code I just get:

B

And then it appears to just stop doing anything. It seems to print out the B properly but then something unexpected happens.

Environments that seem to work:

  • QEMU booted with floppy and ISO
  • VirtualBox booted with floppy and ISO
  • VMWare 9 booted with floppy and ISO
  • DosBox booted with floppy
  • Officially packaged Bochs(2.6) on Debian Jessie using floppy image
  • Bochs 2.6.6(built from source control) on Debian Jessie using floppy image and ISO image
  • AST Premmia SMP P90 system from mid 90s using floppy and ISO

Environments that don't work as expected:

  • Officially packaged Bochs(2.6) on Debian Jessie using ISO image
  • 486DX based system with AMI BIOS from the early 90s using floppy image. CDs won't boot on this system so the CD couldn't be tested.

What I find interesting is that Bochs (version 2.6) doesn't work as expected on Debian Jessie using an ISO. When I boot from the floppy with the same version it works as expected.

In all cases the ISO and the floppy image seemed to load and start running since in ALL cases it was at least able to print out B on the display.


My Questions

  • When it fails, why does it only print out a B and nothing more?
  • Why do some environments work and others fail?
  • Is this a bug in my code or the hardware/BIOS?
  • How can I fix it so that I can still use near indirect Jump and Call tables to absolute memory offsets? I am aware I can avoid these instructions altogether and that seems to solve my problem, but I'd like to be able to understand how and if I can use them properly in a bootloader.
like image 849
Michael Petch Avatar asked Dec 31 '15 15:12

Michael Petch


1 Answers

The Problem

The answer to your question is buried in your question, it just isn't obvious. You quoted my General Bootloader Tips:

  1. When the BIOS jumps to your code you can't rely on CS,DS,ES,SS,SP registers having valid or expected values. They should be set up appropriately when your bootloader starts. You can only be guaranteed that your bootloader will be loaded and run from physical address 0x00007c00 and that the boot drive number is loaded into the DL register.

Your code correctly sets up DS, and sets its own stack (SS, and SP). You didn't blindly copy CS to DS, but what you do do is rely on CS being an expected value (0x0000). Before I explain what I mean by that, I'd like to draw your attention to a recent Stackoverflow answer I gave about how the ORG directive (or the origin point specified by any linker) works together with the segment:offset pair used by the BIOS to jump to physical address 0x07c00.

The answer details how CS being copied to DS can cause problems when referencing memory addresses (variables for example). In the summary I stated:

Don't assume CS is a value we expect, and don't blindly copy CS to DS . Set DS explicitly.

The key thing is Don't assume CS is a value we expect. So your next question may be - I don't seem to be using CS am I? The answer is yes. Normally when you use a typical CALL or JMP instruction it looks like this:

call print_char
jmp somewhereelse

In 16 bit-code both of these are relative jumps. This means that you jump forward or back in memory but as an offset relative to the instruction right after the JMP or CALL. Where your code is placed within a segment doesn't matter as it is a plus/minus displacement from where you currently are. What the current value of CS is doesn't actually matter with relative jumps, so they should work as expected.

Your example of instructions that don't always seem to work correctly included:

call [call_tbl]       ; Call print_char using near indirect absolute call
                      ; via memory operand
call [ds:call_tbl]    ; Call print_char using near indirect absolute call
                      ; via memory operand w/segment override
call near [si]        ; Call print_char using near indirect absolute call
                      ; via register

All of these have one thing in common. The addresses that are CALLed or JMPed are ABSOLUTE, not relative. The offset of the label will be influenced by the ORG (origin point of the code). If we look at a disassembly of your code we will see this:

objdump -mi8086 -Mintel -D -b binary boot.bin --adjust-vma 0x7c00
boot.bin:     file format binary

Disassembly of section .data:

00007c00 <.data>:
    7c00:   31 c0                   xor    ax,ax
    7c02:   8e d8                   mov    ds,ax
    7c04:   fa                      cli
    7c05:   8e d0                   mov    ss,ax
    7c07:   bc 00 7c                mov    sp,0x7c00
    7c0a:   fb                      sti
    7c0b:   be 34 7c                mov    si,0x7c34
    7c0e:   a0 36 7c                mov    al,ds:0x7c36
    7c11:   e8 18 00                call   0x7c2c              ; Relative call works
    7c14:   a0 37 7c                mov    al,ds:0x7c37
    7c17:   ff 16 34 7c             call   WORD PTR ds:0x7c34  ; Near/Indirect/Absolute call
    7c1b:   3e ff 16 34 7c          call   WORD PTR ds:0x7c34  ; Near/Indirect/Absolute call
    7c20:   ff 14                   call   WORD PTR [si]       ; Near/Indirect/Absolute call
    7c22:   a0 38 7c                mov    al,ds:0x7c38
    7c25:   e8 04 00                call   0x7c2c              ; Relative call works
    7c28:   fa                      cli
    7c29:   f4                      hlt
    7c2a:   eb fd                   jmp    0x7c29
    7c2c:   b4 0e                   mov    ah,0xe              ; Beginning of print_char
    7c2e:   bb 00 00                mov    bx,0x0              ; function
    7c31:   cd 10                   int    0x10
    7c33:   c3                      ret
    7c34:   2c 7c                   sub    al,0x7c             ; 0x7c2c offset of print_char
                                                               ; Only entry in call_tbl
    7c36:   42                      inc    dx                  ; 0x42 = ASCII 'B'
    7c37:   4d                      dec    bp                  ; 0x4D = ASCII 'M'
    7c38:   45                      inc    bp                  ; 0x45 = ASCII 'E'
    ...
    7dfd:   00 55 aa                add    BYTE PTR [di-0x56],dl

I've manually added some comments where the CALL statements are, including both the relative ones that work and the near/indirect/absolute ones may not. I've also identified where the print_char function is, and where it was in the call_tbl.

From the data area after the code we do see that the call_tbl is at 0x7c34 and it contains a 2 byte absolute offset of 0x7c2c. This is all correct, but when you use an absolute 2-byte offset it is assumed to be in the current CS. If you have read this Stackoverflow answer (that I referenced earlier) about what happens when the wrong DS and offset is used to reference a variable, you might now realize that this may apply to JMPs CALLs that use absolute offsets involving NEAR 2-byte absolute values.

As an example let us take this call that doesn't always work:

call [call_tbl] 

call_tbl is loaded from DS:[call_tbl]. We properly set DS to 0x0000 when we start the bootloader so this does correctly retrieve the value 0x7c2c from memory address 0x0000:0x7c34. The processor will then set IP=0x7c2c BUT it assumes it is relative to the currently set CS. Since we can't assume CS is an expected value, the processor potentially can CALL or JMP to the wrong location. It all depends on what CS:IP the BIOS used to jump to our bootloader with (it can vary).

In the case where the BIOS does the equivalent of a FAR JMP to our bootloader at 0x0000:0x7c00, CS will be set to 0x0000 and IP to 0x7c00. When we encounter call [call_tbl] it would have resolved to a CALL to CS:IP=0x0000:0x7c2c . This is physical address (0x0000<<4)+0x7c2c=0x07c2c which is in fact where the print_char function in memory that the function physically starts at.

Some BIOSes do the equivalent of a FAR JMP to our bootloader at 0x07c0:0x0000, CS will be set to 0x07c0 and IP to 0x0000. This too maps to physical address (0x07c0<<4)+0=0x07c00 .When we encounter call [call_tbl] it would have resolved to a CALL to CS:IP=0x07c0:0x7c2c . This is physical address (0x07c0<<4)+0x7c2e=0x0f82c. This is clearly wrong since the print_char function is at physical address 0x07c2c, not 0x0f82c.

Having CS set incorrectly will cause problems for JMP and CALL instructions that do Near/Absolute addressing. As well any memory operands that use a segment override of CS:. An example of using the CS: override in a real mode interrupt handler can be found in this Stackoverflow answer


Solution

Since it has been shown that we can't rely on CS that is set when the BIOS jumps to our code we can set CS ourselves. To set CS we can do a FAR JMP to our own code which will set CS:IP to values that make sense for the ORG (origin point of the code and data) we are using. An example of such a jump if we use ORG 0x7c00:

jmp 0x0000:$+5

$+5 says to use an offset that is 5 above our current program counter. A far jmp is 5 bytes long so this has the affect of doing a far jump to the instruction after our jmp. It could have been coded this way too:

    jmp 0x0000:farjmp
farjmp:

When either of these instructions is complete CS will be set to 0x0000 and IP will be set to the offset of the next instruction. They key thing for us is that CS will be 0x0000. When paired with an ORG of 0x7c00 it will properly resolve absolute addresses so that they work properly when physically running on the CPU. 0x0000:0x7c00=(0x0000<<4)+0x7c00=physical address 0x07c00.

Of course if we use ORG 0x0000 then we need to set CS to 0x07c0. This is because (0x07c0<<4)+0x0000=0x07c00. So we could code the far jmp this way:

jmp 0x07c0:$+5

CS will be set to 0x07c0 and IP will be set to the offset of the next instruction.

The end result of all this is that we are setting CS to the segment we want, and not rely on a value that we can't guarantee when the BIOS finishes jumping to our code.


Issues with Different Environments

As we have seen the CS can matter. Most BIOSes whether in an emulator, virtual machine or real hardware do the equivalent of a far jump to 0x0000:0x7c00 and in those environments your bootloader would have worked. Some environment like older AMI Bioses and Bochs 2.6 when booting from a CD are starting our bootloader with CS:IP = 0x07c0:0x0000. As discussed in those environments near/absolute CALLs and JMPs will proceed to execute from the wrong memory locations and cause our bootloader to function incorrectly.

So what about Bochs working for a floppy image and not for an ISO image? This is a peculiarity in earlier versions of Bochs. When booting from a floppy the virtual BIOS jumps to 0x0000:0x7c00 and when it boots from an ISO image is uses 0x07c0:0x0000. This explains why it works differently. This odd behavior apparently came about because of literal interpretation of one of the El Torito specifications that specifically mentioned segment 0x07c0. Newer versions of Boch's virtual BIOSes were modified to use 0x0000:0x7c00 for both.


Does this Mean some BIOSes have a Bug?

The answer to this question is subjective. In the first versions of IBM's PC-DOS (prior to 2.1) the bootloader assumed that the BIOS jumped to 0x0000:0x7c00, but this wasn't clearly defined. Some BIOS manufacturers in the 80s started using 0x07c0:0x0000 and broke some early versions of DOS. When this was discovered bootloaders were modified to be well behaved as to not make any assumptions about what segment:offset pair was used to reach physical address 0x07c00. At the time one may have considered this a bug, but was based on the ambiguities introduced with 20-bit segment:offset pairs.

Since the mid 80s, it is my opinion that any new bootloader that assumes CS is a specific value has been coded in error.

like image 152
Michael Petch Avatar answered Sep 23 '22 15:09

Michael Petch