
What code skeleton should I use for Intel 8086 DOS assembly?

Having learned the Intel 8080's structure, I'm now trying to learn the Intel 8086 and how programs for it are laid out. For now, even the basic examples look intimidating and, what's worse, I can't tell the difference between the two ways of writing 8086 code that I've stumbled upon. Sometimes I see:

.model small
.stack 100h
.code

start:

mov dl, 'a' ; store ascii code of 'a' in dl
mov ah, 2h ; ms-dos character output function
int 21h ; displays character in dl register
mov ax, 4c00h ; return to ms-dos
int 21h

end start

While I also found:

Progr           segment
                assume  cs:Progr, ds:dataSeg, ss:stackSeg

start:          mov     ax,dataSeg
                mov     ds,ax
                mov     ax,stackSeg
                mov     ss,ax
                mov     sp,offset top


                mov     ah,4ch
                mov     al,0
                int     21h
Progr           ends

dataSeg            segment

dataSeg            ends

stackSeg          segment
                dw    100h dup(0)
top     Label word
stackSeg          ends

end start

Obviously, I know that these two do very different things, but what baffles me is how different the general syntax is. The latter has "segment" and "assume", while the former has just .model, .stack and .code (and sometimes .data, from what I found). Is there any real difference? The former looks a lot clearer and easier to understand; can I just use it instead of the latter?

Straightfw asked Nov 28 '13 11:11


1 Answer

This depends massively on what you target (an operating system, the BIOS, or bare metal), the executable format you want, and the assembler you use.

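Both examples you posted build MS-DOS .EXE programs, and I assume both are written for the Microsoft® assembler. The first uses the simplified segment directives (.model, .stack, .code); the second spells out the same kind of segments by hand with segment/ends and assume. An MS-DOS .COM program would instead use .model tiny together with org 100h.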
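As a minimal sketch of how the shorter style grows once a program has data, assuming MASM or TASM and an .EXE build: the label `msg` and the message text are illustrative and not taken from the question. The point it shows is that the simplified directives declare the segments for you, but DS still has to be loaded by hand, just as the longer example does with `dataSeg`.

```asm
.model small
.stack 100h                     ; declares a stack segment; the EXE loader sets SS:SP from it

.data
msg     db      'Hello, World!', 13, 10, '$'   ; '$' ends the string for function 9

.code
start:
        mov     ax, @data       ; @data is the assembler's name for the data segment
        mov     ds, ax          ; at entry DS points at the PSP, not at our data
        mov     dx, offset msg  ; DS:DX -> string
        mov     ah, 9           ; MS-DOS "print string" function
        int     21h
        mov     ax, 4c00h       ; return to MS-DOS with exit code 0
        int     21h
end start
```

In the longer example the stack segment is not marked as a stack, which is presumably why that code loads SS and SP itself; with `.stack` that step is not needed.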

If you want to use the GNU assembler (e.g. on MirBSD or GNU/Linux) to target i8086 MS-DOS .COM programs, you can use this:

        .intel_syntax noprefix
        .code16
        .text

        .globl  _start
_start: mov     ah,9
        mov     dx,offset msg
        int     0x21
        /* exit(0); ← this is a comment */
        mov     ax,0x4C00
        int     0x21

msg:    .ascii  "Hello, World!\r\n$"

Assemble and link this file (hw.S) with:

$ gcc -c -o hw.o hw.S
$ ld -nostdlib -Ttext 0x0100 -N -e _start -Bstatic --oformat=binary -o hw.com hw.o

I tested the result in DOSBOX under MirBSD/i386, and looked at it in hexdump to see that it’s correct.

In contrast to the other solutions, here you define the origin (org) on the linker (ld) command line, via -Ttext 0x0100, rather than in the assembly file.
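For comparison, here is a sketch of the same program with the origin written in the source itself, assuming MASM or TASM and the tiny memory model. It is untested, and the build step that turns it into a .COM file depends on your assembler and linker.

```asm
.model tiny                     ; code, data and stack share one segment
.code
        org     100h            ; DOS loads a .COM image right after the 256-byte PSP

start:
        mov     ah, 9           ; MS-DOS "print string" function
        mov     dx, offset msg  ; DS already equals CS in a .COM program
        int     21h
        mov     ax, 4c00h       ; return to MS-DOS with exit code 0
        int     21h

msg     db      'Hello, World!', 13, 10, '$'
end start
```

Either way the effect is the same: every offset in the image, such as the address of `msg`, is computed as if the first byte of the file sat at 100h.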

I’ve also got an example targeting the raw x86 BIOS, and two more (a bootsector for blocklists and a bootsector for *.tar archives), in case you’re interested. They need different origins, though, and they require an i386 CPU even though they use only 16-bit mode.

You can’t produce *.EXE files that way, though: the flat binary that ld emits has no EXE header.

ELKS is also an interesting i8086 target, but I haven’t done much with it yet. Do make sure you get a GNU as version new enough to know the .intel_syntax noprefix mode though.

mirabilos answered Nov 14 '22 01:11