Having learned the structure of the Intel 8080, I'm now trying to learn the Intel 8086 and how programs for it are laid out. For now, even the basic examples look quite intimidating and, what's worse, I can't work out the difference between two ways of writing 8086 code that I've stumbled upon. Namely, sometimes I see:
.model small
.stack 100h
.code
start:
mov dl, 'a' ; store ascii code of 'a' in dl
mov ah, 2h ; ms-dos character output function
int 21h ; displays character in dl register
mov ax, 4c00h ; return to ms-dos
int 21h
end start
While I also found:
Progr segment
assume cs:Progr, ds:dataSeg, ss:stackSeg
start: mov ax,dataSeg
mov ds,ax
mov ax,stackSeg
mov ss,ax
mov sp,offset top
mov ah,4ch
mov al,0
int 21h
Progr ends
dataSeg segment
dataSeg ends
stackSeg segment
dw 100h dup(0)
top Label word
stackSeg ends
end start
Obviously, I know that these two do very different things but what baffles me is how different the general syntax is. In the latter we have some "segment assume" while in the former it's just .model, .stack and .code (and sometimes .data, from what I found). Is there any difference? Can I just choose which one suits me better? The former looks a lot easier to understand and clearer but can I just use it instead of the latter?
This depends massively on the operating system you target (or BIOS or bare metal), the executable format you target, and the assembler you use.
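Both examples you posted build MS-DOS .EXE programs: the first uses the simplified segment directives (.model, .stack, .code), while the second spells out the same kind of layout with full segment definitions (segment/ends, assume). A .COM program would instead use .model tiny and org 100h, with no separate stack segment. I assume both are using the Microsoft® assembler.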
If you want to use the GNU assembler (e.g. on MirBSD or GNU/Linux) to target i8086 MS-DOS .COM programs, you can use this:
.intel_syntax noprefix
.code16
.text
.globl _start
_start: mov ah,9
mov dx,offset msg
int 0x21
/* exit(0); ← this is a comment */
mov ax,0x4C00
int 0x21
msg: .ascii "Hello, World!\r\n$"
Compile this file (hw.S) with:
$ gcc -c -o hw.o hw.S
$ ld -nostdlib -Ttext 0x0100 -N -e _start -Bstatic --oformat=binary -o hw.com hw.o
I tested the result in DOSBOX under MirBSD/i386, and looked at it in hexdump to see that it’s correct.
In contrast to the other solutions, you do not define the origin (org) in the assembly file but on the linker (ld) command line, here.
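To see why the origin matters, here is a small Python sketch that hand-assembles the same program and works out the address that ends up in dx. It assumes the standard 8086 encodings for these five instructions and that the message directly follows the code; build_com is a hypothetical helper written for this illustration, and the bytes the real toolchain emits may differ in detail.

```python
# Hand-assembled sketch of hw.com (illustration only, not toolchain output).
ORIGIN = 0x0100  # DOS loads a .COM image at offset 0x100 of its segment


def build_com(message, origin=ORIGIN):
    """Return (image bytes, address of the message as the program sees it)."""
    code_len = 2 + 3 + 2 + 3 + 2      # sizes of the five instructions below
    msg_addr = origin + code_len      # the message directly follows the code
    code = bytes([
        0xB4, 0x09,                            # mov ah, 9
        0xBA, msg_addr & 0xFF, msg_addr >> 8,  # mov dx, offset msg (little-endian)
        0xCD, 0x21,                            # int 0x21
        0xB8, 0x00, 0x4C,                      # mov ax, 0x4C00
        0xCD, 0x21,                            # int 0x21
    ])
    assert len(code) == code_len
    return code + message, msg_addr


image, msg_addr = build_com(b"Hello, World!\r\n$")
print(hex(msg_addr))                           # prints 0x10c
print(" ".join(f"{b:02x}" for b in image[:12]))
```

With an origin of 0 the same code would load dx with 0x000C, and DOS would print whatever happens to sit at that offset in the segment instead of the message. That is the job -Ttext 0x0100 does on the ld command line.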
I’ve also got an example targeting raw x86 BIOS, another one (a boot sector for blocklists), and a third (a boot sector for *.tar archives), in case you’re interested; they need different origins, though, and they require an i386 CPU even though they only use 16-bit mode.
You can’t do *.EXE files that way.
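The reason is the file format: a DOS .EXE starts with a header whose first two bytes are the "MZ" signature, followed by fields such as the relocation table and the initial stack and entry point, whereas the flat binary that ld writes here is just the raw code and data. A minimal Python sketch of that distinction follows; dos_program_kind is a hypothetical helper for illustration, and it treats anything without the signature as a flat .COM-style image.

```python
def dos_program_kind(data):
    """Classify a DOS program image by its first two bytes."""
    # .EXE files begin with the "MZ" signature; DOS also accepts "ZM".
    if data[:2] in (b"MZ", b"ZM"):
        return "exe"
    # Anything else is loaded as a flat image at offset 0x100 (.COM style).
    return "com"


print(dos_program_kind(b"MZ\x90\x00"))               # prints exe
print(dos_program_kind(bytes([0xB4, 0x09, 0xBA])))   # prints com
```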
ELKS is also an interesting i8086 target, but I haven’t done much with it yet. Do make sure you get a GNU as version new enough to know the .intel_syntax noprefix mode, though.