
Assembly language programming hints and tips [closed]

Tags: x86, assembly, nasm

I'm having a go at writing my own "toy" OS and for the moment I'm doing it mostly in assembly (NASM) - partly because I'm hoping it will help me understand x86 disassembly and also because I'm finding it fairly fun too!

This is my first experience programming in assembly - I'm picking things up quicker than I expected, however as with learning any significantly different language I'm finding that my code is structured fairly chaotically as I try to figure out what patterns and conventions I should be using.

At the moment in particular I'm struggling with:

Keeping track of registers

At the moment everything is in 16 bit mode and so I only have 6 general purpose registers to play with, with even fewer of those usable for accessing memory. I keep on trampling over my own registers which in turn means I'm frequently swapping registers around to avoid this - consequently I'm having a hard time keeping track of what registers contain what values, even with liberal commenting. Is this normal? Is there anything I can do to help make things easier to keep track of?

For example I've started commenting all of my functions with a list of the registers that are clobbered:

; ================
; c_lba_chs
; Converts logical block addressing to Cylinder / Head / Sector
;  ax (input, clobbered) - LBA
;  ch (output) - Track number (cylinder)
;  cl (output) - Sector number
;  dh (output) - Head number
; ================

Keeping track of the stack

In a couple of cases I've started using the stack when I run out of registers, but this is making things so much worse - anything more complex than a simple push / call / pop sequence to preserve registers causes me to lose track completely, making it tricky to even tell if I've got the right number of items on the stack (particularly when error handling is involved - see below), let alone what order they are in. I know there must be a better way to use the stack, I just can't see what it is.

Handling errors

I've been using the carry flag and zero flag (depending on the function) to indicate an error to the caller, for example:

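myfn:
    ; Do things
    jz .error
    ; Do more things
    clc             ; success: make sure carry is clear before returning
    ret

.error:
    stc             ; failure: carry set signals the error to the caller
    ret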

Is this a normal way of indicating errors?

Also are there any other hints or tricks that I can use to better structure my assembly?

Finally are there any good resources / examples of well-written assembly? I've come across The Art of Assembly Language Programming however it seems to focus very much on the nitty-gritty of the language with less emphasis on how code should be structured. (Also some of the code samples use segments, which I think I should be avoiding).

I'm doing all of this using zero segments (a flat memory model) to keep things simple and to make things easier if / when I start using C.

Justin asked May 10 '11 14:05


1 Answer

Don't worry, you are pretty much on the right track. Because this is assembly, you have the freedom to decide how you want to manage your registers and data. I would recommend developing some standard for yourself, and using a C-like standard may not be a bad idea. I would also recommend using a different assembly language for a first project like this (for example ARM running on qemu); x86 is somewhat horrible as instruction sets go. But that is a separate topic...

Assemblers generally let you declare variables, if you will, meaning named locations in memory:

bob: .word 0x1234

Then use it from the code (ARM assembly here):

ldr r0,bob
add r0,#1
str r0,bob

The registers are used temporarily; the real data is kept in memory. A model like this can help you keep track of things, because the real data lives in memory under names you chose, just like in a high-level language. x86 makes this even easier, as you can perform operations directly on memory and do not have to go through registers for everything.

Likewise, you can manage this with a stack frame for local variables: subtract some number from the stack pointer to make room for that function's frame, and within that function know/remember that variable joe is at stack pointer +4, ted is at stack pointer +8, etc. Use comments in your code to remember where these things are, and remember to restore the stack pointer/frame to its value at entry before returning. This method is a little harder, as you are using numerical offsets instead of variable names, but it gives you local variables and recursion and/or some global memory savings.
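As a rough NASM sketch of both ideas for the 16-bit code in the question: the names counter, bump_counter and sum_two are made up for illustration, and the locals sit at negative offsets from bp rather than positive offsets from the stack pointer, because 16-bit addressing modes cannot use sp as a base register. It also assumes the flat, zero-segment setup from the question, so that ds and ss refer to the same memory.

```nasm
bits 16

; Increment a named variable directly in memory; no register is tied up.
bump_counter:
    inc  word [counter]
    ret

; sum_two: hypothetical function with two local word variables.
;  ax (input)  - first value
;  bx (input)  - second value
;  ax (output) - ax + bx
%define joe word [bp-2]     ; local variable 1
%define ted word [bp-4]     ; local variable 2

sum_two:
    push bp                 ; save the caller's frame pointer
    mov  bp, sp             ; bp marks the base of this function's frame
    sub  sp, 4              ; reserve two words for joe and ted

    mov  joe, ax            ; park the inputs in named locals
    mov  ted, bx
    mov  ax, joe            ; registers only hold values briefly
    add  ax, ted

    mov  sp, bp             ; drop the locals and anything still pushed
    pop  bp                 ; restore the caller's frame pointer
    ret

counter: dw 0x1234          ; named word in memory, like "bob" above
```

The %define lines are one way to get names back for the numerical offsets; plain comments would do the same job. Restoring sp from bp on exit also means a stray unbalanced push inside the function cannot corrupt the return address.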

When doing this work as a human, with your eyes and hands (keyboard and mouse), you probably want to keep data in a register no longer than what fits on one screen of your text editor, so that at a glance you can see the variable go into the register and then return to the variable in memory. A program/compiler can certainly keep track of as much as the system's memory allows, far more than a human, which is why compilers on average generate better assembler than humans (in specific cases a human can always tweak or fix a problem).

For error handling, you need to be careful with using flags; it doesn't feel right to me for some reason. It may very well be just fine: interrupts preserve the flags, but all of your code will have to preserve or set the flags, etc. The problem with flags is that you have to check/use the return value immediately after the function returns, before you execute an instruction that modifies the flags, whereas if you use a register you can choose not to modify that return register for many more instructions before you need to sample or use the return value.
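A minimal sketch of the register-based alternative, in NASM for 16-bit mode. The function read_block, the error codes, and the choice of ax as the status register are all assumptions made for illustration, not an established convention.

```nasm
bits 16

ERR_NONE    equ 0           ; hypothetical status codes
ERR_BAD_ARG equ 1

; read_block: hypothetical function that reports its status in ax.
;  cx (input)  - block count, must be non-zero
;  ax (output) - ERR_NONE on success, otherwise an error code
read_block:
    test cx, cx
    jz   .bad_arg
    ; ... do the real work here ...
    xor  ax, ax             ; ax = ERR_NONE
    ret
.bad_arg:
    mov  ax, ERR_BAD_ARG
    ret

caller:
    mov  cx, 1
    call read_block
    add  bx, 2              ; modifies the flags, but ax still holds the status
    test ax, ax             ; check the status only when it is needed
    jnz  .failed
    ; ... success path ...
    ret
.failed:
    ; ... error path, ax holds the error code ...
    ret
```

The cost is that ax is tied up for the status, so a function that also returns data needs a second register or a memory location for it.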

I think the bottom line here is: look at the C calling convention rules that compilers use for that instruction set, and perhaps for other instruction sets. You will see strong similarities, and for good reason: they are manageable. With so few registers you can see why the calling conventions sometimes go straight to the stack for all of the arguments, and sometimes for the return values as well.

I am told that the Amiga BIOS had a custom calling convention for each BIOS function, which made for a tight and fast-executing system, but when trying to re-create the BIOS in C using compilers, and/or attach to functions with an assembler wrapper, it is difficult at best. I am sure that without good documentation on each function it is unmanageable. Down the road you may decide you want this to be portable and may wish you had chosen a commonly used calling convention. You will still want to comment your code to say parameter 1 is this and parameter 2 is that, etc. On the other hand, if you currently program, or in the past have programmed, x86 assembler calling DOS and BIOS functions, you would be quite comfortable with looking up each function in a reference and placing the data in the proper registers for each function. Because there were good reference materials, it was manageable to have each function be custom.
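For a feel of what a C-like convention looks like in 16-bit code, here is a hedged NASM sketch modelled on the common cdecl pattern: arguments pushed right to left, result in ax, caller cleans up the stack. The routine add_words and the register-saving rules in its header comment are assumptions for illustration; a real compiler's rules for your target may differ.

```nasm
bits 16

; add_words(a, b): hypothetical routine using a cdecl-like convention.
;  [bp+4] (input) - a, pushed last
;  [bp+6] (input) - b, pushed first
;  ax (output)    - a + b
; Assumed rule: ax, bx, cx, dx may be clobbered; bp, si, di are preserved.
add_words:
    push bp
    mov  bp, sp
    mov  ax, [bp+4]         ; skip saved bp (2 bytes) and near return address (2 bytes)
    add  ax, [bp+6]
    pop  bp
    ret                     ; arguments stay on the stack for the caller to remove

caller:
    mov  ax, 7
    push ax                 ; b, pushed first (right to left)
    mov  ax, 5
    push ax                 ; a, pushed last
    call add_words
    add  sp, 4              ; caller removes the two word arguments
    ; ax now holds 12
    ret
```

Because every function follows the same layout, the header comment only has to name the parameters, and the stack is balanced by one add sp after each call.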

old_timer answered Sep 20 '22 03:09