
Need guidance on understanding basic assembly

I've dabbled in and out of trying to get a grasp on how to do some simple programming in assembly. I am going over a tutorial hello world program and most of the stuff they have explained makes sense, but they are really glossing over it. I would like some help in understanding some different parts of the program. Here is their tutorial example -

section .text
    global main     ;must be declared for linker (ld)
main:               ;tells linker entry point
    mov edx,len     ;message length
    mov ecx,msg     ;message to write
    mov ebx,1       ;file descriptor (stdout)
    mov eax,4       ;system call number (sys_write)
    int 0x80        ;call kernel

    mov ebx,0       ;exit status 0 (otherwise ebx still holds 1 from above)
    mov eax,1       ;system call number (sys_exit)
    int 0x80        ;call kernel

section .data
msg db 'Hello, world!', 0xa  ;our dear string
len equ $ - msg              ;length of our dear string

There is the text section and the data section. The data section seems to hold our user defined info for the program. It looks like the "frame" of the program is in the text section and the "meat" is in the data section... ? I assume the program when compiled executes the text section with data from the data section filled into the text section? The bss/text/data section interaction is kind of foreign to me. Also in the data section where the msg and len.... variables? are mentioned, they are followed by some information I'm not sure what to make of. msg is followed by db, what does this mean? Then the text, and then 0xa, what is the 0xa for? Also len is followed by equ, does this mean equals? len equals dollarsign minus msg variable? What is the dollar sign? A sort of operator? Also the instructions in the text section, mov ebx,1 apparently, or seems to tell the program to utilize STDOUT? Is moving 1 to the ebx register a standard instruction for setting stdout?

Perhaps someone has a little more thorough tutorial to recommend? I am looking to get dirty with assembly and need to teach myself some of the... "core fundamentals" if you will. Thanks for all the help!

0xhughes asked Jun 12 '13 20:06



1 Answer

[NB - I don't know what assembler dialect you're using, so I just took some "best guesses" at some parts of this stuff. If someone can help clarify, that would be great.]

It looks like the "frame" of the program is in the text section and the "meat" is in the data section... ?

The text section contains the executable instructions that make up your program. The data section contains the data that the program operates on. There are two separate sections so that the program loader and operating system can give you some protections. The text section can be loaded into read-only memory, for example, and the data section can be loaded into memory marked as "non-executable", so code isn't accidentally (or maliciously) executed from that region.

I assume the program when compiled executes the text section with data from the data section filled into the text section?

The program (instructions in the text section) normally references symbols and manipulates data in the data section, if that's what you're asking.

The bss/text/data section interaction is kind of foreign to me.

The BSS section is similar to the data section, except it's all zero-initialized. That means it doesn't need to actually take up space in the executable file. The program loader just has to make an appropriately sized block of zero bytes in memory. Your program doesn't have a BSS section.
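For illustration only, here is a rough sketch of a program that does have a BSS section. It assumes NASM syntax on 32-bit Linux; the buffer name, its 64-byte size, the _start entry point, and the build command in the comment are my own choices, not part of the tutorial. It reads one chunk from stdin into the reserved buffer and writes it back out.

```nasm
; Assumed build: nasm -f elf32 echo.asm && ld -m elf_i386 -o echo echo.o
section .bss
buffer  resb 64         ;reserve 64 bytes; nothing is stored in the file for them

section .text
    global _start
_start:
    mov eax,3           ;system call number (sys_read)
    mov ebx,0           ;file descriptor (stdin)
    mov ecx,buffer      ;where to put the bytes
    mov edx,64          ;at most 64 bytes
    int 0x80            ;eax = number of bytes read, or a negative error

    test eax,eax
    jle .done           ;nothing read (or error): skip the write

    mov edx,eax         ;write back exactly as many bytes as were read
    mov ecx,buffer
    mov ebx,1           ;file descriptor (stdout)
    mov eax,4           ;system call number (sys_write)
    int 0x80

.done:
    mov eax,1           ;system call number (sys_exit)
    xor ebx,ebx         ;exit status 0
    int 0x80
```

The resb line only records a size, which is why the executable does not grow by 64 bytes the way it would with 64 bytes of db data.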

Also in the data section where the msg and len.... variables? are mentioned, they are followed by some information I'm not sure what to make of. msg is followed by db, what does this mean?

msg and len are variables of a sort, yes, though neither is a variable in the high-level-language sense. msg is a label: a name for the address of the first byte of the string that follows. db means "define byte" and tells the assembler to just emit the literal bytes that follow. len is a constant set to the length of the string (more below).
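To make "emit the literal bytes" concrete, here is a small data-only fragment, assuming NASM syntax (the label names are made up for the example). Each line's comment lists the bytes the assembler would place in the output.

```nasm
section .data
text_bytes  db 'Hi'            ;emits 0x48 0x69
hex_bytes   db 0x48, 0x69      ;emits the same two bytes
mixed       db 'Hi', 0xa       ;strings and numbers can be mixed: 0x48 0x69 0x0a
```

A quoted string is just shorthand for a list of character codes, which is why the tutorial can append 0xa to 'Hello, world!' with a comma.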

Then the text, and then 0xa, what is the 0xa for?

0xa (the same value as 0x0a, or 10 in decimal) is the ASCII code for the newline (line feed) character, so the message ends with a line break.

Also len is followed by equ, does this mean equals?

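Yes, more or less. equ gives the name len a constant value that the assembler works out while assembling; no memory is reserved for it.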

len equals dollarsign minus msg variable? What is the dollar sign? A sort of operator?

The $ means "the current location". As the assembler is going about its job, it keeps track of how many bytes of data and code it's generated in a counter. So this code is saying: "subtract the location of the msg label from the current location and store that number as len". Since the "current location" is just past the end of the string, you get the length there.
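A small fragment may help here. It assumes NASM syntax, and the extra names (pad, late) are invented for the example. It shows why the equ line has to come directly after the data it measures.

```nasm
section .data
msg   db 'Hi', 0xa     ;3 bytes: 'H', 'i', newline
len   equ $ - msg      ;$ is just past the newline, so len = 3
pad   db 0             ;one more byte moves the current location forward
late  equ $ - msg      ;now 4: this counts pad as well as the string
```

Because len is computed by the assembler, changing the text of msg updates the length automatically.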

Also the instructions in the text section, mov ebx,1 apparently, or seems to tell the program to utilize STDOUT? Is moving 1 to the ebx register a standard instruction for setting stdout?

The program makes a system call with the int 0x80 instruction. Before that, it has to set up the registers the way the OS expects: here, a 1 in ebx to mean stdout, the message length in edx, a pointer to the message in ecx, and the system call number in eax. So mov ebx,1 is not a special instruction for "setting stdout"; it is an ordinary mov that supplies the file descriptor argument. I'd guess you're on Linux - you can look up a system call table online without too much trouble, I'm sure.
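As a sketch of how those registers act as arguments, here is a variant that writes to stderr instead. It assumes 32-bit Linux with NASM and ld; the _start entry point and the message text are my own choices. The only change needed to pick a different output stream is the value placed in ebx.

```nasm
; Assumed build: nasm -f elf32 err.asm && ld -m elf_i386 -o err err.o
section .text
    global _start
_start:
    mov eax,4       ;system call number (sys_write)
    mov ebx,2       ;file descriptor: 0 = stdin, 1 = stdout, 2 = stderr
    mov ecx,msg     ;pointer to the bytes to write
    mov edx,len     ;how many bytes to write
    int 0x80        ;call kernel

    mov eax,1       ;system call number (sys_exit)
    mov ebx,0       ;sys_exit takes its argument, the exit status, in ebx
    int 0x80        ;call kernel

section .data
msg db 'written to stderr', 0xa
len equ $ - msg
```

The same pattern holds for other calls on this interface: the call number goes in eax and the arguments go in ebx, ecx, edx in order.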

Perhaps someone has a little more thorough tutorial to recommend?

Sorry, not off the top of my head.

Carl Norum answered Oct 05 '22 23:10