
What is the difference between defining strings as bytes (db) and defining strings as words/double words (dw/dd) in x86?

I'm trying to understand the difference between the ways of defining labeled data in assembly. Here's an example:

ALabel: db 'Testing'
AAnotherLabel: dw 'Testing'

Now, let me load these into 32-bit registers:

mov eax, [ALabel]
mov ebx, [AAnotherLabel]

Upon investigation with gdb, I found that eax and ebx, and all of their sub-registers, contain the same values. Look here:

info register eax
0x64636261 //dcba

info register ebx
0x64636261 //dcba

They are the same!

In Jeff Duntemann's book (Assembly Language Step-by-Step: Programming with Linux), he shows an example of loading words and double words into registers, but for some reason he loads the offset (i.e. the address of the value), like so:

DoubleString: dd 'Stop'
mov edx, DoubleString

Investigating the contents of edx shows that it contains an address, presumably the address of the first four letters in the string, as opposed to the address of just the first, although I am speculating here.

I would like clarification of what is really happening here. Is this statement in fact loading the address of the first letter in the string into a register?

Fin: db 'Final'
mov ecx, Fin
William asked Mar 18 '23 08:03


2 Answers

You are talking about 2 different things here.

Difference between db, dw, dd
Jester already gave you the correct answer. Here are two examples from the NASM manual that should help you understand it.

When you use dw, storage is created in units of one word (2 bytes), so the total size is always a multiple of 2: 2, 4, 6, 8 bytes, and so on. In this example the string 'abc' needs only 3 bytes, but because you used dw it will be 4 bytes long. The fourth byte is filled with 0.

fin: dw 'abc'               ; 0x61 0x62 0x63 0x00 (string)

By using db instead of dw, you create storage in 1-byte units. This one will be 3 bytes long:

fin: db 'abc'               ; 0x61 0x62 0x63 (string)

They are called pseudo-instructions because they are actually commands to your assembler (in this case NASM) that tell it how to lay out your storage. They are not code that your processor has to execute. Source:
3.2.1: http://www.nasm.us/doc/nasmdoc3.html
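
To see the padding for yourself, here is a minimal sketch that lets NASM compute the size of each definition with `$ - label`. The label names and the file name sizes.asm are made up for illustration. Assembling it with a listing, for example `nasm -f bin -l sizes.lst sizes.asm`, should show the emitted bytes next to each line.

```nasm
; sizes.asm (hypothetical file name): data only, no code.
; Each *_len constant is the number of bytes NASM emitted for the string.

b_str:  db 'abc'            ; 0x61 0x62 0x63            -> 3 bytes
b_len   equ $ - b_str

w_str:  dw 'abc'            ; 0x61 0x62 0x63 0x00       -> padded to 4 bytes
w_len   equ $ - w_str

d_str:  dd 'abc'            ; 0x61 0x62 0x63 0x00       -> padded to 4 bytes
d_len   equ $ - d_str

d5_str: dd 'abcde'          ; 'abcde' plus three 0x00   -> padded to 8 bytes
d5_len  equ $ - d5_str

; Store the computed sizes so they are visible in the output: 03 04 04 08
sizes:  db b_len, w_len, d_len, d5_len
```

Note that the characters come out in the same order for db, dw and dd. Only the zero padding at the end differs, which is why loading the first four bytes of a longer string with mov eax, [label] gives the same value whichever one you used.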

Brackets and no brackets
The other thing you talked about is using brackets [] or not. That, again, concerns NASM's syntax. When you use no brackets, you tell NASM to work with the address. This will store the memory address in eax:

mov eax, fin

This will load the first 4 bytes stored at that memory address into eax:

mov eax, [fin]

About your last question:

DoubleString: dd 'Stop'
mov edx, DoubleString

DoubleString is the memory address where 'Stop' is stored, and that address is what gets saved in edx. Every address corresponds to one byte, so the address DoubleString points directly to the letter 'S'. The address DoubleString+1 points to the next byte, where the letter 't' is stored, and so on.

Source:
2.2.2: http://www.nasm.us/doc/nasmdoc2.html#section-2.2.2
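
As a rough illustration of both forms side by side, here is a small 32-bit Linux sketch built around the Fin example from the question. The file name demo.asm and the build commands are assumptions; the register values in the comments follow from the ASCII codes of 'Final' and x86 being little-endian.

```nasm
; demo.asm (hypothetical file name), 32-bit Linux, NASM syntax
; Assumed build: nasm -f elf32 demo.asm && ld -m elf_i386 -o demo demo.o

section .data
fin:    db 'Final'              ; bytes: 0x46 0x69 0x6E 0x61 0x6C

section .text
global _start
_start:
    mov   ecx, fin              ; no brackets: ecx = address of the 'F'
    mov   eax, [fin]            ; brackets: eax = 0x616E6946 ('F','i','n','a')
    movzx ebx, byte [fin]       ; ebx = 0x46, just the first byte, 'F'
    movzx edx, byte [ecx+1]     ; edx = 0x69, the byte at fin+1, 'i'

    mov   eax, 1                ; sys_exit
    xor   ebx, ebx              ; exit status 0
    int   0x80
```

Stepping through this in gdb and running info registers after each instruction should show ecx holding an address while eax, ebx and edx hold the string's bytes.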

Blechdose answered Mar 30 '23 00:03


The only difference is the storage size. dw will always use a multiple of 2 bytes, while dd will use a multiple of 4.

Yes, your last two examples load the address.

Jester answered Mar 29 '23 23:03