How are dw and dd different from db directives for strings?

Let's say I want to define an initialized string variable (in section .data) before my assembly program runs. The variable I chose to create is called Digits, and it is a string that contains all the hexadecimal digits.

Digits: db "0123456789ABCDEF"

I defined the variable with db, which means "define byte". Does this mean that the Digits variable is 8 bits long? That doesn't make sense to me, because:

Each character in the string is an ASCII character, therefore I will need 2 bytes for each character. In total, I would need 32 bytes for the whole string!

So what does it mean when I define the variable as a byte? A word? A double word? I don't see the difference. Because of my misunderstanding, it seems redundant to specify the data type for the string.

P.S.: This question didn't help me understand.

asked Aug 09 '16 20:08

Pichi Wuana

1 Answer

NASM answer, MASM is totally different

One of the answers on the linked question has a quote from the NASM manual's examples which does answer your question. As requested, I'll expand on it for all three cases (and correct the lower-case vs. upper-case ASCII encoding error!):

db   'ABCDE'     ; 0x41 0x42 0x43 0x44 0x45                (5 bytes)
dw   'ABCDE'     ; 0x41 0x42 0x43 0x44 0x45 0x00           (6 bytes, 3 words)
dd   'ABCDE'     ; 0x41 0x42 0x43 0x44 0x45 0x00 0x00 0x00 (8 bytes, 2 doublewords)
dq   'ABCDE'     ; 0x41 0x42 0x43 0x44 0x45 0x00 0x00 0x00 (8 bytes, 1 quadword)

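So the difference is that NASM pads the string with zeros out to a multiple of the element size when you use dw, dd, or dq instead of db. With db the element size is one byte, so nothing is padded: each ASCII character takes exactly one byte, and your Digits string occupies 16 bytes, not 1 and not 32. The directive sets the size of each element, not the size of the whole variable.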
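The padding rule can be modelled in a few lines of Python. This is only a sketch of the layout described above, not NASM itself: the helper name nasm_string_bytes is made up for illustration, and it assumes a pure-ASCII string.

```python
def nasm_string_bytes(text, element_size):
    """Bytes emitted for one quoted string under db/dw/dd/dq (sizes 1/2/4/8)."""
    raw = text.encode("ascii")           # one byte per ASCII character
    padding = -len(raw) % element_size   # zeros needed to reach a multiple
    return raw + bytes(padding)


for name, size in (("db", 1), ("dw", 2), ("dd", 4), ("dq", 8)):
    data = nasm_string_bytes("ABCDE", size)
    print(f"{name}: {data.hex(' ')} ({len(data)} bytes)")

# The Digits string from the question: 16 characters, so 16 bytes with db.
digits = nasm_string_bytes("0123456789ABCDEF", 1)
print(len(digits))
```

The byte counts it prints should match the comments in the listing above (5, 6, 8, and 8 bytes).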

According to @Jose's comment, some assemblers may use a different byte order for dd or dw string constants. In NASM syntax, the string is always stored in memory in the same order it appears in the quoted constant.

You can assemble this with NASM (e.g. into the default flat binary output) and use a tool such as hexdump -C to confirm the byte ordering and the amount of padding.

Note that this padding to the element size applies to each comma-separated element. So the seemingly-innocent dd '%lf', 10, 0 actually assembles like this:

;dd   '%lf',    10,        0
db    '%lf',0,  10,0,0,0,  0,0,0,0        ;; equivalent with db

Note the 0 before the newline; if you pass a pointer to this to printf, the C string is just "%lf", terminated by the first 0 byte.

(A write system call or an fwrite call with an explicit length would print the whole thing, including the 0 bytes, because those functions work on binary data, not on C implicit-length strings.)
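To make the per-operand padding concrete, here is a small Python model of dd '%lf', 10, 0. It is a sketch under the rules stated above, not output captured from NASM; dd_operands is a hypothetical helper, and splitting at the first 0 byte stands in for how a C function like printf finds the end of the string.

```python
def dd_operands(*operands):
    """Model NASM's dd: every operand is padded to 4 bytes on its own."""
    out = bytearray()
    for op in operands:
        if isinstance(op, str):
            raw = op.encode("ascii")
            out += raw + bytes(-len(raw) % 4)   # zero-pad this string only
        else:
            out += op.to_bytes(4, "little")     # integers fill one dword
    return bytes(out)


data = dd_operands("%lf", 10, 0)
print(data.hex(" "))

# What printf would see: everything up to the first 0 byte.
c_string = data.split(b"\x00", 1)[0]
print(c_string)

# What write/fwrite with an explicit length would emit: every byte.
print(len(data))
```

The newline byte (10) sits after the first 0, so it is never reached when the data is treated as a C string.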

Also note that in NASM, you can do stuff like mov dword [rdi], "abc" to store "abc\0" to memory. i.e. multi-character literals work as numeric literals in any context in NASM.
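As a rough check of what "numeric literal" means here, the following Python sketch computes the 32-bit value whose little-endian bytes are "abc" followed by a zero. It only models the arithmetic; it assumes the NASM behaviour described above (first character in the lowest-addressed byte).

```python
# The four bytes that `mov dword [rdi], "abc"` is described as storing.
in_memory = b"abc\x00"

# x86 is little-endian, so the first byte is the least significant one.
value = int.from_bytes(in_memory, "little")
print(hex(value))

# Storing that dword back to memory reproduces the source-order string.
stored = value.to_bytes(4, "little")
print(stored)
```

So in this model the immediate is 0x00636261: 'a' (0x61) ends up in the low byte, which is why the characters land in memory in source order.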

MASM is very different

See "When using the MOV mnemonic to load/copy a string to a memory register in MASM, are the characters stored in reverse order?" for more. Even in a dd "abcd", MASM breaks your strings, reversing the byte order inside chunks compared to source order.
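The contrast can be sketched in Python. This is an illustration based only on the statement above that MASM reverses the byte order within each chunk; it assumes MASM treats the first character as the most significant byte of the number, and both function names are hypothetical.

```python
def nasm_dd_string(text):
    """NASM model: bytes land in memory in source order."""
    return text.encode("ascii")


def masm_dd_string(text):
    """MASM model (assumption): first character is the most significant byte."""
    value = int.from_bytes(text.encode("ascii"), "big")
    return value.to_bytes(4, "little")   # x86 stores the low byte first


print(nasm_dd_string("abcd"))
print(masm_dd_string("abcd"))
```

Under that assumption, the same dd "abcd" source line reads back as "abcd" from NASM output and as "dcba" from MASM output.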

answered Nov 13 '22 18:11

Peter Cordes

Tags: string, x86, assembly, nasm, masm