
Printing Hexadecimal Digits with Assembly [duplicate]

I'm trying to learn NASM assembly, but I'm struggling with something that seems simple in high-level languages.

All of the textbooks which I am using discuss using strings -- in fact, that seems to be one of their favorite things. Printing hello world, changing from uppercase to lowercase, etc.

However, I'm trying to understand how to increment and print hexadecimal digits in NASM assembly and don't know how to proceed. For instance, if I want to print the numbers 1 through n in hex, how would I do so without the use of C libraries (which all the references I have been able to find use)?

My main idea would be to have a variable in the .data section which I would continue to increment. But how do I extract the hexadecimal value from this location? I seem to need to convert it to a string first...?

Any advice or sample code would be appreciated.

BSchlinker asked Dec 28 '22 06:12



2 Answers

First write a simple routine which takes a nybble value (0..15) as input and outputs a hex character ('0'..'9','A'..'F').

Next write a routine which takes a byte value as input and then calls the above routine twice to output 2 hex characters, i.e. one for each nybble.

Finally, for an N byte integer you need a routine which calls this second routine N times, once for each byte.

You might find it helpful to express this in pseudo code or an HLL such as C first, then think about how to translate this into asm, e.g.

#include <stdint.h>
#include <stdio.h>

void print_nybble(uint8_t n)
{
    if (n < 10) // handle '0' .. '9'
        putchar(n + '0');
    else // handle 'A'..'F'
        putchar(n - 10 + 'A');
}

void print_byte(uint8_t n)
{
    print_nybble(n >> 4); // print hi nybble
    print_nybble(n & 15); // print lo nybble
}

void print_int16(uint16_t n)
{
    print_byte(n >> 8); // print hi byte
    print_byte(n & 255); // print lo byte
}
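
If you would rather build the string first, as the question suggests, the same nybble-at-a-time idea works with a lookup table in place of the comparison. This is only a sketch: u16_to_hex and the digits table are made-up names, and the caller is assumed to supply a buffer of at least 5 bytes.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the routines above): write the four hex
   digits of a 16-bit value into buf, most significant nybble first, followed
   by a terminating NUL. Assumes buf has room for at least 5 bytes. */
void u16_to_hex(uint16_t n, char *buf)
{
    static const char digits[] = "0123456789ABCDEF";

    for (int i = 0; i < 4; i++) {
        int shift = 12 - 4 * i;              /* 12, 8, 4, 0 */
        buf[i] = digits[(n >> shift) & 0xF]; /* index the table by nybble */
    }
    buf[4] = '\0';
}
```

With n = 0x1234 the buffer ends up holding "1234", which can then be handed to whatever output routine is available in one go.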
Paul R answered Dec 31 '22 15:12



Is this a homework assignment?

Bits is bits. Bit, byte, word, double word: these are hardware terms, things an instruction set or assembler is going to reference. Hex, decimal, octal, unsigned, signed, string, character, etc. are manifestations of programming languages. Likewise .text, .bss, .data, etc. are manifestations of software tools; the instruction set doesn't care whether an address is in .data or in .text, it is the same instruction either way. There are reasons why all of these programming language things exist, sometimes very good reasons, but don't let them confuse you when trying to solve this problem.

To convert from bits to human-readable ASCII, you first need to know your ASCII table and the bitwise operations: and, or, logical shift, arithmetic shift, etc. Plus load and store and other things.

Think mathematically about what it takes to get from some number in a register or memory to ASCII hex. Say 0x1234, which is 0b0001001000110100. For a human to read it, yes, you need to get it into a string, for lack of a better term, but you don't necessarily need to store four characters plus a null in adjacent memory locations in order to do something with it. It depends on your output function. Normally character-based output boils down to a single output_char() of some sort, called many times.

You could convert to a string, but that is more work. Instead, as you compute each ASCII character, call some sort of single-character output function right then. putchar() is an example of a function that outputs one byte-sized character at a time.
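
To make that concrete, here is a rough C sketch that pushes 0x1234 out one character at a time without ever building a string. The output_char() here is only a stand-in that appends to a capture buffer (captured and captured_len are made-up names); on real hardware it would be a system call, a UART write, or similar. show_hex16 is likewise a hypothetical name.

```c
#include <stdint.h>

/* Stand-in for the single-character output routine; it appends to a small
   buffer so the sketch needs no library. Replace with real output. */
static char captured[64];
static unsigned captured_len;

void output_char(unsigned char c)
{
    if (captured_len < sizeof captured - 1) {
        captured[captured_len++] = (char)c;
        captured[captured_len] = '\0';
    }
}

/* Emit the four hex digits of x, most significant first, as each one is
   computed. No intermediate string is stored by this function. */
void show_hex16(uint16_t x)
{
    for (int shift = 12; shift >= 0; shift -= 4) {
        unsigned nybble = (x >> shift) & 0xFu;
        output_char((unsigned char)(nybble <= 9 ? nybble + 0x30
                                                : nybble + 0x37));
    }
}
```

Calling show_hex16(0x1234) results in four calls to output_char(), with '1', '2', '3' and '4' in that order.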

So for binary you want to examine one bit at a time and create a 0x30 or 0x31. For octal, 3 bits at a time and create 0x30 to 0x37. Hex is based on 4 bits at a time.
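
The grouping step on its own, before any ASCII is involved, might be sketched in C like this. split_digits is a made-up helper, and it assumes 1 <= bits_per_digit <= 8 and total_bits <= 32; it only produces the numeric value of each digit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split the low total_bits of x into groups of
   bits_per_digit bits (1 = binary, 3 = octal, 4 = hex), most significant
   group first, storing each group's numeric value in out.
   Returns the number of digits stored.
   Assumes 1 <= bits_per_digit <= 8 and 1 <= total_bits <= 32. */
size_t split_digits(uint32_t x, unsigned total_bits, unsigned bits_per_digit,
                    uint8_t *out)
{
    /* Round up so a partial top group still gets a digit. */
    unsigned ndigits = (total_bits + bits_per_digit - 1) / bits_per_digit;
    uint32_t mask = ((uint32_t)1 << bits_per_digit) - 1;

    /* Ignore any bits above total_bits. */
    if (total_bits < 32)
        x &= ((uint32_t)1 << total_bits) - 1;

    for (unsigned i = 0; i < ndigits; i++) {
        unsigned shift = (ndigits - 1 - i) * bits_per_digit;
        out[i] = (uint8_t)((x >> shift) & mask);
    }
    return ndigits;
}
```

For example, 0x1234 taken as 16 bits in groups of 4 gives the digit values 1, 2, 3, 4, and 0x1FF taken as 9 bits in groups of 3 gives 7, 7, 7.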

Hex has the problem that the 16 characters we want to use are not found adjacent to each other in the ASCII table. So you use 0x30 to 0x39 for 0 to 9 but 0x41 to 0x46 or 0x61 to 0x66 for A to F, depending on your preference or requirements. So for each nybble you might AND with 0xF, compare with 9, and ADD 0x30 if the value is 9 or less or 0x37 if it is greater (10+0x37 = 0x41, 11+0x37 = 0x42, etc).
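
In C, that AND / compare / ADD sequence could look roughly like the following. nybble_to_ascii and its lowercase flag are made up for illustration; the 0x57 offset for lowercase follows from 10+0x57 = 0x61.

```c
/* Hypothetical helper: map a value 0..15 to its ASCII hex digit.
   Only the low four bits of v are used. */
unsigned char nybble_to_ascii(unsigned v, int lowercase)
{
    v &= 0xF;                                /* AND with 0xF */
    if (v <= 9)                              /* compare with 9 */
        return (unsigned char)(v + 0x30);    /* 0x30..0x39 are '0'..'9' */

    /* 10 + 0x37 = 0x41 ('A'); 10 + 0x57 = 0x61 ('a') */
    return (unsigned char)(v + (lowercase ? 0x57 : 0x37));
}
```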

Converting from bits in a register to an ASCII representation of binary: if the bit was a 1, show a 1 (0x31 in ASCII); if the bit was a 0, show a 0 (0x30 in ASCII).

void showbin ( unsigned char x )
{
    unsigned char ra;

    for(ra=0x80;ra;ra>>=1)
    {
        if(ra&x) output_char(0x31); else output_char(0x30);
    }
}

It may seem logical to use unsigned char above, but unsigned int, depending on the target processor, could produce much better (cleaner/faster) code. But that is another topic.

The above could look something like this in assembler (intentionally NOT using x86):

 ...
 mov r4,r0
 mov r5,#0x80
top:
 tst r4,r5
 moveq r0,#0x30
 movne r0,#0x31
 bl output_char
 mov r5,r5, lsr #1
 cmp r5,#0
 bne top
 ...

Unrolled is easier to write and is going to be a bit faster; the tradeoff is more memory used:

 ...
 tst    r4, #0x80
 moveq  r0, #0x30
 movne  r0, #0x31
 bl output_char
 tst    r4, #0x40
 moveq  r0, #0x30
 movne  r0, #0x31
 bl output_char
 tst    r4, #0x20
 moveq  r0, #0x30
 movne  r0, #0x31
 bl output_char
 ...

Say you had 9 bit numbers and wanted to convert to octal. Take three bits at a time (remember humans read left to right so start with the upper bits) and add 0x30 to get 0x30 to 0x37.

...
mov r4,r0
mov r0,r4,lsr #6
and r0,r0,#0x7
add r0,r0,#0x30
bl output_char
mov r0,r4,lsr #3
and r0,r0,#0x7
add r0,r0,#0x30
bl output_char
and r0,r4,#0x7
add r0,r0,#0x30
bl output_char
...
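
For comparison, the same three steps written in C might look like this. It is a sketch under the same assumptions as the assembler: output_char() is a stand-in (here it just appends to a capture buffer with made-up names), and show_octal9 is a hypothetical name.

```c
/* Stand-in for the single-character output routine; it appends to a small
   buffer so the sketch needs no library. Replace with real output. */
static char captured[32];
static unsigned captured_len;

void output_char(unsigned char c)
{
    if (captured_len < sizeof captured - 1) {
        captured[captured_len++] = (char)c;
        captured[captured_len] = '\0';
    }
}

/* Emit a 9-bit value as three octal digits, upper bits first. */
void show_octal9(unsigned x)
{
    output_char((unsigned char)(((x >> 6) & 0x7) + 0x30)); /* bits 8..6 */
    output_char((unsigned char)(((x >> 3) & 0x7) + 0x30)); /* bits 5..3 */
    output_char((unsigned char)((x & 0x7) + 0x30));        /* bits 2..0 */
}
```

Octal never needs the compare step that hex does, because 0 to 7 all land in the contiguous range 0x30 to 0x37.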

A single (8 bit) byte in hex might look like:

...
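mov r4,r0            ; keep the byte in r4, r0 is reused for each character
mov r0,r4,lsr #4     ; high nybble
and r0,r0,#0xF
cmp r0,#9
addhi r0,r0,#0x37    ; 10..15 become 'A'..'F'
addls r0,r0,#0x30    ; 0..9 become '0'..'9'
bl output_char
and r0,r4,#0xF       ; low nybble
cmp r0,#9
addhi r0,r0,#0x37
addls r0,r0,#0x30
bl output_char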
...

Making a loop from 1 to N storing that value in memory and reading it from memory (.data), output in hex:

...
mov r4,#1
str r4,my_variable   ; start counting at 1
...
top:
ldr r4,my_variable   ; fetch the current value
mov r0,r4,lsr #4     ; high nybble
and r0,r0,#0xF
cmp r0,#9
addhi r0,r0,#0x37
addls r0,r0,#0x30
bl output_char
and r0,r4,#0xF       ; low nybble
cmp r0,#9
addhi r0,r0,#0x37
addls r0,r0,#0x30
bl output_char
...
ldr r4,my_variable
add r4,r4,#1
str r4,my_variable   ; store the incremented value
cmp r4,#7 ;say N is 7
bls top              ; repeat while value <= N, so N itself is printed
...
my_variable: .word 0

Saving to RAM is a bit of a waste if you have enough registers, although with x86 you can operate directly on memory and don't have to go through registers.
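
If it helps to see the whole 1 to N loop in an HLL before translating it, here is a rough C sketch. The names (count_in_hex, show_hex8, captured) are made up, output_char() is again only a stand-in that appends to a buffer, and N is assumed to fit in a byte. The newline between values is an addition for readability, not something the assembler above does.

```c
/* Stand-in for the single-character output routine; it appends to a small
   buffer so the sketch needs no library. Replace with real output. */
static char captured[128];
static unsigned captured_len;

void output_char(unsigned char c)
{
    if (captured_len < sizeof captured - 1) {
        captured[captured_len++] = (char)c;
        captured[captured_len] = '\0';
    }
}

/* Emit one byte as two hex digits, high nybble first. */
void show_hex8(unsigned char x)
{
    unsigned hi = (x >> 4) & 0xFu;
    unsigned lo = x & 0xFu;

    output_char((unsigned char)(hi <= 9 ? hi + 0x30 : hi + 0x37));
    output_char((unsigned char)(lo <= 9 ? lo + 0x30 : lo + 0x37));
}

/* Count from 1 to n inclusive, emitting each value in hex followed by a
   newline. Assumes n <= 255. */
void count_in_hex(unsigned n)
{
    for (unsigned i = 1; i <= n; i++) {
        show_hex8((unsigned char)i);
        output_char('\n');
    }
}
```

Here the counter lives in a local variable, which a compiler will normally keep in a register, in line with the point above about not needing to go through memory.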

x86 isn't the same as the above (ARM) assembler, so it is left as an exercise for the reader to work out the equivalent. The point is that it is the shifting, anding, and adding that matter; break the problem down into elementary steps and the instructions fall out naturally from there.

old_timer answered Dec 31 '22 15:12
