
Mac OS X 32-bit nasm assembly program using main and scanf/printf?

I have spent the entire day trying to get some simple programs compiled but so far very little luck. What I want to do is to compile and run programs written in nasm assembly.

I have upgraded to the latest nasm (v2.10.09). So let me just jump into code, since I do not know a lot about these things yet. Here is a chunk of assembly code that runs on Linux using the elf format and linked with gcc (comments are my understanding of what is going on):

bits 32
extern printf
global main

section .data
    message db "Hello world!", 10, 0

section .text
main:
    pushad                      ;push all registers on stack
    push dword message                  ;push the string on stack
    call printf                 ;call printf
    add esp, 4                  ;clear stack
    popad                       ;pop all registers back
    ret                     ;return to whoever called me

Nothing too big. However, how the hell am I supposed to get this to work on OS X? I can't even get it to compile/link in any way. If it compiles, I can't link it (something about i386 and x86_64, which can't be linked together; I understand that, but how do I fix it?). I have tried a dozen ways with no luck.

Furthermore, how can I use printf and scanf in OS X assembly?

Here is another futile attempt of a scanf and printf the value back (this one actually compiles and links - and even runs!):

[bits 32] ; why the []?

section .data
    input_string    db  "Enter limit: %d", 0
    output_string   db  "Value %d", 10, 0
    counter         dd  10
    limit           dd  0

;nasm -f macho -o test.o test.asm 
;ld -lc -o test -arch i386 test.o -macosx_version_min 10.7

section .text

global start
extern _printf
extern _scanf
extern _exit

start:
    push ebp                ;push stack base
    mov ebp, esp            ;base is now current top
    and esp, 0xFFFFFFF0     ;align the stack - WHY? I just googled this?

    sub esp, 16             ;16 bytes for variables - 16 to keep the stack "aligned". WHY?

    mov dword[esp], input_string         ;ordinary push just isint good nuff for mac... WHY?
    mov dword[esp + 4], limit
    call _scanf                          ;does scan something but doesnt print the message, it just scans and then prints the message

    mov eax, [limit]                      ;since i cant push this lets put it in eax first


    mov dword[esp + 8], output_string     ;push the output string. WHY again MOV?
    mov dword[esp + 12], eax              ;and the previusly "scanned" variable
    call _printf                          ;print it out

    mov dword[esp], 0       ;return value
    call _exit              ;return

Compiled it with: nasm -f macho -o test.o test.asm and linked it with ld -lc -o test -arch i386 test.o -macosx_version_min 10.7. Doesn't work properly. On Linux it's super easy to do this scanf and printf thingie. What's up here? Can it be done simpler?

I do not want to add more stuff to this question since people sometimes see a big question and think "meh, too long, won't read". But if anyone requests more info I'll do my best.

Please help me since I can't figure it out.

EDIT The first one compiles using nasm -f macho -o out.o test.asm but doesn't link using gcc -o test out.o or by using ld -lc -o test -arch i386 out.o -macosx_version_min 10.7, and appending the flag -arch i386 doesn't solve it either. I would love it if I could write that "Linux-like" assembly, since then I would not have to worry about stack alignment and stuff like that. The gcc error says:

ld: warning: ignoring file out.o, file was built for i386 which is not the architecture being linked (x86_64): out.o
Undefined symbols for architecture x86_64:
  "_main", referenced from:
      start in crt1.10.6.o
ld: symbol(s) not found for architecture x86_64

and ld error is as follows:

Undefined symbols for architecture i386:
  "printf", referenced from:
      main in out.o
  "start", referenced from:
     -u command line option
ld: symbol(s) not found for architecture i386

Please help.

asked Nov 19 '13 21:11 by Majster



1 Answer

You're asking a lot of questions about your code, and you really don't understand the assembly code that's there.

Firstly, because of the way you're writing your code, the main routine is going to be the entry point of a C-style program. Because of the way Mac OS X linkage works, you have to name it _main to match the symbol the linker looks for as the default program entry point when it pulls in /usr/lib/crt1.o to produce the executable (if you run nm on that file you'll see an entry like: U _main). Similarly, all the library routines start with a leading underscore, so you have to use that prefix if you want to call them.

Secondly, the Mac OS X calling convention requires the stack to be 16-byte aligned at every call, which means you have to make sure the stack pointer is suitably aligned at each call site. On entry to main you already know you're misaligned by 4 bytes, because the call into main pushed a return address onto the stack. This means that to make even a single call you have to move the stack down by 12 bytes (or 12 plus a multiple of 16) first.
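
To make the arithmetic concrete, here is a minimal sketch that does nothing except realign the stack and return. The file name and the addresses in the comments are made up for illustration; only the low four bits of each address matter.

```nasm
; align.asm - hypothetical file name; the addresses below are illustrative only
bits 32
global _main

section .text
_main:
    ; caller's esp just before `call _main`:   0xbffff7a0 (16-byte aligned)
    ; esp here, after the return address push: 0xbffff79c (aligned minus 4)
    sub esp, 12         ; esp = 0xbffff790, aligned again; calls are safe from here
    add esp, 12         ; esp = 0xbffff79c, back to the entry value
    mov eax, 0          ; return code 0
    ret
```

Any call placed between the sub and the add sees an aligned stack, provided nothing else moves esp in between.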

Armed with that piece of information, we're going to omit futzing around with the ebp, and just use esp exclusively for the purposes of the code.

This is assuming a prolog of:

bits 32
extern _printf
global _main

section .data
    message db "Hello world!", 10, 0

section .text
_main:

On entry into _main, realign the stack:

sub esp, 12

Next we store the address of the message into the address pointed to by esp:

mov dword[esp], message

Then we call printf:

call _printf

Then we restore the stack:

add esp, 12

Set the return code for main, and return:

mov eax, 0
ret

The ABI for Mac OS X uses eax for the routine's return value as long as it fits in a register. Once you've assembled and linked the code:

nasm -f macho -o test.o test.asm 
ld -o test -arch i386 test.o -macosx_version_min 10.7 -lc /usr/lib/crt1.o

It runs and prints the message, exiting with a 0.
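
For reference, the fragments above put together into one file look roughly like this. It is only the pieces from this answer in order, with comments added; test.asm is the file name used in the build commands.

```nasm
; test.asm - the fragments above assembled into one file (32-bit Mach-O)
bits 32
extern _printf
global _main

section .data
    message db "Hello world!", 10, 0

section .text
_main:
    sub esp, 12                 ; 4 (return address) + 12 = 16, so esp is aligned
    mov dword [esp], message    ; first and only argument to printf
    call _printf
    add esp, 12                 ; undo the adjustment so ret finds the return address
    mov eax, 0                  ; return code for main
    ret
```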

Next we're going to play around with your scanning and printing example.

Firstly, scanf only scans; you can't put a prompt in its format string. Literal text in a scanf format is matched against the input rather than printed, so you have to split the prompt from the scanning. We've already shown how to do the print, so what remains is the scanf.

Set up the variables in the data section:

scan_string     db  "%d", 0
limit           dd  0

First store the address of scan_string at [esp], then store the address of limit at [esp + 4], then call scanf:

mov dword[esp], scan_string
mov dword[esp + 4], limit
call _scanf

We now should have the value that we scanned stored in the limit memory location.

Next, to print the value, add the output format string to the data section:

output_string   db  "Value %d", 10, 0

Next we put the address of output_string on the stack:

mov dword[esp], output_string

Read the value stored at limit into the eax register and put it at [esp + 4], i.e. the second parameter for printf:

mov eax, [limit]
mov dword[esp + 4], eax
call _printf

Next, we're calling exit, so we have to store the exit code in the stack and invoke the _exit function - this is different from the simple print variant as we're actually invoking exit, rather than simply returning.

mov dword[esp], 0
call _exit
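
Assembled into one file, the scan-and-print steps might look like the sketch below. The prompt label and the scan.asm file name are introduced here for illustration and are not part of the original code. The sketch assumes the same 12-byte adjustment on entry as the first example (enough room for up to three dword arguments) and the same build commands, including /usr/lib/crt1.o, since the entry point is _main.

```nasm
; scan.asm - hypothetical file name; `prompt` is a label added for illustration
bits 32
extern _printf
extern _scanf
extern _exit
global _main

section .data
    prompt          db  "Enter limit: ", 0
    scan_string     db  "%d", 0
    output_string   db  "Value %d", 10, 0
    limit           dd  0

section .text
_main:
    sub esp, 12                         ; realign; also reserves 12 bytes for arguments

    mov dword [esp], prompt             ; the prompt is printed separately from the scan
    call _printf

    mov dword [esp], scan_string        ; scanf("%d", &limit)
    mov dword [esp + 4], limit
    call _scanf

    mov eax, [limit]                    ; load the scanned value
    mov dword [esp], output_string      ; printf("Value %d\n", limit)
    mov dword [esp + 4], eax
    call _printf

    mov dword [esp], 0                  ; exit(0); does not return
    call _exit
```

The prompt has no trailing newline, so whether it shows up before the program waits for input depends on stdio buffering; adding a newline to the prompt sidesteps the question.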

As for some of the questions:

Why the alignment?

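Because that's how Mac OS X does it: its IA-32 ABI requires the stack to be 16-byte aligned at the point of every function call, and library code is built on that assumption (for example, it may use SSE instructions that fault on misaligned stack data).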

Why isn't push good enough?

It is, but we aligned the stack at the start of the routine, and an aligned stack is a functioning stack. Every push and pop moves esp by 4 bytes, so by pushing and popping you're messing with the alignment. This is one of the purposes behind using the ebp register rather than the esp register.
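
As a sketch of the push-based alternative: push works as long as the padding is chosen so that esp is aligned at the call itself. Here 8 bytes of padding plus the 4-byte push add up to the same 12 bytes as before. It reuses the message and the build commands from the first example.

```nasm
bits 32
extern _printf
global _main

section .data
    message db "Hello world!", 10, 0

section .text
_main:
    sub esp, 8              ; padding: 8 bytes now + 4 bytes for the push = 12
    push dword message      ; argument for printf; esp is 16-byte aligned after this
    call _printf
    add esp, 12             ; drop the argument and the padding together
    mov eax, 0              ; return code for main
    ret
```

The cost of this style is that the padding has to be recomputed whenever the number of pushed arguments changes.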

If we were to use the ebp register, the function prolog would look like:

push ebp
mov ebp, esp
sub esp, 8 ; in this case to obtain alignment

and the function epilog would look like:

add esp, 8
pop ebp
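
Dropped into the hello-world program, that prolog and epilog would look roughly like this (a sketch under the same build assumptions as the first example):

```nasm
bits 32
extern _printf
global _main

section .data
    message db "Hello world!", 10, 0

section .text
_main:
    push ebp                    ; save caller's frame pointer; esp is now 8 bytes off alignment
    mov ebp, esp                ; establish our own frame
    sub esp, 8                  ; 8 more bytes brings esp back to a 16-byte boundary

    mov dword [esp], message    ; argument for printf
    call _printf

    add esp, 8                  ; release the local area
    pop ebp                     ; restore caller's frame pointer
    mov eax, 0                  ; return code for main
    ret
```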

You can put symmetric pusha/popa instructions in there as well, but if you're not using the registers, why complicate the stack?

A better overview of the 32-bit function calling mechanism is in the OS X Developer guide. The ABI function call guide gives far more detail on the hows of parameter passing and returning. It's based on the AT&T System V ABI for the i386, with a few listed differences:

  • Different rules for returning structures
  • The stack is 16-byte aligned at the point of function calls
  • Large data types (larger than 4 bytes) are kept at their natural alignment
  • Most floating-point operations are carried out using the SSE unit instead of the x87 FPU, except when operating on long double values. (The IA-32 environment defaults to 64-bit internal precision for the x87 FPU.)

answered Nov 15 '22 04:11 by Petesh