 

Using Assembly On Mac

I'm using a MacBook Pro with an Intel Core 2 Duo processor at 2.53 GHz, but I was told Mac users must follow AT&T syntax (which adds to my confusion since I am running Intel) and x86 (not sure what this means exactly).

So I need to get into assembly but am finding it very hard to even begin. Searches online show assembly code that varies greatly in syntax and I can't find any resources that explain basic assembly how-tos. I keep reading about registers and a stack but don't understand how to look at this. Can anyone explain/point me in the right direction? Take, for example, this code which is the only code I found to work:

.data
_mystring:  .ascii "Hello World\n\0"    #C expects strings to terminate with a 0.
.text
    .globl _foo
_foo:
    push    %ebp
    mov %esp,%ebp
    pushl   $_mystring  
    call    _myprint
    add $4,%esp
    pop %ebp
    ret

Very simple, but what is it saying? I am having a hard time understanding how this code does what it does. I know Java, PHP, and C, among other languages, but the steps and syntax of this aren't clear to me. Here's the main file to go with it:

#include <stdio.h>
void foo();
void myprint(char *s)
{printf("%s", s);}
int main(void)
{foo(); return 0;}

Also, there's this which just multiplies numbers:

.data
    .globl _cntr
_cntr:  .long 0
    .globl _prod
_prod:  .long 0
    .globl _x
_x: .long 0
    .globl _y
_y: .long 0
    .globl _mask
_mask:  .long 1
.text
    .globl _multiply
_multiply:
    push %ebp
    mov %esp,%ebp
    mov $0,%eax
    mov _x,%ebx
    mov _y,%edx
LOOP:
    cmp $0,%ebx
    je DONE
    mov %ebx,%ecx
    and $1,%ecx
    cmp $1,%ecx
    jne LOOPC
    add %edx,%eax
LOOPC:
    shr $1,%ebx
    shl $1,%edx
    jmp LOOP
DONE:
    pop %ebp
    ret

and the main.c to go with it:

#include <stdio.h>

extern int multiply();
extern int x, y;

int main()
{
    x = 34;
    y = 47;
    printf("%d * %d = %d\n", x, y, multiply());
}

And finally three small questions:

  1. What is the difference between .s and .h file names (I have both a main.c and main.h, which one is for what)?

  2. And why does assembly need a main.c to go with it/how does it call it?

  3. Can anyone recommend a good assembly IDE, like Eclipse is for Java or PHP?

Thanks to whoever answers (this is actually my first post on this site). I've been trying to figure this out for a few days, and every resource I have read just doesn't explain the assembly logic to me. They say what .data or .text does, but only someone who already knows how to "think assembly" would understand what they mean. Also, if anyone is around New York City and feels very comfortable with Assembly and C, I would love some private lessons. I feel there is a lot of potential with this language and would love to learn it.

Airon Zagarella asked Oct 21 '11 17:10



1 Answer

Assembly language is a category of programming languages which are closely tied to CPU architectures. Traditionally, there is a one-to-one correspondence between each assembly instruction and the resulting CPU instruction.
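To give a rough feel for that one-instruction-at-a-time style, here is a minimal C sketch in which each statement stands in for one instruction of the _foo routine from the question. The Machine struct, the 16-slot stack, the helper names, and the address 0x1000 are all made up for illustration; a real CPU moves esp by 4 bytes where this model moves it by one slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a stack of 16 four-byte slots that grows downward. */
enum { STACK_SLOTS = 16 };

typedef struct {
    uint32_t slot[STACK_SLOTS];
    uint32_t esp;   /* index of the slot currently on top of the stack */
    uint32_t ebp;   /* frame pointer */
} Machine;

/* push: move the stack pointer down one slot, then store the value there. */
static void push32(Machine *m, uint32_t value) {
    assert(m->esp > 0);
    m->esp -= 1;
    m->slot[m->esp] = value;
}

/* pop: read the top slot, then move the stack pointer back up. */
static uint32_t pop32(Machine *m) {
    assert(m->esp < STACK_SLOTS);
    return m->slot[m->esp++];
}

/* Steps through the body of _foo; returns 1 if esp and ebp end up
   exactly as they were on entry, which is what the caller relies on. */
int run_foo_model(void) {
    Machine m = { {0}, STACK_SLOTS, 0xBEEF };   /* 0xBEEF: made-up caller ebp */
    uint32_t entry_esp = m.esp;
    uint32_t caller_ebp = m.ebp;

    push32(&m, m.ebp);      /* push  %ebp        : save the caller's frame pointer */
    m.ebp = m.esp;          /* mov   %esp,%ebp   : start a new frame               */
    push32(&m, 0x1000);     /* pushl $_mystring  : argument (made-up address)      */
    /* call _myprint would push a return address here and ret would pop it */
    m.esp += 1;             /* add   $4,%esp     : discard the argument            */
    m.ebp = pop32(&m);      /* pop   %ebp        : restore the caller's frame      */

    return m.esp == entry_esp && m.ebp == caller_ebp;
}
```

Calling run_foo_model() returns 1, meaning the pushes and pops balance out. That balance is the main thing the push/add/pop lines in _foo are there to maintain.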

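There are also assembly pseudo-instructions (often called directives) which do not correspond to a CPU instruction, but instead affect the assembler or the generated code. .data and .text are pseudo-instructions: they select the section that the following lines are placed in, .data for initialized data and .text for executable code.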
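Since you know C, one way to picture .data and .text is to look at where a compiler would typically put the equivalent C declarations. This is only a sketch: the function names are invented here, and exact section placement depends on the compiler and platform.

```c
#include <string.h>

/* An initialized global: a compiler typically emits this into the data
   section, much like the _mystring line that follows .data in the question. */
char mystring[] = "Hello World\n";

/* Functions: their machine code goes into the text section, the same way
   _foo follows .text in the question. */
size_t mystring_length(void) {
    return strlen(mystring);   /* bytes up to, but not including, the 0 byte */
}

size_t mystring_storage(void) {
    return sizeof mystring;    /* includes the terminating 0 that C appends */
}
```

Here mystring_length() is 12 and mystring_storage() is 13. The extra byte is the terminator that the assembly version has to spell out by hand as \0, because .ascii does not add one.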

Historically, each CPU manufacturer implemented an assembly language as defined by their assembler, a source code translation utility. There have been thousands of specific assembly languages defined.

In modern times, it has been recognized that assembly languages share a lot of common features, particularly with respect to pseudo-instructions. The GNU Compiler Collection (GCC) and its assembler support a very wide range of CPU architectures, so they have evolved generic assembly features.

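x86 refers to the family of processors descended from the Intel 8086 (8088, 8086, 80186, 80286, 80386, 80486, 80586 aka Pentium, 80686 aka Pentium Pro and Pentium II, etc., along with the 8087 math coprocessor) and to the instruction set they share. Your Core 2 Duo belongs to this family, which is why x86 assembly applies to your Mac.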

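AT&T syntax is a notation style, not a property of the CPU. It is the default notation of the GNU assembler and of Apple's assembler, which is why it applies even on an Intel processor. A major feature is that instruction operands are written in the order from, to (source first, then destination), as was common historically, so mov %esp,%ebp copies esp into ebp. Intel syntax uses to, from operands. There are other differences as well: for example, AT&T syntax prefixes register names with % and immediate values with $.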
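To make the operand order concrete, the multiply loop from the question can be read line by line as C. This is a hand translation for illustration, not compiler output. The function name multiply_model is invented, the variables are named after the registers they stand for, and x and y are passed as parameters here instead of being read from the globals _x and _y.

```c
#include <stdint.h>

/* Shift-and-add multiplication, mirroring the assembly loop in the question.
   Each comment shows the AT&T instruction the statement corresponds to. */
uint32_t multiply_model(uint32_t x, uint32_t y) {
    uint32_t eax = 0;          /* mov $0,%eax    : eax = 0              */
    uint32_t ebx = x;          /* mov _x,%ebx    : ebx = x              */
    uint32_t edx = y;          /* mov _y,%edx    : edx = y              */
    while (ebx != 0) {         /* cmp $0,%ebx ; je DONE                 */
        uint32_t ecx = ebx;    /* mov %ebx,%ecx  : ecx = ebx            */
        ecx &= 1;              /* and $1,%ecx    : keep the lowest bit  */
        if (ecx == 1) {        /* cmp $1,%ecx ; jne LOOPC               */
            eax += edx;        /* add %edx,%eax  : eax = eax + edx      */
        }
        ebx >>= 1;             /* shr $1,%ebx    : move to the next bit */
        edx <<= 1;             /* shl $1,%edx    : double the addend    */
    }
    return eax;                /* the return value is left in %eax      */
}
```

With the values from the question's main.c, multiply_model(34, 47) gives 1598. In every two-operand line the value flows from the left operand into the right one, which is the opposite of how the same instruction would be written in Intel syntax.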

As for your many questions, here are some resources which will 1) overwhelm you, and 2) eventually provide all your answers:

  • assembly language overview
  • tutorials and resources
  • x86 instruction summary
  • comprehensive x86 architecture reference

Ordinarily, an introductory assembly language programming class is a full semester with plenty of hands-on work. It assumes you are familiar with the basics of computer architecture. It is reasonable to expect that understanding the above material will take 300-500 hours. Good luck!

wallyk answered Sep 30 '22 18:09
