Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What parts of this HelloWorld assembly code are essential if I were to write the program in assembly?

I have this short hello world program:

#include <stdio.h>

static const char* msg = "Hello world";

int main(){
    printf("%s\n", msg);
    return 0;
}

I compiled it into the following assembly code with gcc:

    .file   "hello_world.c"
    .section    .rodata
.LC0:
    .string "Hello world"
    .data
    .align 4
    .type   msg, @object
    .size   msg, 4
msg:
    .long   .LC0
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    andl    $-16, %esp
    subl    $16, %esp
    movl    msg, %eax
    movl    %eax, (%esp)
    call    puts
    movl    $0, %eax
    leave
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4"
    .section    .note.GNU-stack,"",@progbits

My question is: are all parts of this code essential if I were to write this program in assembly (instead of writing it in C and then compiling to assembly)? I understand the assembly instructions but there are certain pieces I don't understand. For instance, I don't know what .cfi* is, and I'm wondering if I would need to include this to write this program in assembly.

like image 720
Connor Avatar asked Sep 17 '16 18:09

Connor


People also ask

How hard is the Hello World program in Assembly?

The Hello World program is a lot harder in assembly than any high level programming language as it requires the programmer to define the program entirely such as specifying memory locations.

Does Hello world use Intel or AT&T assembly language?

Also, the code that we will write for the creation of the program will not link to a library and will not use any ELF interpreter. We will use both of these assemblers to show at assembly language level both Intel and AT&T syntax in the “Hello World” example.

How to write a macro in assembly language?

Writing a macro is another way of ensuring modular programming in assembly language. A macro is a sequence of instructions, assigned by a name and could be used anywhere in the program. In NASM, macros are defined with %macro and %endmacro directives. The macro begins with the %macro directive and ends with the %endmacro directive.

How to ensure modular programming in assembly language?

Writing a macro is another way of ensuring modular programming in assembly language. A macro is a sequence of instructions, assigned by a name and could be used anywhere in the program.


1 Answers

The absolute bare minimum that will work on the platform that this appears to be, is

        .globl main
main:
        pushl   $.LC0
        call    puts
        addl    $4, %esp
        xorl    %eax, %eax
        ret
.LC0:
        .string "Hello world"

But this breaks a number of ABI requirements. The minimum for an ABI-compliant program is

        .globl  main
        .type   main, @function
main:
        subl    $24, %esp
        pushl   $.LC0
        call    puts
        xorl    %eax, %eax
        addl    $28, %esp
        ret
        .size main, .-main
        .section .rodata
.LC0:
        .string "Hello world"

Everything else in your object file is either the compiler not optimizing the code down as tightly as possible, or optional annotations to be written to the object file.

The .cfi_* directives, in particular, are optional annotations. They are necessary if and only if the function might be on the call stack when a C++ exception is thrown, but they are useful in any program from which you might want to extract a stack trace. If you are going to write nontrivial code by hand in assembly language, it will probably be worth learning how to write them. Unfortunately, they are very poorly documented; I am not currently finding anything that I think is worth linking to.

The line

.section    .note.GNU-stack,"",@progbits

is also important to know about if you are writing assembly language by hand; it is another optional annotation, but a valuable one, because what it means is "nothing in this object file requires the stack to be executable." If all the object files in a program have this annotation, the kernel won't make the stack executable, which improves security a little bit.

(To indicate that you do need the stack to be executable, you put "x" instead of "". GCC may do this if you use its "nested function" extension. (Don't do that.))

It is probably worth mentioning that in the "AT&T" assembly syntax used (by default) by GCC and GNU binutils, there are three kinds of lines: A line with a single token on it, ending in a colon, is a label. (I don't remember the rules for what characters can appear in labels.) A line whose first token begins with a dot, and does not end in a colon, is some kind of directive to the assembler. Anything else is an assembly instruction.

like image 108
zwol Avatar answered Sep 28 '22 04:09

zwol