I am learning how a C file is compiled to machine code. I know I can generate assembly from gcc with the -S flag, however it also produces a lot of code to do with main() and printf() that I am not interested in at the moment.
Is there a way to get gcc or clang to "compile" a function in isolation and output the assembly?
I.e. get the assembly for the following C in isolation:
int add( int a, int b ) {
    return a + b;
}
There are two ways to do this for a specific object file:
1. The -ffunction-sections option to gcc instructs it to create a separate ELF section for each function in the source file being compiled.
2. The symbol table records the section name, start address and size of each function; these can be fed to objdump via the --start-address/--stop-address arguments.
The first example:
$ readelf -S t.o | grep ' .text.'
  [ 1] .text             PROGBITS         0000000000000000  00000040
  [ 4] .text.foo         PROGBITS         0000000000000000  00000040
  [ 6] .text.bar         PROGBITS         0000000000000000  00000060
  [ 9] .text.foo2        PROGBITS         0000000000000000  000000c0
  [11] .text.munch       PROGBITS         0000000000000000  00000110
  [14] .text.startup.mai PROGBITS         0000000000000000  00000180
This has been compiled with -ffunction-sections and there are four functions, foo(), bar(), foo2() and munch(), in my object file. I can disassemble them separately like so:
$ objdump -w -d --section=.text.foo t.o

t.o:     file format elf64-x86-64

Disassembly of section .text.foo:

0000000000000000 <foo>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   8b 3d 00 00 00 00       mov    0(%rip),%edi        # a <foo+0xa>
   a:   31 f6                   xor    %esi,%esi
   c:   31 c0                   xor    %eax,%eax
   e:   e8 00 00 00 00          callq  13 <foo+0x13>
  13:   85 c0                   test   %eax,%eax
  15:   75 01                   jne    18 <foo+0x18>
  17:   90                      nop
  18:   48 83 c4 08             add    $0x8,%rsp
  1c:   c3                      retq
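To try the -ffunction-sections approach yourself, a small source file with more than one function is enough. The sketch below is a hypothetical stand-in for t.c (the original source is not shown in this answer), so the function bodies are assumptions chosen only to give each function some code of its own.

```c
/* t.c: hypothetical stand-in; the answer's real t.c is not shown. */

/* With -ffunction-sections, GCC is expected to emit this into .text.foo */
int foo(int x) {
    return x * 2;
}

/* ... and this into .text.bar, so each can be disassembled on its own. */
int bar(int x) {
    return foo(x) + 1;
}

/*
 * Possible workflow, reusing the commands from the answer:
 *   gcc -c -ffunction-sections t.c -o t.o
 *   readelf -S t.o | grep ' .text.'
 *   objdump -w -d --section=.text.bar t.o
 */
```

The exact section list and instruction sequence will depend on your compiler version, target and optimization level.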
The other option can be used like this (nm dumps symbol table entries):
$ nm -f sysv t.o | grep bar
bar                 |0000000000000020|   T  |              FUNC|0000000000000026|     |.text
$ objdump -w -d --start-address=0x20 --stop-address=0x46 t.o --section=.text

t.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000020 <bar>:
  20:   48 83 ec 08             sub    $0x8,%rsp
  24:   8b 3d 00 00 00 00       mov    0(%rip),%edi        # 2a <bar+0xa>
  2a:   31 f6                   xor    %esi,%esi
  2c:   31 c0                   xor    %eax,%eax
  2e:   e8 00 00 00 00          callq  33 <bar+0x13>
  33:   85 c0                   test   %eax,%eax
  35:   75 01                   jne    38 <bar+0x18>
  37:   90                      nop
  38:   bf 3f 00 00 00          mov    $0x3f,%edi
  3d:   48 83 c4 08             add    $0x8,%rsp
  41:   e9 00 00 00 00          jmpq   46 <bar+0x26>
In this case, the -ffunction-sections option hasn't been used, hence the start offset of the function isn't zero and it's not in its own separate section (but in .text). The stop address 0x46 is simply the start address 0x20 plus the size 0x26 reported by nm.
Beware though when disassembling object files ...
This isn't exactly what you want, because, for object files, the call targets (as well as addresses of global variables) aren't resolved - you can't see here that foo calls printf, because the resolution of that on binary level happens only at link time. The assembly source would have the call printf in there though. The information that this callq is actually to printf is in the object file, but separate from the code (it's in the so-called relocation section that lists locations in the object file to be 'patched' by the linker); the disassembler can't resolve this.
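A minimal sketch for seeing this yourself: the function below calls printf, so its object file should contain a call instruction with a zeroed displacement plus a relocation entry naming printf. The function name show_sum and the file name reloc.c are hypothetical, chosen only for illustration. objdump's -r option prints relocation entries, and combined with -d it should show them next to the instructions they patch.

```c
#include <stdio.h>

/* reloc.c: hypothetical file and function names, for illustration only. */

/* Prints the sum followed by a newline and returns printf's result,
 * i.e. the number of characters written. */
int show_sum(int a, int b) {
    return printf("%d\n", a + b);
}

/*
 * Possible workflow:
 *   gcc -c reloc.c -o reloc.o
 *   objdump -d reloc.o       (call target appears unresolved)
 *   objdump -d -r reloc.o    (relocation entries listed alongside the code)
 */
```

The relocation type you see depends on the target and on whether position-independent code is enabled, so treat any particular name as platform-specific.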
The best way to go would be to copy your function into a single temp.c C file and compile it to assembly with the -S flag like this: gcc -S temp.c -o temp.s
It should produce tighter assembly code with no other distractions (except for the header and footer).
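As a minimal sketch of that approach, temp.c would contain nothing but the function from the question, with no main() and no #include lines. The compiler flags in the comment are suggestions, not part of the original answer.

```c
/* temp.c: only the function of interest, no main() and no headers. */
int add(int a, int b) {
    return a + b;
}

/*
 * Possible commands:
 *   gcc -S temp.c -o temp.s
 *   gcc -S -O2 -fno-asynchronous-unwind-tables temp.c -o temp.s
 * The second form is an assumption about what you may prefer: -O2 usually
 * gives shorter code, and -fno-asynchronous-unwind-tables typically removes
 * the .cfi_* directives from the listing.
 */
```

Keep the function non-static: the compiler is free to discard an unused static function, in which case no assembly for it may appear at all.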