Here is an example found via an assembly website. This is the C code:
int main()
{
    int a = 5;
    int b = a + 6;
    return 0;
}
Here is the associated assembly code:
(gdb) disassemble
Dump of assembler code for function main:
0x0000000100000f50 <main+0>: push %rbp
0x0000000100000f51 <main+1>: mov %rsp,%rbp
0x0000000100000f54 <main+4>: mov $0x0,%eax
0x0000000100000f59 <main+9>: movl $0x0,-0x4(%rbp)
0x0000000100000f60 <main+16>: movl $0x5,-0x8(%rbp)
0x0000000100000f67 <main+23>: mov -0x8(%rbp),%ecx
0x0000000100000f6a <main+26>: add $0x6,%ecx
0x0000000100000f70 <main+32>: mov %ecx,-0xc(%rbp)
0x0000000100000f73 <main+35>: pop %rbp
0x0000000100000f74 <main+36>: retq
End of assembler dump.
I can safely assume that this line of assembly code:
0x0000000100000f6a <main+26>: add $0x6,%ecx
correlates to this line of C:
int b = a + 6;
But is there a way to extract which lines of assembly are associated with a specific line of C code?
In this small sample it's not too difficult, but in larger programs and when debugging a larger amount of code it gets a bit cumbersome.
You can't deterministically convert assembly code back to C. Interrupts, self-modifying code, and other low-level constructs have no representation in C other than inline assembly, so an assembly-to-C conversion can only work to a limited extent.
The compiler and the assembler both matter here. The compiler translates the C source as a whole and is free to reorder and merge statements, whereas the assembler translates assembly instructions to machine code essentially one instruction at a time.
In Microsoft's C/C++ compiler, the __asm keyword invokes the inline assembler and can appear wherever a C or C++ statement is legal. It cannot appear by itself: it must be followed by an assembly instruction, a group of instructions enclosed in braces, or, at the very least, an empty pair of braces.
But is there a way to extract which lines of assembly are associated with a specific line of C code?
Yes, in principle: your compiler can probably do it (GCC's -fverbose-asm option, for example). Alternatively, objdump -lSd or similar will disassemble a program or object file with source and line number annotations where available, which in practice means it was built with debug information (for example GCC's -g).
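As a rough sketch of how those two options might be tried, the comment at the top of the file below lists the commands. The file name example.c and the function add_six are made up for illustration, and the exact annotations you get depend on your compiler and binutils versions.

```c
/* example.c (hypothetical file name, used only for this sketch)
 *
 * Build an object file with debug info so instructions can be mapped
 * back to source lines:
 *   gcc -g -O0 -c example.c -o example.o
 *
 * Disassemble with file/line markers (-l) and source text (-S):
 *   objdump -lSd example.o
 *
 * Or ask GCC to emit commented assembly instead of an object file:
 *   gcc -S -fverbose-asm example.c -o example.s
 */

/* Hypothetical function mirroring the arithmetic from the question. */
int add_six(int a)
{
    int b = a + 6;
    return b;
}
```

In the objdump output, each source line should appear just above the instructions generated for it, which makes the grouping visible without guessing.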
In general though, for a large optimized program, this can be very hard to follow.
Even with perfect annotation, you'll see the same source line mentioned multiple times as expressions and statements are split up, interleaved and reordered, and some instructions associated with multiple source expressions.
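A small function like the following is usually enough to see this effect. The name sum_scaled is an assumption made for this sketch. Compiled with something like gcc -g -O2 -c and viewed with objdump -lSd, the loop body line will typically be listed several times, because the load, multiply, add, and loop bookkeeping get scheduled apart from each other (and may be vectorized). The details vary by compiler and version.

```c
/* Hypothetical example: one source line inside the loop does a load,
 * a multiply, and an add, which an optimizer is free to split up. */
int sum_scaled(const int *v, int n, int k)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += v[i] * k;   /* often annotated at several places in -O2 output */
    return total;
}
```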
In that case, you have to reason about the relationship between your source and the assembly yourself, which takes some effort.
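For the small listing in the question, that reasoning can be written down by hand. The sketch below repeats the question's function body with each instruction from the gdb dump placed next to the statement it appears to implement. The function is renamed annotated_example (a made-up name) so it can sit alongside a real main, and the (void)b line is an addition that only silences an unused-variable warning. The reading of the -0x4(%rbp) store is an assumption based on how unoptimized builds commonly reserve a slot for main's return value.

```c
/* Hand-made mapping of the question's disassembly onto its C source.
 * Instruction text is copied from the gdb listing; the grouping is an
 * interpretation, not tool output. */
int annotated_example(void)
{                       /* push %rbp                 prologue            */
                        /* mov  %rsp,%rbp            prologue            */
                        /* movl $0x0,-0x4(%rbp)      presumably main's
                                                     return-value slot   */
    int a = 5;          /* movl $0x5,-0x8(%rbp)      store 5 into a      */
    int b = a + 6;      /* mov  -0x8(%rbp),%ecx      load a              */
                        /* add  $0x6,%ecx            add 6               */
                        /* mov  %ecx,-0xc(%rbp)      store result into b */
    (void)b;            /* not in the original; avoids an unused warning */
    return 0;           /* mov  $0x0,%eax            emitted early       */
}                       /* pop  %rbp ; retq          epilogue            */
```

Read this way, int b = a + 6; accounts for three instructions rather than only the add, and the return value is set up before either assignment runs.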
One of the best tools I've found for this is Matthew Godbolt's Compiler Explorer.
It features multiple compiler toolchains, auto-recompiles, and it immediately shows the assembly output with colored lines to show the corresponding line of source code.