
How to get the size of symbols in the symbol table of Mach-O file?

Tags:

mach-o

Before reading the mailing list, I was confused by the lack of a "size" field in the symbol table of a Mach-O file. I found the explanation in a source file posted in that e-mail, which notes:

// Mach-O symbol table does not have size in it,
// so need to scan ahead to find symbol with next highest address.

But when I parsed the symbol table out of a Mach-O file (I got it from the symtab_command and its nlist entries) and tried to calculate the size of a global symbol in the same way, I was confused again when I compared my result with the symbol table that dwarfdump prints (dwarfdump -ae). The end address of each symbol in the dwarfdump output differs from the one my program computes. Is there a problem with the symbol table I parsed, or is there some other way to work out the size?
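For readers who want to reproduce the parsing step, here is a minimal sketch of reading nlist entries with Python's `struct` module. It assumes a little-endian, thin (non-fat) Mach-O image already loaded into `data`, and that `symoff`, `nsyms` and `stroff` were taken from the symtab_command. The names `Nlist` and `read_symbols` are hypothetical helpers, not part of any library, and the demo image at the end is synthetic.

```python
import struct
from typing import NamedTuple


class Nlist(NamedTuple):
    n_strx: int   # offset of the symbol name in the string table
    n_type: int   # type flags (N_EXT, N_SECT, ...)
    n_sect: int   # 1-based section index, 0 means no section
    n_desc: int
    n_value: int  # for symbols defined in a section: the start address


# Field layouts of struct nlist (12 bytes) and struct nlist_64 (16 bytes).
# Little-endian is an assumption; a big-endian file would need ">".
NLIST_32 = struct.Struct("<IBBhI")
NLIST_64 = struct.Struct("<IBBHQ")


def read_symbols(data, symoff, nsyms, stroff, is_64=False):
    """Return (name, Nlist) pairs. Note that no field holds a size."""
    layout = NLIST_64 if is_64 else NLIST_32
    symbols = []
    for i in range(nsyms):
        entry = Nlist(*layout.unpack_from(data, symoff + i * layout.size))
        start = stroff + entry.n_strx
        end = data.index(b"\x00", start)  # names are NUL-terminated
        symbols.append((data[start:end].decode("ascii", "replace"), entry))
    return symbols


# Synthetic one-symbol image: an nlist entry followed by a string table.
entry_bytes = NLIST_32.pack(1, 0x0F, 1, 0, 0x0006D048)  # 0x0F = N_SECT | N_EXT
string_table = b"\x00_patch_lazy_pointers\x00"
demo = read_symbols(entry_bytes + string_table, symoff=0, nsyms=1,
                    stroff=len(entry_bytes))
```

Each entry carries a name offset, type, section index and start address, which is why the size has to be derived some other way.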

Some of the output from my program:

<start address>  <section index>  <method>
0x0006d030       1                ___arclite_objc_autoreleasePoolPop
0x0006d048       1                _patch_lazy_pointers
0x0006d1f0       1                ___arclite_objc_autoreleasePoolPush
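The "scan ahead to the next highest address" idea from the quoted comment can be sketched on these three entries as follows. This is only an illustration of the technique, not the code from MachONormalizedFileToAtoms.cpp; `sizes_by_next_symbol` is a hypothetical name, and the last symbol gets `None` because the end of its section is not shown in the listing.

```python
def sizes_by_next_symbol(symbols):
    """symbols: list of (address, section_index, name) tuples.

    The size of a symbol is taken as the distance to the next symbol with a
    higher address in the same section, or None if there is no such symbol.
    """
    ordered = sorted(symbols)
    sizes = {}
    for i, (addr, sect, name) in enumerate(ordered):
        size = None
        for next_addr, next_sect, _ in ordered[i + 1:]:
            if next_sect == sect and next_addr > addr:
                size = next_addr - addr
                break
        sizes[name] = size
    return sizes


symbols = [
    (0x0006D030, 1, "___arclite_objc_autoreleasePoolPop"),
    (0x0006D048, 1, "_patch_lazy_pointers"),
    (0x0006D1F0, 1, "___arclite_objc_autoreleasePoolPush"),
]
sizes = sizes_by_next_symbol(symbols)
# ___arclite_objc_autoreleasePoolPop -> 0x18 bytes (ends at 0x0006d048)
# _patch_lazy_pointers               -> 0x1a8 bytes (ends at 0x0006d1f0)
```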

The corresponding part of the output from dwarfdump:

0x0014a37b: [0x0006d030 - 0x0006d046) __arclite_objc_autoreleasePoolPop 
0x0014a122: [0x0006d048 - 0x0006d1ee) patch_lazy_pointers 
0x0014a3a0: [0x0006d1f0 - 0x0006d212) __arclite_objc_autoreleasePoolPush

So if I use the approach in "MachONormalizedFileToAtoms.cpp" to calculate the end address of a symbol (look ahead to find the symbol with the next highest address), the result will always differ from the dwarfdump output. Does anyone know how dwarfdump calculates it?
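To make the discrepancy concrete, the two listings can be compared directly. The sketch below uses only the addresses shown above; `trailing_gaps` is a hypothetical helper that reports, for each dwarfdump range, how many bytes lie between its end and the start of the next symbol.

```python
# Start addresses from the parsed symbol table.
symbol_starts = [0x0006D030, 0x0006D048, 0x0006D1F0]

# [low, high) ranges as printed by dwarfdump.
dwarf_ranges = {
    "__arclite_objc_autoreleasePoolPop": (0x0006D030, 0x0006D046),
    "patch_lazy_pointers": (0x0006D048, 0x0006D1EE),
    "__arclite_objc_autoreleasePoolPush": (0x0006D1F0, 0x0006D212),
}


def trailing_gaps(starts, ranges):
    """Bytes between each dwarfdump end address and the next symbol start."""
    ordered = sorted(starts)
    gaps = {}
    for name, (low, high) in ranges.items():
        later = [s for s in ordered if s > low]
        if later:  # the last symbol has no successor in this excerpt
            gaps[name] = later[0] - high
    return gaps


gaps = trailing_gaps(symbol_starts, dwarf_ranges)
# Both functions that have a successor end 2 bytes before the next symbol.
```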

Thank you!

Jalen chen asked Nov 23 '25 06:11



1 Answer

From the answer by Nick Kledzik:

The compiler often aligns functions to start at an aligned address (e.g. 8 or 16 bytes). So, there are padding bytes (usually NOPs) after the end of a function and before the start of the next function.

dwarfdump has access to the debug info which does have size info for functions. So dwarfdump can show the size of a function without the alignment padding at the end. Whereas the linker just looks at the next symbol address. There is not much point in the linker digging through the debug info to get a function’s true size, because when writing the output, the linker has to align the next function which would just add back the pad bytes.
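The alignment arithmetic described above can be checked against the addresses in the question. This is a small sketch, not anything dwarfdump or the linker actually runs: `align_up` is a hypothetical helper, and the 8-byte alignment is an assumption chosen because it reproduces the observed start addresses (a 16-byte alignment would not).

```python
def align_up(address, alignment):
    """Round address up to the next multiple of alignment (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)


# End addresses reported by dwarfdump (true function ends, without padding).
dwarf_ends = [0x0006D046, 0x0006D1EE]

# Assumed 8-byte function alignment: where would the next function start?
next_starts = [align_up(end, 8) for end in dwarf_ends]
# -> 0x0006d048 and 0x0006d1f0, the start addresses in the symbol table
```

Under that assumption the 2 bytes between each dwarfdump end address and the next symbol are exactly the padding the answer describes.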

I hope this helps others who have the same confusion.

Jalen chen answered Nov 26 '25 18:11
