Offset in nm symbol value?

Just to give you some context, here's what I'm trying to achieve: I am embedding a const char* in a shared object file in order to have a version string in the .so file itself. I am doing data analysis and this string enables me to let the data know which version of the software produced it. This all works fine.

The issue I am having is when I try to read the string out of the .so library directly. I tried to use

nm libSMPselection.so | grep _version_info

and get

000000000003d968 D __SMPselection_version_info

This is as expected (the char* is called _SMPselection_version_info). However, I expected to be able to open the file, seek to 0x3d968 and start reading my string, but all I get is garbage.

When I open the .so file and simply search for the contents of the string (I know how it starts), I can find it at address 0x2e0b4. At this address it's there, zero terminated and as expected. (I am using this method for now.)
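For reference, the search-based workaround described above could be scripted roughly as follows. This is only a sketch: the helper name find_c_string, the prefix and the sample bytes are made up for illustration, and a real run would load the library with open("libSMPselection.so", "rb").read() instead of using the synthetic blob.

```python
def find_c_string(data, prefix):
    """Return (offset, text) of the first zero-terminated string that starts with prefix."""
    start = data.find(prefix)
    if start < 0:
        raise ValueError("prefix not found in data")
    end = data.find(b"\x00", start)
    if end < 0:
        # No terminator found: fall back to the end of the data.
        end = len(data)
    return start, data[start:end].decode("ascii", errors="replace")


# Synthetic stand-in for the file contents (hypothetical version string).
blob = b"\x7f\x00\x01" + b"SMPselection v1.2\x00" + b"\xff\xff"

offset, text = find_c_string(blob, b"SMPselection")
print(hex(offset), text)
```

The returned offset is a plain file offset, which is why it need not match the value printed by nm.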

I am not a computer scientist. Could someone please explain to me why the symbol value shown by nm isn't correct or, put differently, what the symbol value is if it isn't the address of the symbol?

(By the way I am working on a Mac with OSX 10.7)

Simon asked May 03 '12 11:05


1 Answer

Assuming it's an ELF or similarly structured binary, you have to take into account the address at which the contents are loaded, which is determined by entries in the ELF headers.

Using objdump -Fd on your binary, you can have the disassembler also show the exact file offset of a symbol.

Using objdump -x you can find this loader address, usually 0x400000 for standard Linux executables.
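As a rough sketch of the arithmetic involved: a virtual address maps to a file offset by finding the LOAD segment that contains it, subtracting that segment's vaddr and adding its file offset. The function name and the tuple layout below are hypothetical, and the segment size 0x900000 is an invented value chosen only so that the sample address falls inside the segment; the offset 0x0, the vaddr 0x400000 and the address 0xccf4d0 are taken from the example in this answer.

```python
def vaddr_to_file_offset(vaddr, segments):
    """Map a virtual address to a file offset.

    segments is a list of (file_offset, segment_vaddr, file_size) tuples,
    one per LOAD line reported by objdump -x.
    """
    for file_offset, segment_vaddr, file_size in segments:
        if segment_vaddr <= vaddr < segment_vaddr + file_size:
            return vaddr - segment_vaddr + file_offset
    raise ValueError("address %#x is not inside any LOAD segment" % vaddr)


# One LOAD segment: off 0x0, vaddr 0x400000; the size is an assumption.
segments = [(0x0, 0x400000, 0x900000)]

print(hex(vaddr_to_file_offset(0xCCF4D0, segments)))
```

When the segment's file offset is 0, this reduces to simply subtracting the loader address.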

The next thing to check is whether it's an indirect string, that is, a pointer to the characters rather than the characters themselves; the easiest way to do this is with objdump -g. If the string is indirect, then at the position output by objdump -Fd you will find not the string but its address. From this address you need to subtract the loader address again. Let me show you an example from one of my binaries:

objdump -Fd BIN | grep VersionString
  45152f:       48 8b 1d 9a df 87 00    mov    0x87df9a(%rip),%rbx        # ccf4d0 <acVersionString> (File Offset: 0x8cf4d0)

objdump -x BIN
...
LOAD off    0x0000000000000000 vaddr 0x0000000000400000 paddr 0x0000000000400000 align 2**12
...

So we look at 0x8cf4d0 in the file and find in the hex editor:

008C:F4D0 D8 C1 89 00  00 00 00 00  01 00 00 00  FF FF FF FF

So we take the 0x89C1D8 stored there (the first eight bytes, read as a little-endian value), subtract 0x400000 to get 0x49c1d8, and when we look there in the hex editor we find:

0049:C1D0 FF FF 7F 7F  FF FF 7F FF  74 72 75 6E  6B 5F 38 30
0049:C1E0 34 33 00 00  00 00 00 00  00 00 00 00  00 00 00 00

Starting at 0x49c1d8, the bytes 74 72 75 6E 6B 5F 38 30 34 33 followed by a zero byte decode as ASCII to "trunk_8043".
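The whole lookup can be replayed on the bytes shown above. This is a minimal sketch, assuming a 64-bit little-endian pointer and the 0x400000 loader address from the objdump -x output; the variable names are made up, and the byte strings are copied from the two hex dumps rather than read from a real file.

```python
import struct

LOAD_VADDR = 0x400000  # vaddr of the LOAD segment, from objdump -x

# The 16 bytes found at file offset 0x8cf4d0.
pointer_bytes = bytes.fromhex("D8 C1 89 00 00 00 00 00 01 00 00 00 FF FF FF FF")

# The first 8 bytes are the address of the string, stored little-endian.
(string_vaddr,) = struct.unpack_from("<Q", pointer_bytes, 0)
string_offset = string_vaddr - LOAD_VADDR

# The two dump lines starting at file offset 0x49c1d0.
dump_base = 0x49C1D0
dump = bytes.fromhex(
    "FF FF 7F 7F FF FF 7F FF 74 72 75 6E 6B 5F 38 30"
    "34 33 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
)

# Read from the string's position up to the terminating zero byte.
start = string_offset - dump_base
end = dump.index(b"\x00", start)
version = dump[start:end].decode("ascii")

print(hex(string_vaddr), hex(string_offset), version)
```

On a real binary the two byte strings would come from seeking to the respective offsets in the file and reading from there.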

YMMV, especially when it's some other file format, but that is generally how these things are structured, with lots of warts and details that deviate in special cases.

PlasmaHH answered Sep 20 '22 23:09