
Is it possible to get the signature of a function in a shared library programmatically?

The title says it all: we can load a library with dlopen etc.

But how can I get the signature of functions in it?

Je Rog asked Jul 30 '11 05:07


2 Answers

This question cannot be answered in general. Technically, if you compile your binary with exhaustive debugging information (the code may still be an optimized release build), it will contain extra sections that provide some degree of reflection on the binary. On *nix systems (you referred to dlopen) this is implemented through DWARF debugging data in extra sections of the ELF binary. It works similarly for Mach-O universal binaries on Mac OS X.

Windows PE files, however, use a completely different format, so unfortunately DWARF is not truly cross-platform. (In the early development stages of my 3D engine I actually implemented an ELF/DWARF loader for Windows so that I could use a common format for the engine's various modules, so with some serious effort it can be done.)

If you don't want to implement your own loaders or debugging-information accessors, you can embed the reflection information through some extra exported symbols (following some standard naming scheme) that refer to a table mapping function names to their signatures. For C source files, writing a parser to extract this information from the source itself is rather trivial. C++, on the other hand, is so notoriously difficult to parse correctly that you need a full-fledged compiler to get it right. GCCXML was developed for this purpose: technically a GCC that emits the AST in XML form instead of an object binary. The emitted XML is then much easier to parse.

From the extracted information, create a source file with some kind of linked list, array, or similar structure describing each function. If you don't export each function's symbol directly but instead initialize a field in the reflection structure with the function pointer, you get a really nice and clean annotated exporting scheme. Technically you could place this information in a separate section of the binary, but putting it in the read-only data section does the job as well.
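A minimal sketch of such a reflection table in C might look like the following. All names here (`reflect_entry`, `reflect_table`, `reflect_find`, and the two example functions) are made up for illustration, and the signature strings are written by hand where a real build would generate them from the source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry per exposed function: name, textual signature, and address. */
struct reflect_entry {
    const char *name;       /* name the client looks up */
    const char *signature;  /* signature text, e.g. generated from the source */
    void (*fn)(void);       /* generic function pointer; cast back before calling */
};

/* Example functions. They are static, so they are reachable only via the table. */
static int add(int a, int b) { return a + b; }
static double scale(double x, double factor) { return x * factor; }

/* The one exported symbol (hypothetical name). Declaring it const lets the
 * toolchain place it in read-only data. */
const struct reflect_entry reflect_table[] = {
    { "add",   "int(int, int)",          (void (*)(void))add   },
    { "scale", "double(double, double)", (void (*)(void))scale },
    { NULL, NULL, NULL }  /* sentinel terminating the table */
};

/* Linear search by name; returns NULL if the function is not listed. */
const struct reflect_entry *reflect_find(const char *name)
{
    assert(name != NULL);
    for (const struct reflect_entry *e = reflect_table; e->name != NULL; ++e) {
        if (strcmp(e->name, name) == 0)
            return e;
    }
    return NULL;
}
```

A client that loaded the library would look up `reflect_table` once (for example with dlsym), read the `signature` string to decide how to call an entry, and cast `fn` back to the matching function pointer type before calling it.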


However, if you're given a third-party binary (worst case: compiled from C source, no debugging information, and all symbols not externally referenced stripped), you're pretty much screwed. The best you can do is apply some binary analysis to the way the function accesses the various places in which parameters can be passed.

This will only tell you the number of parameters and the size of each parameter value, but not the type or the name/meaning. When reverse engineering a program (e.g. for malware analysis or a security audit), identifying the type and meaning of the parameters passed to functions is one of the major efforts. Recently I came across a driver I had to reverse for debugging purposes, and you cannot believe how astounded I was to find C++ symbols in a Linux kernel module (you can't use C++ in the Linux kernel in a sane way), but also how relieved, because the C++ name mangling provided me with plenty of information.

datenwolf answered Sep 20 '22 18:09


On Linux (or Mac) you can use a combination of "nm" and "c++filt" (for C++ libraries):

nm mylibrary.so | c++filt

or

nm mylibrary.a | c++filt

"nm" gives you the symbols in their mangled form, and "c++filt" attempts to put them in a more human-readable format. You might want to use some nm options to filter down the results, especially if the library is large (or you can "grep" the final output to find a particular item). Note that this recovers parameter types only for C++ symbols; plain C symbols carry just the function name.
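To illustrate what a mangled name actually encodes, here is a toy decoder in C for a very small subset of the Itanium C++ ABI mangling used by GCC and Clang: a non-nested function name followed by one-letter builtin type codes, such as `_Z3addii`. The function name `toy_demangle` is made up for this sketch, and it rejects anything outside that subset (namespaces, pointers, templates, and so on). It is not a substitute for c++filt. Note also that the mangled name of an ordinary function carries the parameter types but not the return type.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Map a one-letter Itanium ABI builtin type code to its C++ spelling. */
static const char *builtin_name(char code)
{
    switch (code) {
    case 'v': return "void";
    case 'b': return "bool";
    case 'c': return "char";
    case 's': return "short";
    case 'i': return "int";
    case 'j': return "unsigned int";
    case 'l': return "long";
    case 'm': return "unsigned long";
    case 'f': return "float";
    case 'd': return "double";
    default:  return NULL;  /* not handled by this toy decoder */
    }
}

/*
 * Decode names of the form _Z<len><name><builtin codes>, e.g. "_Z3addii",
 * writing "add(int, int)" into out. Returns 0 on success and -1 for any
 * input outside this subset or if out is too small.
 */
int toy_demangle(const char *mangled, char *out, size_t outsz)
{
    assert(mangled != NULL && out != NULL);
    if (strncmp(mangled, "_Z", 2) != 0)
        return -1;
    const char *p = mangled + 2;

    /* Parse the decimal length prefix of the function name. */
    if (!isdigit((unsigned char)*p))
        return -1;
    size_t len = 0;
    while (isdigit((unsigned char)*p))
        len = len * 10 + (size_t)(*p++ - '0');
    if (strlen(p) <= len)  /* need the name plus at least one type code */
        return -1;

    int n = snprintf(out, outsz, "%.*s(", (int)len, p);
    if (n < 0 || (size_t)n >= outsz)
        return -1;
    size_t used = (size_t)n;
    p += len;

    /* Each remaining character is one parameter type. */
    for (int first = 1; *p != '\0'; ++p, first = 0) {
        const char *type = builtin_name(*p);
        if (type == NULL)
            return -1;
        if (*p == 'v' && first && p[1] == '\0')
            break;  /* a lone 'v' means an empty parameter list */
        n = snprintf(out + used, outsz - used, "%s%s", first ? "" : ", ", type);
        if (n < 0 || (size_t)n >= outsz - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, outsz - used, ")");
    if (n < 0 || (size_t)n >= outsz - used)
        return -1;
    return 0;
}
```

For real libraries, stick with c++filt, which handles the full mangling scheme.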

user3570671 answered Sep 20 '22 18:09