I've been trying to dump all of the imported API function calls for a set of PE files.
I have noticed that the majority of the PE files have a set of "weird"-looking imported functions. These greatly increase my number of unique function calls, even though I feel like a lot of them are the same function calls.
After further research, I found out that this is due to name mangling. I am now looking for a way to recover the original function names, both so they are more readable and because it might reduce my number of unique function calls. I would prefer to do this in Python rather than in C++, if that's possible.
Some examples of what I'm getting:
?underflow@?$basic_streambuf@DU?$char_traits@D@std@@@std@@MAEHXZ
?setbuf@?$basic_streambuf@DU?$char_traits@D@std@@@std@@MAEPAV12@PAD_J@Z
??0exception@@QAE@ABQBD@Z
??0exception@@QAE@ABQBDH@Z
??0exception@@QAE@ABV0@@Z
??1exception@@UAE@XZ
versus
RegDeleteValueW
RegEnumKeyExW
RegCloseKey
RegQueryValueExW
RegSetValueExW
Demangling C++ symbols is not easy in general. There are several mangling "styles" (for example, MSVC's scheme and the Itanium C++ ABI scheme used by GCC and Clang), plus other complexities.
One option is to use a command-line tool. On Windows there is undname; on *nix you can use nm, demangle, c++filt, and other utilities.
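Since you asked for Python, here is a minimal sketch of driving such a tool from a script. The helper names `looks_mangled` and `demangle` are made up for illustration. It assumes the chosen tool accepts the symbol as a command-line argument and prints only the demangled name, as c++filt does. As far as I know, undname prints extra text around its result, so its output would need additional parsing. Also note that c++filt understands Itanium-style names (starting with `_Z`), not the MSVC-style names from your examples (starting with `?`), so for those you would need a tool that handles MSVC decoration.

```python
import shutil
import subprocess


def looks_mangled(name: str) -> bool:
    """Rough heuristic: MSVC-decorated C++ names start with '?',
    Itanium ABI names start with '_Z'."""
    return name.startswith("?") or name.startswith("_Z")


def demangle(name: str, tool: str = "c++filt") -> str:
    """Return the demangled form of `name`, or `name` unchanged if the
    tool is missing, fails, or does not recognise the symbol.

    Assumption: `tool` takes the symbol as an argument and prints only
    the demangled name on stdout.
    """
    if not looks_mangled(name):
        return name  # plain C exports such as RegCloseKey need no work

    exe = shutil.which(tool)
    if exe is None:
        return name  # tool not installed; fall back to the raw name

    try:
        result = subprocess.run(
            [exe, name], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return name

    out = result.stdout.strip()
    return out if result.returncode == 0 and out else name


if __name__ == "__main__":
    for symbol in ["RegCloseKey", "_Z3foov", "??1exception@@UAE@XZ"]:
        print(symbol, "->", demangle(symbol))
```

Because the function falls back to the original name, you can run it over your whole import list and still get one entry per import. Spawning a process per symbol is slow for large sets, so batching the names into a single call would be a sensible next step.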
Another option is to use compiler code that implements demangling. LLVM, for instance, has a built-in Itanium ABI demangler. There should be something like that for GCC too.