Suppose I have two functions with the same parameter types and name (not in the same program):
std::string foo(int x) {
    return "hello";
}
int foo(int x) {
    return x;
}
Will they have the same mangled name once compiled?
Is the return type part of the mangled name in C++?
Name mangling is commonly used to facilitate the overloading feature and visibility within different scopes. The compiler generates function names with an encoding of the types of the function arguments when the module is compiled.
As mangling schemes aren't standardised, there's no single answer to this question; the closest thing to an actual answer would be to look at mangled names generated by the most common mangling schemes. To my knowledge, those are the GCC and MSVC schemes, in alphabetical order, so...
To test the scheme used by GCC (and Clang), we can use a simple program.
#include <string>
#include <cstdlib>
std::string foo(int x) { return "hello"; }
//int foo(int x) { return x; }
int main() {
    // Assuming executable file named "a.out".
    system("nm a.out");
}
Compile and run with GCC or Clang, and it'll list the symbols it contains. Depending on which of the functions is uncommented, the results will be:
// GCC:
// ----
std::string foo(int x) { return "hello"; } // _Z3fooB5cxx11i
// foo[abi:cxx11](int)
int foo(int x) { return x; } // _Z3fooi
// foo(int)
// Clang:
// ------
std::string foo(int x) { return "hello"; } // _Z3fooi
// foo(int)
int foo(int x) { return x; } // _Z3fooi
// foo(int)
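The demangled forms shown in the listing can also be reproduced from inside a program. The following is a minimal sketch that assumes a GCC or Clang toolchain, since it relies on abi::__cxa_demangle from <cxxabi.h>, which MSVC does not provide. The helper name demangle is hypothetical and not part of the original answer.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>  // GCC/Clang-specific header; not available with MSVC
#include <string>

// Hypothetical helper: turn an Itanium-ABI mangled name into readable form.
// Returns the input unchanged if the demangler reports a failure.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so the caller has to free it.
    char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) {
        return mangled;
    }
    std::string result(raw);
    std::free(raw);
    return result;
}
```

Calling demangle("_Z3fooi") yields foo(int): the parameter list is recovered, but no return type, because the mangled name never contained one.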
The GCC scheme contains relatively little information, not including return types:
_Z for "function".
3foo for ::foo.
i for int.
Despite this, however, they are different when compiled with GCC (but not with Clang), because GCC indicates that the std::string version uses the cxx11 ABI.
Note that it does still keep track of the return type, and make sure signatures match; it just doesn't use the function's mangled name to do so.
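To make the GCC-style encoding concrete, here is a toy encoder. It is only a sketch under strong assumptions: it handles global, non-template functions whose parameters are a handful of builtin types, and the name itanium_mangle_simple is hypothetical. Real compilers handle far more cases (namespaces, classes, templates, ABI tags, substitutions).

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: mangle a global, non-template function in the
// style used by GCC and Clang. Note there is no return-type argument,
// because for such functions the return type is not encoded.
std::string itanium_mangle_simple(const std::string& name,
                                  const std::vector<std::string>& params) {
    // Single-letter codes for a few builtin types.
    static const std::map<std::string, std::string> codes = {
        {"int", "i"}, {"char", "c"}, {"double", "d"}, {"bool", "b"}};

    // Prefix, then the length-prefixed identifier.
    std::string out = "_Z" + std::to_string(name.size()) + name;

    // An empty parameter list is encoded as a single "v" (void).
    if (params.empty()) {
        return out + "v";
    }
    for (const auto& p : params) {
        auto it = codes.find(p);
        if (it == codes.end()) {
            throw std::invalid_argument("unsupported type: " + p);
        }
        out += it->second;
    }
    return out;
}
```

With this sketch, itanium_mangle_simple("foo", {"int"}) produces _Z3fooi regardless of what foo returns, which is why the two declarations from the question can end up with the same symbol.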
To test the MSVC scheme, we can use a simple program, as above.
#include <string>
#include <cstdlib>
std::string foo(int x) { return "hello"; }
//int foo(int x) { return x; }
int main() {
    // Assuming object file named "a.obj".
    // Pipe to file, because there are a lot of symbols when <string> is included.
    system("dumpbin /symbols a.obj > a.txt");
}
Compile and run with Visual Studio, and a.txt will list the symbols it contains. Depending on which of the functions is uncommented, the results will be:
std::string foo(int x) { return "hello"; }
// ?foo@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@H@Z
// class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl foo(int)
int foo(int x) { return x; }
// ?foo@@YAHH@Z
// int __cdecl foo(int)
The MSVC scheme contains the entire declaration, including things that weren't explicitly specified:
foo@ for ::foo, followed by @ to terminate the name. Everything after that terminating @ describes the symbol's type.
Y for "non-member function".
A for __cdecl.
Return type: H for int, or ?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@ (followed by @ to terminate) for std::basic_string<char, std::char_traits<char>, std::allocator<char>> (std::string for short).
Parameter list: H for int (followed by @ to terminate).
Z for throw(...); this one is omitted from demangled names unless it's something else, probably because MSVC just ignores it anyway.
This allows it to whine at you if declarations aren't identical across every compilation unit.
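The pieces of the MSVC name can be assembled by a toy encoder to show where the return type sits. This is a sketch under stated assumptions, not MSVC's actual algorithm: it covers only global __cdecl functions with at least one parameter, a few builtin types, and the default exception specifier. The name msvc_mangle_simple is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Look up the single-letter MSVC code for a few builtin types.
std::string msvc_type_code(const std::string& type) {
    static const std::map<std::string, std::string> codes = {
        {"int", "H"}, {"char", "D"}, {"float", "M"}, {"double", "N"}};
    auto it = codes.find(type);
    if (it == codes.end()) {
        throw std::invalid_argument("unsupported type: " + type);
    }
    return it->second;
}

// Hypothetical helper: mangle a global __cdecl function, MSVC style.
// Unlike the GCC scheme, the return type is a required input.
std::string msvc_mangle_simple(const std::string& name,
                               const std::string& return_type,
                               const std::vector<std::string>& params) {
    if (params.empty()) {
        // An empty parameter list is encoded differently; out of scope here.
        throw std::invalid_argument("empty parameter lists are not handled");
    }
    std::string out = "?" + name + "@@";  // name plus terminating @
    out += "Y";                           // non-member function
    out += "A";                           // __cdecl
    out += msvc_type_code(return_type);   // return type
    for (const auto& p : params) {
        out += msvc_type_code(p);         // parameter types
    }
    out += "@";                           // end of parameter list
    out += "Z";                           // exception specifier
    return out;
}
```

Here msvc_mangle_simple("foo", "int", {"int"}) gives ?foo@@YAHH@Z, matching the dumpbin output above, while changing only the return type to double gives a different name.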
Generally, most compilers will use one of those schemes (or sometimes a variation thereof) when targeting *nix or Windows, respectively, but this isn't guaranteed. For example...
Details of the schemes used by other compilers are courtesy of Agner Fog's PDF.
Examining the generated symbols, it becomes apparent that GCC's mangling scheme doesn't provide the same level of protection against Machiavelli as MSVC's. Consider the following:
// foo.cpp
#include <string>
// Simple wrapper class, to avoid encoding `cxx11 ABI` into the GCC name.
class MyString {
    std::string data;
public:
    MyString(const char* const d) : data(d) {}
    operator std::string() { return data; }
};
// Evil.
MyString foo(int i) { return "hello"; }
// -----
// main.cpp
#include <iostream>
// Evil.
int foo(int);
int main() {
    std::cout << foo(3) << '\n';
}
If we compile each source file separately, then attempt to link the object files together...
With GCC, MyString, due to not being part of the cxx11 ABI, causes MyString foo(int) to be mangled as _Z3fooi, just like int foo(int). This allows the object files to be linked, and an executable is produced. Attempting to run it causes a segfault.
With MSVC, the linker will look for ?foo@@YAHH@Z; as we instead supplied ?foo@@YA?AVMyString@@H@Z, linking will fail.
Considering this, a mangling scheme that includes the return type is safer, even though functions can't be overloaded solely on differences in return type.
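The difference in link-time behaviour comes down to name matching. The following toy model is a deliberate simplification, not how a real linker is implemented: it treats a reference as resolved when a definition with exactly the same mangled name exists, and compares nothing else. The function name resolves is hypothetical; the symbol names are the ones from the example above.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model of symbol resolution: a reference is satisfied if and only if
// some object file defines a symbol with the identical mangled name.
// Types are never compared; only the name strings are.
bool resolves(const std::set<std::string>& defined,
              const std::string& referenced) {
    return defined.count(referenced) > 0;
}
```

In this model, a reference to _Z3fooi is satisfied by a definition named _Z3fooi even though the two sides disagree about the return type, whereas a reference to ?foo@@YAHH@Z finds no match when only ?foo@@YA?AVMyString@@H@Z is defined.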
No, at least not with GCC and Clang, where the mangled names of these two functions would be the same; MSVC does encode the return type, so the names differ there. More importantly, using them in the same program results in undefined behavior. Functions in C++ cannot differ only in their return type.