
Why doesn't the g++ linker warn about this inconsistent function declaration?

This was tested on Debian squeeze with g++ 4.4 and g++ 4.7. Consider two C++ source files.

################
foo.cc
#################
#include <string>
using std::string;

int foo(void)
{
  return 0;
}

#################
bar.cc
#################
#include <string>
using std::string;

//int foo(void);
string foo(void);

int main(void)
{
  foo();
  return 0;
}
##################

If I compile and run this, predictably there are problems. I'm using scons.

################################
SConstruct
################################
#!/usr/bin/python


env = Environment(
    CXX="g++-4.7",
    CXXFLAGS="-Wall -Werror",
    #CXX="g++",
    #CXXFLAGS="-Wall -Werror",
    )

env.Program(target='debug', source=["foo.cc", "bar.cc"])
#################################

Compiling and running...

$ scons

g++-4.7 -o bar.o -c -Wall -Werror bar.cc
g++-4.7 -o foo.o -c -Wall -Werror foo.cc
g++-4.7 -o debug foo.o bar.o

$ ./debug 

*** glibc detected *** ./debug: free(): invalid pointer: 0xbff53b8c ***
======= Backtrace: =========
/lib/i686/cmov/libc.so.6(+0x6b381)[0xb7684381]
/lib/i686/cmov/libc.so.6(+0x6cbd8)[0xb7685bd8]
/lib/i686/cmov/libc.so.6(cfree+0x6d)[0xb7688cbd]
/usr/lib/libstdc++.so.6(_ZdlPv+0x1f)[0xb7856c5f]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xb762fca6]
./debug[0x8048461]
======= Memory map: ========
08048000-08049000 r-xp 00000000 fd:10 7602195    /home/faheem/corrmodel/linker/debug
08049000-0804a000 rw-p 00000000 fd:10 7602195    /home/faheem/corrmodel/linker/debug
09ae0000-09b01000 rw-p 00000000 00:00 0          [heap]
b7617000-b7619000 rw-p 00000000 00:00 0 
b7619000-b7759000 r-xp 00000000 fd:00 1180005    /lib/i686/cmov/libc-2.11.3.so
b7759000-b775a000 ---p 00140000 fd:00 1180005    /lib/i686/cmov/libc-2.11.3.so
b775a000-b775c000 r--p 00140000 fd:00 1180005    /lib/i686/cmov/libc-2.11.3.so
b775c000-b775d000 rw-p 00142000 fd:00 1180005    /lib/i686/cmov/libc-2.11.3.so
b775d000-b7760000 rw-p 00000000 00:00 0 
b7760000-b777c000 r-xp 00000000 fd:00 4653173    /lib/libgcc_s.so.1
b777c000-b777d000 rw-p 0001c000 fd:00 4653173    /lib/libgcc_s.so.1
b777d000-b777e000 rw-p 00000000 00:00 0 
b777e000-b77a2000 r-xp 00000000 fd:00 1179967    /lib/i686/cmov/libm-2.11.3.so
b77a2000-b77a3000 r--p 00023000 fd:00 1179967    /lib/i686/cmov/libm-2.11.3.so
b77a3000-b77a4000 rw-p 00024000 fd:00 1179967    /lib/i686/cmov/libm-2.11.3.so
b77a4000-b7889000 r-xp 00000000 fd:00 2484736    /usr/lib/libstdc++.so.6.0.17
b7889000-b788d000 r--p 000e4000 fd:00 2484736    /usr/lib/libstdc++.so.6.0.17
b788d000-b788e000 rw-p 000e8000 fd:00 2484736    /usr/lib/libstdc++.so.6.0.17
b788e000-b7895000 rw-p 00000000 00:00 0 
b78ba000-b78bc000 rw-p 00000000 00:00 0 
b78bc000-b78bd000 r-xp 00000000 00:00 0          [vdso]
b78bd000-b78d8000 r-xp 00000000 fd:00 639026     /lib/ld-2.11.3.so
b78d8000-b78d9000 r--p 0001b000 fd:00 639026     /lib/ld-2.11.3.so
b78d9000-b78da000 rw-p 0001c000 fd:00 639026     /lib/ld-2.11.3.so
bff41000-bff56000 rw-p 00000000 00:00 0          [stack]
Aborted

Eww. This could have been avoided if the linker had warned that foo was being declared in two different ways. Even with -Wall it doesn't. So, is there a reason why it doesn't, and is there some flag that I can turn on to make it warn? Thanks in advance.

EDIT: Thanks for all the answers. The linker does issue a warning when there are conflicting function definitions, as opposed to a conflicting function definition and declaration as in my example above. I don't understand the reason for this different behavior.

Faheem Mitha asked Feb 21 '23 16:02


2 Answers

The C++ linker only identifies functions as far as it needs to for unique identification.

The following is quoted from an in-depth article on the C++ linker:

...the names of the symbols are decorated with additional strings. This is called name mangling.

The decoration before the identifier name is needed because C++ supports namespaces. For example the same function name can occur multiple times in different namespaces while denoting a different entity each time. To enable the linker to differentiate between those entities the name of each identifier is prepended with tokens representing its enclosing namespaces.

The decoration after the identifier name is needed because C++ allows function overloading. Again the same function name can denote different identifiers, which differ only in their parameter list. To enable the linker to differentiate between those, tokens representing the parameter list are appended to the name of the identifier. The return type of a function is disregarded, because two overloaded functions must not differ only in their return type.

In short, name mangling disregards a function's return type, because overloaded functions cannot differ by return type alone. Both int foo(void) and string foo(void) therefore mangle to the same symbol, and the linker has no way to spot the mismatch.
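The decoration rules quoted above can be sketched with a few small functions. The symbol names in the comments are what GCC emits under the Itanium C++ ABI; the namespace ns and the function bodies are made up for illustration and are not part of the question's code.

```cpp
// Overloads differ by parameter list, so the parameter tokens are appended.
int foo()      { return 0; }  // symbol: _Z3foov  ("v" = empty parameter list)
int foo(int x) { return x; }  // symbol: _Z3fooi  ("i" = one int parameter)

namespace ns {
// The enclosing namespace is prepended, so this does not clash with ::foo().
int foo()      { return 2; }  // symbol: _ZN2ns3fooEv
}

// A declaration such as
//   std::string foo();
// would also refer to _Z3foov, because the return type is not encoded.
// In this translation unit the compiler rejects it as a conflicting
// declaration. In a separate translation unit nothing sees both
// declarations, and the linker simply matches the identical names.
```

Calling foo(), foo(1) and ns::foo() selects three different functions, each with its own symbol, while the return type plays no part in any of the three names.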

Tim Gee answered Mar 05 '23 17:03


The linker just acts on the names that the compiler says are defined in modules or are referenced (needed) by modules. GCC apparently uses the "Itanium C++ ABI" for mangling function names (starting with GCC 3). For most functions, the return type isn't incorporated into the mangled name, which is why the linker doesn't take it into account:

Itanium C++ ABI

Function types are composed from their parameter types and possibly the result type. Except at the outer level type of an <encoding>, or in the <encoding> of an otherwise delimited external name in a <template-parameter> or <local-name> function encoding, these types are delimited by an "F..E" pair. For purposes of substitution (see Compression below), delimited and undelimited function types are considered the same.

Whether the mangling of a function type includes the return type depends on the context and the nature of the function. The rules for deciding whether the return type is included are:

  1. Template functions (names or types) have return types encoded, with the exceptions listed below.
  2. Function types not appearing as part of a function name mangling, e.g. parameters, pointer types, etc., have return type encoded, with the exceptions listed below.
  3. Non-template function names do not have return types encoded.

The exceptions mentioned in (1) and (2) above, for which the return type is never included, are

  • Constructors.
  • Destructors.
  • Conversion operator functions, e.g. operator int
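The difference between template and non-template function names can be checked by demangling two symbols. This is a minimal sketch, assuming GCC or Clang, where abi::__cxa_demangle from <cxxabi.h> is available. The helper demangle and the function name bar are hypothetical, and the two symbols were written out by hand from the rules above rather than taken from the question's binaries.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle, provided by GCC and Clang
#include <cstdlib>   // std::free
#include <string>

// Decode an Itanium-ABI symbol into readable C++.
// Returns an empty string if the input is not a valid mangled name.
std::string demangle(const char* symbol)
{
  int status = 0;
  char* text = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
  std::string readable = (status == 0 && text != nullptr) ? text : "";
  std::free(text);  // the demangler allocates its result with malloc
  return readable;
}

// Hand-written symbols for a hypothetical function named bar:
//   int bar();                   -> _Z3barv       (non-template: no return type)
//   template <class T> T bar();  -> _Z3barIiET_v  (bar<int>: return type T_ encoded)
const char* const plain_symbol    = "_Z3barv";
const char* const template_symbol = "_Z3barIiET_v";
```

Here demangle(plain_symbol) gives bar(), with no return type to recover, while demangle(template_symbol) gives int bar<int>(), because the template instantiation carries its return type in the symbol.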

In general in C++ the return type of a function isn't considered when the compiler performs name lookup (for example for overload resolution). This might be part of the reason why the return type isn't usually included in the name mangling. I don't know if there's a stronger reason for not incorporating the return type into the mangled name.
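Since the linker cannot see the mismatch, one common safeguard is to let the compiler see it instead, by declaring foo once in a header that both source files include. The sketch below is a suggestion rather than something from the question: the header name foo.h and the helper call_foo are hypothetical, and the three files are shown in one listing for brevity.

```cpp
// ---- foo.h (hypothetical shared header): the only declaration of foo ----
#ifndef FOO_H
#define FOO_H
int foo();
#endif

// ---- foo.cc: would begin with #include "foo.h" ----
// If this definition were changed to return std::string, the compiler
// would now reject it as conflicting with the declaration in foo.h.
int foo()
{
  return 0;
}

// ---- bar.cc: would begin with #include "foo.h" ----
// No hand-written declaration of foo here, so there is nothing to get wrong.
int call_foo()
{
  return foo();
}
```

With this layout the header is the single source of truth, so a changed return type shows up as a compile error in foo.cc instead of a crash at run time.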

Michael Burr answered Mar 05 '23 18:03