C
Say I have the following C modules:
MODULE 1
#include <stdio.h>
void foo(void);          /* defined in module 2 */
int x;                   /* uninitialized global */
int main(void){
    foo();
    printf("%i\n",x);
    return 0;
}
MODULE 2
double x;                /* same name as in module 1, different type */
void foo(void){
    x = 3.14;
}
My question is: what does the linker do in this case? In the textbook I'm reading it says the compiler chooses only one of two weak global variables for the linker symbol table. Which of these two is chosen? Or are both chosen? If so, why? Thanks.
There is no problem if the variable is declared inside a function, since it is then a local variable. If it is a global definition, the result depends on the linker. You can get a link error, either by default or because you pass options that direct the linker to report one. Otherwise the two definitions are consolidated into one location, as if one of them had been declared extern.
A global variable is accessible to all functions in every source file where it is declared. To avoid problems with initialization: if a global variable is declared in more than one source file, it should be initialized in only one place, or you will get a multiple-definition error at link time.
The clean, reliable way to declare and define global variables is to use a header file to contain an extern declaration of the variable. The header is included by the one source file that defines the variable and by all the source files that reference the variable.
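As a rough illustration of that header pattern, here is a minimal sketch collapsed into a single listing. The file names (counter.h, counter.c, user.c) and the identifiers hit_count and record_hit are hypothetical and only mark where each piece would live in a real project.

```c
/* Sketch only: three hypothetical files shown as one listing. */

/* ---- counter.h: included by every file that uses the variable ---- */
extern int hit_count;        /* declaration only, no storage allocated */

/* ---- counter.c: the single file that defines the variable ---- */
int hit_count = 0;           /* the one definition, with its initializer */

/* ---- user.c: includes counter.h and simply uses the variable ---- */
void record_hit(void)
{
    hit_count++;             /* refers to the storage defined in counter.c */
}
```

Because only counter.c contains a definition, the linker sees exactly one hit_count no matter how many files include the header.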
In C, an uninitialized global such as int x; is a tentative definition: it can be repeated and serves as a declaration each time. But if the program only has extern int x;, which is a pure declaration, and x is actually used, the link will fail with an undefined reference since no memory is ever allocated for the variable.
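The difference between a declaration, a repeated uninitialized definition, and an initialized definition can be sketched within one translation unit. The names total and get_total are made up for illustration; this compiles as C (not as C++, which rejects the repeated definitions).

```c
extern int total;   /* declaration: storage is defined somewhere else      */
int total;          /* tentative definition: no initializer                */
int total;          /* repeating it in the same file is still legal in C   */
int total = 42;     /* actual definition: at most one initializer allowed  */

/* Hypothetical accessor so the value can be observed from a caller. */
int get_total(void)
{
    return total;
}
```

If the last three lines were removed and only the extern declaration remained, any use of total would compile but fail at link time.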
C says it is undefined behavior.
(C99, 6.9p5) "If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one"
Being undefined behavior means a linker can abort the linking process in presence of multiple external object definitions.
Now linkers are nice (or evil, you can choose) and usually have default extensions to handle multiple external object definitions and not fail in some cases.
If you are using gcc and ld from binutils, you'll get an error if your two objects are both explicitly initialized, for example if you have int x = 0; in the first translation unit and double x = 0.0; in the second.
Otherwise, if one of the external objects is not explicitly initialized (the situation in your example), gcc in -fcommon mode (the default before GCC 10) will silently combine the two objects into one symbol. You can still ask the linker to report a warning by passing it the option --warn-common.
For example when linking the modules:
gcc -Wl,--warn-common module1.o module2.o
To get the linking process aborted, you can request the linker to treat all warnings as errors using the --fatal-warnings option (-Wl,--fatal-warnings,--warn-common).
Another way to get the linking process aborted is to use the -fno-common compiler option, as explained by @teppic in his answer. -fno-common prevents external objects from getting a common symbol type at compilation. If you do this for both modules and then link, you'll also get the multiple definition linker error.
gcc -Wall -fno-common -c module1.c module2.c
gcc module1.o module2.o
If the implementation supports multiple external definitions, you'll end up with one object that's effectively cast to each type in each module, as in some kind of implicit union variable. The amount of memory for the larger type will be allocated, and both will behave as external declarations.
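The "implicit union" effect can be modelled in a single file with an explicit union. This is only a sketch of the behaviour, not what the linker literally emits: the type shared_x and the helper read_x are hypothetical, and the result assumes IEEE 754 doubles and a 32-bit int.

```c
/* Hypothetical model of the one storage location both modules share. */
union shared_x {
    int    as_int;      /* module 1's view: int x;    */
    double as_double;   /* module 2's view: double x; */
};

static union shared_x x;    /* sized for the larger member (the double) */

/* Module 2 stores a double into the shared location. */
void foo(void)
{
    x.as_double = 3.14;
}

/* Module 1 reads the same bytes back as an int. */
int read_x(void)
{
    return x.as_int;
}
```

On a little-endian machine, calling foo() and then read_x() gives 1374389535 (0x51EB851F), the low four bytes of the bit pattern of 3.14, which is why the program in the question prints a seemingly random number rather than 3 or 0.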
If you compile using clang or gcc, use the option -fno-common to cause an error for this.
Here's the section from the gcc manual:
In C code, controls the placement of uninitialized global
variables. Unix C compilers have traditionally permitted multiple
definitions of such variables in different compilation units by
placing the variables in a common block. This is the behavior
specified by -fcommon, and is the default for GCC on most targets.
On the other hand, this behavior is not required by ISO C, and on
some targets may carry a speed or code size penalty on variable
references. The -fno-common option specifies that the compiler
should place uninitialized global variables in the data section of
the object file, rather than generating them as common blocks.
This has the effect that if the same variable is declared (without
"extern") in two different compilations, you will get a multiple-
definition error when you link them.
This option effectively enforces strict ISO C compliance with respect to multiple definitions.
This behaviour is generally accepted for external variables of the same type. As the GCC manual states, most compilers support this, and (provided the types are the same) the C99 standard lists it as a common extension (Annex J.5.11, "Multiple external definitions").