a.h
void addr(void);
a.c
#include <stdio.h>
#include "a.h"
int x;
void addr(void) {
    /* %p expects a void *, so cast the address explicitly */
    printf("a:x=%p\n", (void *)&x);
}
b.c
#include <stdio.h>
#include "a.h"
char x;
int main(void) {
    addr(); /* a:x=0x601044 */
    printf("b:x=%p\n", (void *)&x); /* b:x=0x601044 */
    return 0;
}
Why do neither the compiler nor the linker complain about two external declarations with different types and the same identifier (x), and why are they silently linked together?
Environment:
$ gcc --version
gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
$ gcc -o test -Wall -std=c11 a.c b.c
The declarations int x; in a.c and char x; in b.c are only tentative definitions of the identifier x.
The C11 standard draft N1570 states:
6.9.2 External object definitions
...
2 A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition.
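The quoted rule can be observed inside a single translation unit. The following is a minimal sketch, not taken from the question: the names counter and read_counter are hypothetical. It declares the same object twice without an initializer, which is permitted because both declarations are tentative definitions.

```c
/* Two file-scope declarations without an initializer or storage-class
   specifier: both are tentative definitions of the same object.
   The name `counter` is hypothetical, chosen for illustration. */
int counter;
int counter;

/* This translation unit contains no external definition of counter, so it
   behaves as if it had been defined once with an initializer equal to 0. */
int read_counter(void) {
    return counter;
}
```

Calling read_counter() before anything is stored in counter returns 0. Giving both declarations an initializer would instead be rejected by the compiler as a redefinition.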
If you instead initialize x in both files (something like int x = 2; in a.c and char x = '1'; in b.c), they become "complete" definitions, and the linker will then report a multiple-definition error.
Something like:
Error LNK1169 one or more multiply defined symbols found
Error LNK2005 x already defined in a.obj
The C standard does not define the behavior of defining an identifier with external linkage twice. Some behavior is commonly defined as an extension to C, notably on Unix systems. However, this extension relies on the definitions having compatible types; the result of defining int x; and char x; is generally not defined.
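To see why mismatched types matter once both names resolve to the same address, here is a rough single-file analogy. It is only a sketch: the function and variable names are hypothetical, and it uses a char pointer into an int to stand in for the two files' views of x rather than reproducing the actual two-file program, whose behavior the C standard does not define.

```c
/* Hypothetical analogy: one storage location viewed as int and as char. */
int int_view_after_char_store(void) {
    int storage = 0;                     /* stands in for a.c's `int x` */
    char *char_view = (char *)&storage;  /* stands in for b.c's `char x` at the same address */

    *char_view = 1;                      /* writes a single byte of the int */

    /* The int now holds a nonzero value; which value depends on byte order. */
    return storage;
}
```

A store made through the char view changes what the int view reads, and a store through the int view can touch bytes that the char view never accounted for.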
Defining an identifier with external linkage twice violates a requirement in the C standard, C 2018 6.9 5:
If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.
In your program, x is used in the expression &x, so the above requirement applies: there must be exactly one external definition for x. This paragraph is a "shall" requirement that appears outside a Constraints section, and when such a requirement is violated, the resulting behavior is not defined by the C standard, per C 2018 4 2.
Why then do int x; and char x; behave differently from int x = 0; and char x = 0;? One might think they should be the same, because the former are tentative definitions (they have no storage-class specifier or initializer) and C 2018 6.9.2 2 says:
If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.
There are two reasons. The first is that the rule that violating this requirement results in behavior not defined by the C standard is an overriding rule; it takes priority over the rule about tentative definitions.
The second is that, although the C standard does not define the behavior, other documents may define it. As noted in C 2018 J.5.11 (which is an informative section rather than a normative part of the standard), a common extension to the C language is to permit multiple external definitions. Generally, the types of the definitions should agree, and only one should be initialized.
For example, the System V Application Binary Interface describes how multiple definitions may be reconciled in cases where there are mixed strong and weak definitions or mixed common and non-common definitions. The compiler cooperates with this extension to C by producing an object file that marks identifiers differently according to whether they have regular definitions or just tentative definitions. For example, compiling a file containing char x; with Apple LLVM 10.0.0 and clang-1000.11.45.5 for x86_64 produces a symbol x marked for the common section, but compiling a file containing int x = 0; produces a symbol x marked for a general section. (When the nm command is applied to the object file produced by the compiler, it shows C and S for these sections, respectively.)
The result is:
- Defining x twice is not defined by the C standard.
- Common extensions to C permit multiple tentative definitions of x along with at most one regular definition.
- Defining x with int in one place and char in another place is improper but is not diagnosed by the linker.
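One way to have such a mismatch caught at compile time is to declare x in a shared header and define it in exactly one source file. The following sketch shows that arrangement condensed into a single translation unit. The header name x.h mentioned in the comments and the helper address_of_x are hypothetical and do not appear in the question.

```c
/* What a shared header (hypothetically "x.h") would contain:
   a declaration only, which reserves no storage. */
extern int x;

/* The single external definition, placed in exactly one .c file. */
int x = 0;

/* A file that included the header and then wrote `char x;` would fail to
   compile, because the two declarations of x have incompatible types
   (GCC typically reports this as conflicting types for 'x').
   Uncommenting the next line makes this translation unit fail to compile. */
/* char x; */

/* Hypothetical helper so other code can observe the one shared object. */
int *address_of_x(void) {
    return &x;
}
```

With this layout the compiler sees both declarations in the same translation unit and can compare their types, which the linker does not do. Separately, GCC's -fno-common option (the default since GCC 10) places tentative definitions in a regular data section instead of the common section, so the program in the question would then fail to link with a multiple-definition error.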