I realize that, at first sight, my question might look like an obvious duplicate of one of the many questions here about the extern keyword, but I was unable to find any answer that discusses the difference between extern "C" and extern "C" { }. On the contrary, I have found several people stating that the two constructs are equivalent, which I believe is a reasonable expectation. Unfortunately, empirical evidence shows that they are not equivalent.
Here is an example:
extern "C" { const int my_var1 = 21; }
extern "C" const int my_var2 = 42;
const int my_var3 = 121;
int main() { }
After compiling it with gcc 7 (g++ externC.cpp), I see a remarkable difference:
$ readelf -s ./a.out | grep my_var
34: 0000000000000694 4 OBJECT LOCAL DEFAULT 15 _ZL7my_var1
35: 000000000000069c 4 OBJECT LOCAL DEFAULT 15 _ZL7my_var3
59: 0000000000000698 4 OBJECT GLOBAL DEFAULT 15 my_var2
my_var1 and my_var3 both have local binding and a C++ mangled name, while my_var2 has global binding and actual C linkage. So it looks like extern "C" { } was completely ignored, while the similar extern "C" without {} did have an effect. That is super weird to me.
Things get even more interesting if I remove the const and just try to read the variables:
#include <cstdio>
extern "C" { int my_var1; }
extern "C" int my_var2;
int my_var3;
int main() {
printf("%d, %d, %d\n", my_var1, my_var2, my_var3);
}
When I try to compile this second program, the linker complains about an undefined reference to my_var2:
/tmp/ccfs9cis.o: In function `main':
externC.cpp:(.text+0xc): undefined reference to `my_var2'
collect2: error: ld returned 1 exit status
That means that in this case two things happened:
- extern "C" { int my_var1; } defined (instantiated) a variable called my_var1 with C linkage in the translation unit.
- extern "C" int my_var2; declared an extern variable, where by extern I mean the traditional sense (like extern int x;), but with "C" linkage.
This, from my point of view, is inconsistent with the behavior in the first case above, with const. In other words:
In the first program, with const:
- extern "C" behaved like I expected extern "C" {} to behave [change the linkage]
- extern "C" {} instead did nothing
In the second program, without const:
- extern "C" {} behaved like I originally expected [change the linkage], BUT
- extern "C" behaved like extern "C" { extern int my_var2; }, which is the way to declare an extern variable with C linkage (unfortunately, C++ reuses the keyword extern for both purposes).
In conclusion, my question is: can anyone (maybe a compiler expert?) explain the theory behind why extern "C" and extern "C" {} behave so differently and in such an inconsistent (at least to me) way? In my experience with C++, once you understand a given concept in deep detail, even its tricky and complex corner cases start to look pretty reasonable and consistent. You just need to see the whole picture very clearly. I believe this is such a case.
Thanks a lot to everybody, in advance.
Edit[1]
[In the end it turned out that a similar question did exist here; I was just unable to find it. Sorry for that.]
Thanks to the answers so far, I now understand the subtle difference between extern "C" {} and extern "C", even if I'd still be curious to understand how we (the C++ developers/ISO committee) ended up with such a solution. It's kind of like making if (x) foo(); behave slightly differently from if (x) { foo(); }. Anyway, given this new knowledge, I have a few (hopefully) interesting observations to make:
Given that the transformation extern "C" X => extern "C" { extern X } is always correct, it follows that:
- The only way to define (instantiate) a const variable with C linkage in the current translation unit is to make it extern, even if we don't want that. The compiler decides whether we are defining or just declaring an extern variable depending on whether we initialized it: with an initializer we are defining, otherwise we are just declaring.
- The same logic (extern + const) applies to regular const variables with C++ linkage as well. A const variable with C linkage is no different except for the lack of name mangling.
From the statements above it follows that, since const implies internal linkage in C++ (but not in C!), extern applied to a const does not mean extern in the traditional sense, but just "less internal" or "more extern" than static.
In other words:
- const int var = 23; creates a global variable with internal linkage, like static int var = 23; would, except for being placed in a read-only segment.
- extern const int var = 23; creates a global variable with regular (external) linkage. The extern neutralizes the implicit static. The result is the same as int var = 23; except that with const it will be placed in a read-only segment.
- extern const int var; declares a proper extern variable in a foreign read-only segment.
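As a minimal sketch of these three cases (the variable names and the helper sum_of_vars are made up for illustration, and var_elsewhere is defined in the same file only so that the snippet links on its own):

```cpp
// Internal linkage: at namespace scope, const implies static in C++.
const int var_internal = 23;

// External linkage, and still a definition because of the initializer.
extern const int var_external = 23;

// Declaration only: there is no initializer, so a definition must exist
// somewhere in the program.
extern const int var_elsewhere;

// Normally this definition would live in another translation unit. It
// keeps the external linkage given by the earlier declaration.
const int var_elsewhere = 5;

// Hypothetical helper that reads all three variables.
int sum_of_vars() { return var_internal + var_external + var_elsewhere; }
```

Running readelf -s on the resulting object file, as above, should be a reasonable way to compare the binding of each symbol, although an optimizing build may drop the unused internal constant entirely.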
See here:
[extern "C" { ... }] Applies the language specification string-literal to all function types, function names with external linkage and variables with external linkage declared in declaration-seq.
Since const int my_var1 = 21; has internal linkage, wrapping extern "C" { } around it has no effect.
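If you want to keep the braces and still get C language linkage for the constant, one option is to add an explicit extern inside the block. The following is only a sketch: my_var1_exported and read_both are made-up names used for illustration.

```cpp
extern "C" {
    // const alone gives the name internal linkage, so the linkage
    // specification does not apply to it.
    const int my_var1 = 21;

    // An explicit extern gives the name external linkage, so extern "C"
    // now applies. The initializer still makes this a definition.
    extern const int my_var1_exported = 21;
}

// Hypothetical helper that reads both constants.
int read_both() { return my_var1 + my_var1_exported; }
```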
Also:
[extern "C" ...] Applies the language specification string-literal to a single declaration or definition.
and
A declaration directly contained in a language linkage specification is treated as if it contains the extern specifier for the purpose of determining the linkage of the declared name and whether it is a definition.
extern "C" int x; // a declaration and not a definition
// The above line is equivalent to extern "C" { extern int x; }
extern "C" { int x; } // a declaration and definition
This explains why, for extern "C" const int my_var2 = 42;, the variable has external linkage and an unmangled name. It also explains why you're seeing an undefined reference to my_var2 in your second code example.
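To make the second example link, my_var2 needs a definition somewhere. A minimal sketch, assuming you want to define it in the same file (the value 42 and the helper sum_vars are arbitrary choices for illustration):

```cpp
extern "C" { int my_var1; }       // definition, zero-initialized
extern "C" int my_var2;           // declaration only (implicit extern)
extern "C" { int my_var2 = 42; }  // the matching definition
int my_var3;                      // definition with C++ linkage

// Hypothetical helper that reads all three variables.
int sum_vars() { return my_var1 + my_var2 + my_var3; }
```

Dropping the braceless declaration and keeping only the braces form would work just as well. The declaration is kept here to show that the two can coexist.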