
Different linkage for extern "C" vs. extern "C" { } in C++ [duplicate]

Tags:

c++

I realize that, at first sight, my question might seem an obvious duplicate of one of the many questions here related to the extern keyword, but I was unable to find any answer discussing the difference between extern "C" and extern "C" { }. On the contrary, I've found several people stating that the two constructs are equivalent, which I believe is a reasonable expectation. Unfortunately, empirical evidence shows that they are not equivalent.

Here is an example:

extern "C" { const int my_var1 = 21; }
extern "C" const int my_var2 = 42;
const int my_var3 = 121;

int main() { }

After compiling it with gcc 7, with g++ externC.cpp, I see a remarkable difference:

$ readelf -s ./a.out | grep my_var
    34: 0000000000000694     4 OBJECT  LOCAL  DEFAULT   15 _ZL7my_var1
    35: 000000000000069c     4 OBJECT  LOCAL  DEFAULT   15 _ZL7my_var3
    59: 0000000000000698     4 OBJECT  GLOBAL DEFAULT   15 my_var2

my_var1 and my_var3 both have local binding and a C++ mangled name, while my_var2 has global binding and actual C linkage. So, it looks like the extern "C" { } has been completely ignored, while the similar extern "C" without {} did have effect. That is super weird to me.

Things get even more interesting if I remove the const and just try to read the variables:

#include <cstdio>

extern "C" { int my_var1; }
extern "C" int my_var2;
int my_var3;

int main() {
    printf("%d, %d, %d\n", my_var1, my_var2, my_var3);
}

When I try to compile this 2nd program, the linker complains about an undefined reference to my_var2:

/tmp/ccfs9cis.o: In function `main':
externC.cpp:(.text+0xc): undefined reference to `my_var2'
collect2: error: ld returned 1 exit status

And that means that in this case two things happened:

  1. extern "C" { int my_var1; } instantiated in the translation unit a variable called my_var1 with C linkage.

  2. extern "C" int my_var2; declared an extern variable, where with extern I mean in the traditional sense (like extern int x;), but with "C" linkage.

From my point of view, this is inconsistent with the behavior in the 1st case above, which used const. In other words:

  • In the 1st program with const

    • extern "C" behaved like I expected extern "C" {} to behave [change the linkage]

    • extern "C" {} instead, did nothing

  • In the 2nd program, without const:

    • extern "C" {} behaved like I originally expected [change the linkage] BUT

    • extern "C" behaved like: extern "C" { extern int my_var2; } which is the way to declare an extern variable with C linkage (and unfortunately in C++ the keyword extern has been reused).

In conclusion, my question is: can anyone (maybe a compiler expert?) explain the theory behind why extern "C" and extern "C" {} behave so differently and in such an inconsistent (at least to me) way? In my experience with C++, once you understand a given concept in depth, even its tricky and complex corner cases start to look pretty reasonable and consistent. You just need to see the whole picture very clearly. I believe this is one of those cases.

Thanks a lot to everybody, in advance.


Edit[1]

[In the end it turned out that a similar question did exist here; I was just unable to find it. Sorry for that.]

Thanks to the answers so far, I now understand the subtle difference between extern "C" {} and extern "C", even if I'd still be curious to understand how we (the C++ developers/ISO committee) ended up with such a solution. It's kind of like making if (x) foo(); behave slightly differently than if (x) { foo(); }. Anyway, given this new knowledge, I have a few (hopefully) interesting observations to make:

Given that the transformation: extern "C" X => extern "C" { extern X } is always correct

It follows that:

  • The only way to define (instantiate) a const variable with C linkage in the current translation unit is to make it extern, even if we don't want that. The compiler decides whether we're defining the variable or just declaring it as extern depending on whether we initialized it with a value: if we did, we're defining it; otherwise we're just declaring it.

  • The same logic (extern + const) applies to regular const variables with C++ linkage as well. A const variable with C linkage is no different except for the lack of name mangling.

  • From the statements above it follows that, since const implies internal linkage in C++ (but not in C!), the extern when used for a const does not mean extern, but just less internal or more extern than static.

In other words:

  • const int var = 23; creates a global variable with internal linkage, like static int var = 23; would, except that it is placed in a read-only segment.
  • extern const int var = 23; creates a global variable with regular (external) linkage. The extern neutralizes the implicit static. The result is the same as int var = 23 except that with const it will be placed in a read-only segment.
  • extern const int var; declares a proper extern variable in a foreign read-only segment.
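
The three cases above can be sketched in a single translation unit. This is only an illustration: the names internal_const, external_const, declared_only, sum_of_constants and check_constants are hypothetical and do not come from the question, and whether the constants end up in a read-only segment depends on the toolchain.

```cpp
#include <cassert>

// Case 1: const at namespace scope has internal linkage in C++,
// much like "static const int internal_const = 23;".
const int internal_const = 23;

// Case 2: the explicit extern gives the constant external linkage,
// and the initializer makes this line a definition.
extern const int external_const = 23;

// Case 3: extern without an initializer is only a declaration.
// A definition must exist somewhere in the program; in this sketch it
// is provided right below instead of in another translation unit.
extern const int declared_only;
extern const int declared_only = 7;

// Hypothetical helper that uses all three constants.
int sum_of_constants() {
    return internal_const + external_const + declared_only;
}

// Hypothetical self-check for the values used in this sketch.
void check_constants() {
    assert(sum_of_constants() == 23 + 23 + 7);
}
```

Compiling this with g++ -c and inspecting the object file with readelf -s, as done earlier in the question, should show the difference in symbol binding between internal_const and the two extern constants.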
asked Dec 30 '18 18:12 by vvaltchev



1 Answer

See here:

[extern "C" { ... }] Applies the language specification string-literal to all function types, function names with external linkage and variables with external linkage declared in declaration-seq.

Since const int my_var1 = 21; has internal linkage, wrapping extern "C" { } around it has no effect.

Also:

[extern "C" ...] Applies the language specification string-literal to a single declaration or definition.

and

A declaration directly contained in a language linkage specification is treated as if it contains the extern specifier for the purpose of determining the linkage of the declared name and whether it is a definition.

extern "C" int x; // a declaration and not a definition
// The above line is equivalent to extern "C" { extern int x; }

extern "C" { int x; } // a declaration and definition

This explains why for extern "C" const int my_var2 = 42; the variable has external linkage and an unmangled name. It also explains why you're seeing an undefined reference to my_var2 in your second code example.
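
As a minimal sketch of how the rules quoted above can be applied, the following translation unit defines variables with C linkage using both forms. The name my_var4 and the functions sum_vars and check_vars are hypothetical additions for illustration; my_var1 and my_var2 reuse the names from the question.

```cpp
#include <cassert>

// Braced form: the declarations inside are not implicitly extern, so a
// const variable needs an explicit extern to get external linkage and,
// with it, C language linkage.
extern "C" {
    extern const int my_var1 = 21;
}

// Brace-less form: treated as if it contained the extern specifier.
// The initializer makes it a definition.
extern "C" const int my_var2 = 42;

// Non-const variable: the brace-less form without an initializer is only
// a declaration, so a definition has to be supplied separately.
extern "C" int my_var4;      // declaration only
extern "C" { int my_var4; }  // definition, zero-initialized

// Hypothetical helper that reads all three variables.
int sum_vars() {
    return my_var1 + my_var2 + my_var4;
}

// Hypothetical self-check for the values used in this sketch.
void check_vars() {
    assert(sum_vars() == 21 + 42 + 0);
}
```

With these definitions, all three symbols would be expected to show up with global binding and unmangled names in the readelf output, and the undefined reference from the second example goes away because a definition now exists.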

answered Oct 26 '22 19:10 by Kevin