
What does the linker actually do when linking a pointer variable to an array in C++?

Tags:

c++

I have searched a lot about the extern array/pointer problem but still feel confused.

In the following code:

// example 1
//1.cpp
int a[]={1,2,3};  //the array a

//main.cpp
extern int*a;  //the pointer a

In main.cpp, when I use printf to print a, I get 1, which is the first four bytes of the array a defined in 1.cpp. Printing &a gives 0x1234 (for example), which is the address of the first element of that array.

It acts as if the pointer a had been forcibly bound to the array a at address 0x1234. The value of the pointer a is therefore whatever is stored at 0x1234, which is 1, since sizeof(int*) == sizeof(int) on a 32-bit target.

I have learned that the linker uses the unresolved symbol table and the exported symbol table to link declarations to definitions.

While compiling 1.cpp, the symbol a is added to the exported symbol table, and while compiling main.cpp, the symbol a is added to the unresolved symbol table. I would expect the two to be named differently, since their types are not the same.

In fact, the linker can check the types of variables, because:

//example 2
//1.cpp
int a[]={1,2,3};

//2.cpp
extern char *a;

produces a link error saying that char *a is unresolved. Here the two symbols are not forcibly merged, so the linker can catch the mismatch.

In a single translation unit:

//example 3
int a[] = {1,2,3};
int *ptr = a; 

the compiler implicitly converts a to a temporary int *, but it cannot do that across translation units.

So why is declaring an extern pointer against an array definition not caught by the linker? What does the linker actually do?

Thank you so much!

sakugawa asked Dec 06 '25 08:12



1 Answer

The C/C++ part can be dealt with immediately: with many implementations, C doesn’t mangle any symbols, since they’re all supposed to be unique, and lacking overloading for variables C++ doesn’t mangle them either. (Variable templates are mangled, as are static data members.) This isn’t a requirement of the language: formally, you have to use extern "C" for variables, but the standard allows collisions with unannotated global variables, and this happens frequently in practice (and is now a point of backward compatibility). The rest is identical for C and C++.

What typical linkers manage are the addresses of every variable, with no type information except that implicit in the mangled names. The address of an array is that of its first element, so your “pointer” ends up being an alias for that element (with the wrong type). (Since this is, as pointed out in the comments, ill-formed, other hilarity can ensue like stores through the pointer not being visible through the array (as accessed via some other pointer).) A different linker implementation could be more helpful, but again backward compatibility forbids it.

Davis Herring answered Dec 07 '25 22:12
