 

How are references implemented internally?

I'm just wondering how references are actually implemented across different compilers and debug/release configurations. Does the standard provide recommendations on their implementation? Do implementations differ?

I tried to run a simple program where I return non-const references and pointers to local variables from functions, but they worked out the same way. Does this mean that references are internally just a pointer?

Keynslug asked Oct 17 '10 19:10




2 Answers

Just to repeat some of the stuff everyone's been saying, let's look at some compiler output:

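    #include <stdio.h>
    #include <stdlib.h>

    // Note: both functions are declared int but never return a value.
    // That is undefined behavior in C++; the source is kept as posted
    // because the compiler output below was generated from it.
    int byref(int & foo)
    {
      printf("%d\n", foo);
    }

    int byptr(int * foo)
    {
      printf("%d\n", *foo);
    }

    int main(int argc, char **argv)
    {
      int aFoo = 5;

      byref(aFoo);
      byptr(&aFoo);
    }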

We can compile this with LLVM (with optimizations turned off) and we get the following:

    define i32 @_Z5byrefRi(i32* %foo) {
    entry:
      %foo_addr = alloca i32*                         ; <i32**> [#uses=2]
      %retval = alloca i32                            ; <i32*> [#uses=1]
      %"alloca point" = bitcast i32 0 to i32          ; <i32> [#uses=0]
      store i32* %foo, i32** %foo_addr
      %0 = load i32** %foo_addr, align 8              ; <i32*> [#uses=1]
      %1 = load i32* %0, align 4                      ; <i32> [#uses=1]
      %2 = call i32 (i8*, ...)* @printf(i8* noalias getelementptr inbounds ([4 x i8]* @.str, i64 0, i64 0), i32 %1) ; <i32> [#uses=0]
      br label %return

    return:                                           ; preds = %entry
      %retval1 = load i32* %retval                    ; <i32> [#uses=1]
      ret i32 %retval1
    }

    define i32 @_Z5byptrPi(i32* %foo) {
    entry:
      %foo_addr = alloca i32*                         ; <i32**> [#uses=2]
      %retval = alloca i32                            ; <i32*> [#uses=1]
      %"alloca point" = bitcast i32 0 to i32          ; <i32> [#uses=0]
      store i32* %foo, i32** %foo_addr
      %0 = load i32** %foo_addr, align 8              ; <i32*> [#uses=1]
      %1 = load i32* %0, align 4                      ; <i32> [#uses=1]
      %2 = call i32 (i8*, ...)* @printf(i8* noalias getelementptr inbounds ([4 x i8]* @.str, i64 0, i64 0), i32 %1) ; <i32> [#uses=0]
      br label %return

    return:                                           ; preds = %entry
      %retval1 = load i32* %retval                    ; <i32> [#uses=1]
      ret i32 %retval1
    }

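The bodies of both functions are identical: in each one the parameter arrives as an i32*, is loaded from its stack slot, and is then dereferenced to get the int passed to printf.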
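The same observation can be checked from inside C++ without reading compiler output. The following is only a minimal sketch: the helper names `address_via_reference`, `address_via_pointer`, and `demo_same_address` are made up for illustration and do not appear in the answer. It shows that a callee taking a reference and a callee taking a pointer both end up with the address of the caller's variable.

```cpp
#include <cassert>

// Hypothetical helpers (not from the original answer).
// Taking the address of a reference parameter yields the address
// of the object the caller passed in.
const int* address_via_reference(const int& value) { return &value; }

// The pointer version simply hands back the address it was given.
const int* address_via_pointer(const int* value) { return value; }

// Hypothetical demo: both calls report the address of aFoo.
void demo_same_address() {
    int aFoo = 5;
    assert(address_via_reference(aFoo) == &aFoo);
    assert(address_via_pointer(&aFoo) == &aFoo);
}
```

This does not prove how any particular compiler passes the argument; it only shows that the observable behavior matches what the identical function bodies above suggest.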

SingleNegationElimination answered Sep 23 '22 01:09


Sorry for using assembly to explain this, but I think it is the best way to understand how compilers implement references.

    #include <iostream>

    using namespace std;

    int main()
    {
        int i = 10;
        int *ptrToI = &i;
        int &refToI = i;

        cout << "i = " << i << "\n";
        cout << "&i = " << &i << "\n";

        cout << "ptrToI = " << ptrToI << "\n";
        cout << "*ptrToI = " << *ptrToI << "\n";
        cout << "&ptrToI = " << &ptrToI << "\n";

        cout << "refToNum = " << refToI << "\n";
        //cout << "*refToNum = " << *refToI << "\n";
        cout << "&refToNum = " << &refToI << "\n";

        return 0;
    }

The output of this code looks like this:

    i = 10
    &i = 0xbf9e52f8
    ptrToI = 0xbf9e52f8
    *ptrToI = 10
    &ptrToI = 0xbf9e52f4
    refToNum = 10
    &refToNum = 0xbf9e52f8

Let's look at the disassembly. (I used GDB for this. The numbers 8, 9 and 10 below are the source line numbers that GDB reports.)

    8           int i = 10;
    0x08048698 <main()+18>: movl   $0xa,-0x10(%ebp)

Here $0xa is the value 10 (decimal) that we are assigning to i. -0x10(%ebp) means the address held in the ebp register minus 16 (decimal), and that is where i lives on the stack.

    9           int *ptrToI = &i;
    0x0804869f <main()+25>: lea    -0x10(%ebp),%eax
    0x080486a2 <main()+28>: mov    %eax,-0x14(%ebp)

This assigns the address of i to ptrToI. ptrToI is also on the stack, at -0x14(%ebp), that is, ebp minus 20 (decimal).

    10          int &refToI = i;
    0x080486a5 <main()+31>: lea    -0x10(%ebp),%eax
    0x080486a8 <main()+34>: mov    %eax,-0xc(%ebp)

Now here is the catch! Compare the disassembly of lines 9 and 10 and you will observe that -0x14(%ebp) is replaced by -0xc(%ebp) in line 10. -0xc(%ebp) is the address of refToI. It is allocated on the stack, but you will never be able to get this address from your code, because &refToI yields the address of i rather than the address of the reference's own storage.

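So, in this unoptimized build, a reference does occupy memory. In this case it is stack memory, since we declared it as a local variable. How much memory does it occupy? As much as a pointer occupies. Note that the C++ standard leaves it unspecified whether a reference requires storage, so an optimizing compiler is free to eliminate the slot entirely.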
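The size claim can be probed from C++ itself, with one caveat: sizeof applied to a reference type reports the size of the referred-to type, so the reference has to be wrapped in a struct to see what it costs. The sketch below uses two hypothetical structs, `HoldsReference` and `HoldsPointer`, which are not part of the answer. On mainstream ABIs the two sizes come out equal, but the standard does not promise this, so treat it as an observation about typical implementations.

```cpp
#include <cstddef>

// Hypothetical wrapper types (not from the original answer).
struct HoldsReference { int& ref; };  // a reference stored as a data member
struct HoldsPointer   { int* ptr; };  // a pointer stored as a data member

// Guaranteed by the language: sizeof on a reference type gives the size
// of the referenced type, so it cannot reveal the reference's own storage.
static_assert(sizeof(int&) == sizeof(int),
              "sizeof(T&) is defined as sizeof(T)");

// Implementation-dependent values; compare them on your own platform.
constexpr std::size_t reference_member_size = sizeof(HoldsReference);
constexpr std::size_t pointer_member_size   = sizeof(HoldsPointer);
```

A reference held as a class member cannot be optimized away as easily as a local one, which is why the struct is a convenient place to observe its footprint.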

Now let's see how we access the reference and the pointer. For simplicity I have shown only part of the assembly:

    16          cout << "*ptrToI = " << *ptrToI << "\n";
    0x08048746 <main()+192>:        mov    -0x14(%ebp),%eax
    0x08048749 <main()+195>:        mov    (%eax),%ebx
    19          cout << "refToNum = " << refToI << "\n";
    0x080487b0 <main()+298>:        mov    -0xc(%ebp),%eax
    0x080487b3 <main()+301>:        mov    (%eax),%ebx

Now compare the two snippets above and you will see a striking similarity. -0xc(%ebp) is the actual address of refToI, which is never accessible to you. In simple terms, if you think of a reference as an ordinary pointer, then accessing a reference is like fetching the value at the address that pointer holds. This means the two lines of code below will give you the same result:

    cout << "Value of i = " << *ptrToI << "\n";
    cout << "Value of i = " << refToI << "\n";

Now compare this:

    15          cout << "ptrToI = " << ptrToI << "\n";
    0x08048713 <main()+141>:        mov    -0x14(%ebp),%ebx
    21          cout << "&refToNum = " << &refToI << "\n";
    0x080487fb <main()+373>:        mov    -0xc(%ebp),%eax

You can probably spot what is happening here. If you ask for &refToI, the contents of the location -0xc(%ebp) are returned. -0xc(%ebp) is where refToI resides, and its contents are nothing but the address of i.
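The source-level consequence of this can be sketched in a few lines. The function name `demo_address_of_reference` is hypothetical; the variable names follow the example program. Taking the address of the reference gives the address of i, which is also the value stored in the pointer, and a write through the reference is visible through both i and the pointer.

```cpp
#include <cassert>

// Hypothetical demo function; variable names mirror the example program.
void demo_address_of_reference() {
    int i = 10;
    int* ptrToI = &i;
    int& refToI = i;

    // &refToI is the address of i, not of any hidden slot for the reference.
    assert(&refToI == &i);
    assert(&refToI == ptrToI);

    // Writing through the reference modifies i itself.
    refToI = 20;
    assert(i == 20);
    assert(*ptrToI == 20);
}
```

These equalities are guaranteed by the language, regardless of whether the compiler keeps a stack slot for the reference as in the disassembly.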

One last thing: why is this line commented out?

//cout << "*refToNum = " << *refToI << "\n"; 

Because *refToI is not permitted. refToI is used exactly like the int it refers to, so *refToI tries to dereference an int, and that gives you a compile-time error.
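One way to confirm this without breaking the build is to ask the compiler whether the expression would be well-formed. The trait below is a sketch built on the common SFINAE detection pattern; `make_void` and `can_dereference` are hypothetical names introduced here, not something from the answer or the standard library.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper: maps any list of types to void, so that an
// ill-formed expression in the list discards the specialization below.
template <typename... Ts>
struct make_void { using type = void; };

// Primary template: by default, unary * is assumed not to apply to T.
template <typename T, typename = void>
struct can_dereference : std::false_type {};

// Selected only when the expression *std::declval<T>() is well-formed.
template <typename T>
struct can_dereference<
    T, typename make_void<decltype(*std::declval<T>())>::type>
    : std::true_type {};

// An int& is used like an int, so unary * does not apply to it.
static_assert(!can_dereference<int&>::value,
              "cannot dereference a reference to int");
// A pointer, or a reference to a pointer, can be dereferenced.
static_assert(can_dereference<int*>::value,
              "can dereference a pointer");
static_assert(can_dereference<int*&>::value,
              "can dereference a reference to a pointer");
```

The third assertion shows that the restriction comes from the referred-to type, not from references as such: a reference to a pointer dereferences fine.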

Prasad Rane answered Sep 19 '22 01:09