When we create a member function for a class in C++, it has an implicit extra argument that is a pointer to the calling object, referred to as `this`.
Is this true for every member function, even one that does not use the `this` pointer? For example, given the class
class foo
{
private:
int bar;
public:
int get_one()
{
return 1; // Not using `this`
}
int get_bar()
{
return this->bar; // Using `this`
}
};
Would both functions (`get_one` and `get_bar`) take `this` as an implicit parameter, even though only one of them actually uses it?
It seems like a bit of a waste to do so.
Note: I understand the correct thing to do would be to make `get_one()` static, and that the answer may depend on the implementation, but I'm just curious.
The `this` pointer is an implicit parameter to all non-static member functions. Therefore, inside a member function, `this` may be used to refer to the invoking object. Friend functions do not have a `this` pointer, because friends are not members of a class. Only non-static member functions have a `this` pointer.
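As a small illustration of the member/friend distinction described above, here is a minimal sketch. The class `counter` and the functions `peek` and `member_and_friend_agree` are hypothetical names invented for this example; they do not come from the question.

```cpp
#include <cassert>

class counter
{
private:
    int value;
public:
    explicit counter(int v) : value(v) {}

    // Member function: `this` is available implicitly and points at the invoking object.
    int get() const { return this->value; }

    // Friend function: not a member, so it has no `this`;
    // the object has to be passed explicitly.
    friend int peek(const counter &c);
};

int peek(const counter &c)
{
    return c.value; // allowed because peek is a friend; writing `this` here would not compile
}

// Hypothetical helper that exercises both forms on the same object.
int member_and_friend_agree()
{
    counter c(42);
    assert(c.get() == peek(c));
    return c.get();
}
```

The member is called as `c.get()`, with the object supplied implicitly, while the friend needs the object as an ordinary argument: `peek(c)`.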
Private member functions in C++: member functions can be placed in the private section of a class. A private member function can only be called by another member function (or a friend) of its class. Code outside the class cannot invoke a private function on an object using the dot operator.
Private: class members declared as private can be accessed only by the functions inside the class. They cannot be accessed directly by any object or function outside the class. Only the member functions or the friend functions are allowed to access the private data members of a class.
Member functions are operators and functions that are declared as members of a class. Member functions do not include operators and functions declared with the friend specifier. These are called friends of a class. You can declare a member function as static; this is called a static member function.
Would both of the functions (get_one and get_bar) take this as an implicit parameter even though only get_bar uses it?
Yes (unless the compiler optimizes it away, which still doesn't mean you can call the function without a valid object).
It seems like a bit of a waste to do so
Then why is it a member if it doesn't use any member data? Sometimes, the correct approach is making it a free function in the same namespace.
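To make the difference between the alternatives concrete, the following sketch compares the pointer types of a non-static member, a static member and a free function. The namespace `demo` and the three function names are assumptions made up for this example.

```cpp
#include <type_traits>

namespace demo
{
class foo
{
public:
    int get_one_member() { return 1; }          // non-static: has an implicit `this`
    static int get_one_static() { return 1; }   // static: no `this`
};

// Free function in the same namespace as the class.
inline int get_one_free() { return 1; }

// A pointer to a non-static member function has a pointer-to-member type,
// which can only be called together with an object ...
static_assert(std::is_same<decltype(&foo::get_one_member), int (foo::*)()>::value,
              "non-static member functions are tied to an object");
// ... while static members and free functions are ordinary function pointers.
static_assert(std::is_same<decltype(&foo::get_one_static), int (*)()>::value,
              "static member functions take no object");
static_assert(std::is_same<decltype(&get_one_free), int (*)()>::value,
              "free functions take no object");
}
```

Both `demo::foo::get_one_static()` and `demo::get_one_free()` can be called without any object, whereas `get_one_member` always needs one, even though it never looks at it.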
...class in c++, as I understand it, it has an implicit extra argument that is a pointer to the calling object
It's important to note that C++ started as C with objects.
To that end, the `this` pointer isn't something that simply exists inside a member function. Rather, the compiled member function needs a way to know which object `this` refers to, hence the notion of an implicit `this` pointer to the calling object being passed in.
To put it another way, let's take your C++ class and make a C version of it:
// C++ version
#include <stdio.h>
class foo
{
private:
int bar;
public:
int get_one()
{
return 1;
}
int get_bar()
{
return this->bar;
}
int get_foo(int i)
{
return this->bar + i;
}
};
int main(int argc, char** argv)
{
foo f = foo(); // value-initialize so that bar is 0 instead of indeterminate
printf("%d\n", f.get_one());
printf("%d\n", f.get_bar());
printf("%d\n", f.get_foo(10));
return 0;
}
/* C version */
#include <stdio.h>
typedef struct foo
{
int bar;
} foo;
int foo_get_one(foo *this)
{
return 1;
}
int foo_get_bar(foo *this)
{
return this->bar;
}
int foo_get_foo(int i, foo *this)
{
return this->bar + i;
}
int main(int argc, char** argv)
{
foo f = {0}; /* initialize bar so that reading it is well defined */
printf("%d\n", foo_get_one(&f));
printf("%d\n", foo_get_bar(&f));
printf("%d\n", foo_get_foo(10, &f));
return 0;
}
When the C++ program is compiled and assembled, the `this` pointer is "added" to the mangled function so that it "knows" which object the member function was called on.
So `foo::get_one` might be "mangled" to the C equivalent of `foo_get_one(foo *this)`, `foo::get_bar` could be mangled to `foo_get_bar(foo *this)`, `foo::get_foo(int)` could be `foo_get_foo(int, foo *this)`, etc.
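The same idea can be observed from within C++ itself. The sketch below uses `std::mem_fn` from `<functional>`, which wraps a pointer to member function in a callable that takes the object as an explicit argument. The constructor and the `call_*` wrappers are hypothetical additions for this example; how a compiler actually passes the object is still up to the implementation.

```cpp
#include <functional>

class foo
{
private:
    int bar;
public:
    explicit foo(int b) : bar(b) {}
    int get_one() { return 1; }
    int get_bar() { return this->bar; }
    int get_foo(int i) { return this->bar + i; }
};

// std::mem_fn produces a callable whose first argument is the object,
// which makes the otherwise hidden parameter visible at the call site.
int call_get_one(foo &f) { return std::mem_fn(&foo::get_one)(f); }
int call_get_bar(foo &f) { return std::mem_fn(&foo::get_bar)(f); }
int call_get_foo(foo &f, int i) { return std::mem_fn(&foo::get_foo)(f, i); }
```

Note that even `get_one`, which ignores the object, cannot be invoked through `std::mem_fn` without supplying one.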
Would both of the functions (`get_one` and `get_bar`) take this as an implicit parameter even though only `get_bar` uses it? It seems like a bit of a waste to do so.
This is up to the compiler. Even if no optimizations are enabled, its heuristics might still eliminate the `this` pointer from a mangled function that does not need the object (to save stack space), but that is highly dependent on the code, how it is compiled, and the target system.
More specifically, if the function is as simple as `foo::get_one` (merely returning 1), chances are the compiler will just put the constant 1 in place of the call to `object->get_one()`, eliminating the need for any references or pointers.
Hope that can help.
Semantically, the `this` pointer is always available in a member function, as another user pointed out. That is, you could later change the function to use it without issue (and, in particular, without needing to recompile calling code in other translation units), or, in the case of a virtual function, an overridden version in a subclass could use `this` even if the base implementation didn't.
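The virtual case can be sketched as follows. The classes `base` and `derived` and the function `read_value` are hypothetical names chosen for this example.

```cpp
class base
{
public:
    virtual ~base() {}
    virtual int value() { return 1; } // the base implementation does not touch `this`
};

class derived : public base
{
private:
    int bar;
public:
    explicit derived(int b) : bar(b) {}
    int value() override { return this->bar; } // the override does use `this`
};

// The caller only sees a base reference and cannot know which override will run,
// so the object pointer has to be supplied on every call.
int read_value(base &b) { return b.value(); }
```

Calling `read_value` with a `base` object yields 1, while calling it with a `derived` object yields whatever was stored in `bar`.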
So the remaining interesting question is what performance impact this imposes, if any. There may be a cost to the caller and/or the callee and the cost may be different when inlined and not inlined. We examine all the permutations below:
In the inlined case, the compiler can see both the call site and the function implementation [1], so presumably it doesn't need to follow any particular calling convention and the cost of the hidden `this` pointer should go away. Note also that in this case there is no real distinction between the "caller" code and the "callee" code, since they are combined and optimized together at the call site.
Let's use the following test code:
#include <stdio.h>
class foo
{
private:
int bar;
public:
int get_one_member()
{
return 1; // Not using `this`
}
};
int get_one_global() {
return 2;
}
int main(int argc, char **) {
foo f = foo();
if(argc) {
puts("a");
return f.get_one_member();
} else {
puts("b");
return get_one_global();
}
}
Note that the two puts calls are just there to make the branches a bit more different. Otherwise the compilers are smart enough to just use a conditional set/move, and you can't really separate the inlined bodies of the two functions.
All of gcc, icc and clang inline the two calls and generate code that is equivalent for the member and non-member function, without any trace of the `this` pointer in the member case. Let's look at the clang code since it's the cleanest:
main:
push rax
test edi,edi
je 400556 <main+0x16>
# this is the member case
mov edi,0x4005f4
call 400400 <puts@plt>
mov eax,0x1
pop rcx
ret
# this is the non-member case
mov edi,0x4005f6
call 400400 <puts@plt>
mov eax,0x2
pop rcx
ret
Both paths generate the exact same series of 4 instructions leading up to the final ret: two instructions for the puts call, a single instruction to mov the return value of 1 or 2 into eax, and a pop rcx to clean up the stack [2]. So the actual call took exactly one instruction in either case, and there was no `this` pointer manipulation or passing at all.
In the out-of-line case, supporting the `this` pointer does have some real-but-generally-small costs, at least on the caller side.
We use a similar test program, but with the member function defined out-of-line and with inlining of those functions disabled [3]:
class foo
{
private:
int bar;
public:
int __attribute__ ((noinline)) get_one_member();
};
int foo::get_one_member()
{
return 1; // Not using `this`
}
int __attribute__ ((noinline)) get_one_global() {
return 2;
}
int main(int argc, char **) {
foo f = foo();
return argc ? f.get_one_member() : get_one_global();
}
This test code is somewhat simpler than the last one because it doesn't need the puts call to distinguish the two branches.
Let's look at the assembly that gcc [4] generates for main (i.e., at the call sites for the functions):
main:
test edi,edi
jne 400409 <main+0x9>
# the global branch
jmp 400530 <get_one_global()>
# the member branch
lea rdi,[rsp-0x18]
jmp 400520 <foo::get_one_member()>
nop WORD PTR cs:[rax+rax*1+0x0]
nop DWORD PTR [rax]
Here, both function calls are actually realized using jmp. This is a type of tail-call optimization: since they are the last functions called in main, the ret of the called function returns directly to the caller of main. But the caller of the member function pays an extra price:
lea rdi,[rsp-0x18]
That loads the address of the object f, which lives on the stack, into rdi. Under this calling convention rdi receives the first argument, which for C++ member functions is `this`. So there is a (small) extra cost.
Now, while the call site pays some cost to pass an (unused) `this` pointer, in this case at least, the actual function bodies are still equally efficient:
foo::get_one_member():
mov eax,0x1
ret
get_one_global():
mov eax,0x2
ret
Both are composed of a single mov and a ret. So the function itself can simply ignore the `this` value since it isn't used.
This raises the question of whether this is true in general: will the function body of a member function that doesn't use `this` always be compiled as efficiently as an equivalent non-member function?
The short answer is no, at least for most modern ABIs that pass arguments in registers. The `this` pointer takes up a parameter register in the calling convention, so you'll hit the maximum number of register-passed arguments one parameter sooner when compiling a member function.
Take for example this function that simply adds its six int parameters together:
int add6(int a, int b, int c, int d, int e, int f) {
return a + b + c + d + e + f;
}
When compiled as a member function on an x86-64 platform using the SysV ABI, one argument has to be passed on the stack, because the `this` pointer occupies one of the six integer argument registers. This results in code like this:
foo::add6_member(int, int, int, int, int, int):
add esi,edx
mov eax,DWORD PTR [rsp+0x8]
add ecx,esi
add ecx,r8d
add ecx,r9d
add eax,ecx
ret
Note the read from the stack, mov eax,DWORD PTR [rsp+0x8], which will generally add a few cycles of latency [5] and one instruction on gcc [6] versus the non-member version, which has no memory reads:
add6_nonmember(int, int, int, int, int, int):
add edi,esi
add edx,edi
add ecx,edx
add ecx,r8d
lea eax,[rcx+r9*1]
ret
Now, you won't usually have six or more arguments to a function (especially very short, performance-sensitive ones), but this at least shows that even on the callee code-generation side, the hidden `this` pointer isn't always free.
Note also that while the examples used x86-64 codegen and the SysV ABI, the same basic principles would apply to any ABI that passes some arguments in registers.
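For anyone who wants to reproduce the comparison, here is a minimal sketch of the two functions whose assembly is shown above. The function names follow the assembly listings; the empty class body is an assumption, and the inlining-control attributes from the earlier test program are left out to keep the code portable, so whether the stack read shows up depends on the compiler, its flags and the target ABI.

```cpp
class foo
{
public:
    // Seven arguments in total at the ABI level: the hidden object pointer plus six ints.
    int add6_member(int a, int b, int c, int d, int e, int f);
};

int foo::add6_member(int a, int b, int c, int d, int e, int f)
{
    return a + b + c + d + e + f; // does not use `this`
}

// Six arguments in total, so all of them can fit in registers under the SysV x86-64 ABI.
int add6_nonmember(int a, int b, int c, int d, int e, int f)
{
    return a + b + c + d + e + f;
}
```

Both functions compute the same result; only the generated code for reading the arguments may differ.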
[1] Note that this optimization only applies easily to effectively non-virtual functions, since only then can the compiler know the actual function implementation.
[2] I guess that's what it's for: it undoes the push rax at the top of the function so that rsp has the correct value on return, but I don't know why the push/pop pair needs to be there in the first place. Other compilers use different strategies, such as sub rsp,8 and add rsp,8.
[3] In practice, you aren't really going to disable inlining like this, but the failure to inline would happen just because the methods are in different compilation units. Because of the way godbolt works, I can't exactly do that, so disabling inlining has the same effect.
[4] Oddly, I couldn't get clang to stop inlining either function, either with attribute noinline or with -fno-inline.
[5] In fact, often a few cycles more than the usual L1-hit latency of 4 cycles on Intel, due to store-forwarding of the recently written value.
[6] In principle, on x86 at least, the one-instruction penalty can be eliminated by using an add with a memory source operand, rather than a mov from memory with a subsequent reg-reg add, and in fact clang and icc do exactly that. I don't think one approach dominates though: the gcc approach with a separate mov is better able to move the load off the critical path, initiating it early and then using it only in the last instruction, while the icc approach adds 1 cycle to the critical path involving the mov, and the clang approach seems the worst of all, stringing all the adds together into one long dependency chain on eax which ends with the memory read.