void do_something() {....}
struct dummy
{
//even if I don't write this-> myself, the compiler adds it for me
void call_do_something() { this->do_something(); }
void do_something() {....}
};
As far as I know, every class or struct in C++ implicitly uses the this pointer whenever you access a data member or call a member function of the class. Does this carry a performance penalty in C++?
What I mean is:
int main()
{
do_something(); //don't need this pointer
dummy().call_do_something(); //assume the inlining is perfect
return 0;
}
call_do_something needs a this pointer to call the member function, but the C-like do_something doesn't. Does the this pointer carry a performance penalty compared to the C-like function?
I don't intend to do any micro-optimization, since it costs me a lot of time and rarely gives good results; I always follow the rule "measure, don't think". I only want to know out of curiosity whether the this pointer carries a performance penalty.
It depends on the situation, but usually, if you've got optimizations turned on, it shouldn't be any more expensive than the C version. The only time you really "pay" for this and other features is when you're using inheritance and virtual functions. Other than that, the compiler is smart enough not to waste time on this in a function that doesn't use it. Consider the following:
#include <iostream>
void globalDoStuff()
{
std::cout << "Hello world!\n";
}
struct Dummy
{
void doStuff() { callGlobalDoStuff(); }
void callGlobalDoStuff() { globalDoStuff(); }
};
int main()
{
globalDoStuff();
Dummy d;
d.doStuff();
}
Compiled with GCC at optimization level -O3, I get the following disassembly (cutting the extra junk and just showing main()):
_main:
0000000100000dd0 pushq %rbp
0000000100000dd1 movq %rsp,%rbp
0000000100000dd4 pushq %r14
0000000100000dd6 pushq %rbx
0000000100000dd7 movq 0x0000025a(%rip),%rbx
0000000100000dde leaq 0x000000d1(%rip),%r14
0000000100000de5 movq %rbx,%rdi
0000000100000de8 movq %r14,%rsi
0000000100000deb callq 0x100000e62 # bypasses globalDoStuff() and just prints "Hello world!\n"
0000000100000df0 movq %rbx,%rdi
0000000100000df3 movq %r14,%rsi
0000000100000df6 callq 0x100000e62 # bypasses globalDoStuff() and just prints "Hello world!\n"
0000000100000dfb xorl %eax,%eax
0000000100000dfd popq %rbx
0000000100000dfe popq %r14
0000000100000e00 popq %rbp
0000000100000e01 ret
Notice it completely optimized away both the Dummy and globalDoStuff() and just replaced them with the body of globalDoStuff(). globalDoStuff() is never even called, and no Dummy is ever constructed. Instead, the compiler/optimizer replaces that code with two direct calls to the stream output routine that prints "Hello world!\n" (these are ordinary library calls, not system calls). The lesson is that the compiler and optimizer are pretty dang smart, and in general you won't pay for what you don't need.
On the other hand, imagine you have a member function that manipulates a member variable of Dummy. You might think this has a penalty compared to a C function, right? Probably not, because the C function needs a pointer to an object to modify, which, when you think about it, is exactly what the this pointer is to begin with.
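To make that equivalence concrete, here is a minimal sketch. The names Counter, counter_increment and both_forms_agree are hypothetical and not taken from the question. It only illustrates that a member function and a C-style function taking an explicit pointer do the same work; how the pointer is actually passed is up to the platform's calling convention.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type for illustration only.
struct Counter {
    std::uint32_t value = 0;

    // Member function: the object's address arrives as a hidden argument
    // and is available inside the body as `this`.
    void increment() { ++value; }  // same as ++this->value
};

// C-style equivalent: the object's address is passed explicitly.
void counter_increment(Counter* self) { ++self->value; }

// Applies both forms the same number of times and reports whether
// the two objects end up in the same state.
bool both_forms_agree() {
    Counter a;
    Counter b;
    for (int i = 0; i < 3; ++i) {
        a.increment();          // pointer to a is passed implicitly
        counter_increment(&b);  // pointer to b is passed explicitly
    }
    assert(a.value == 3);
    return a.value == b.value;
}
```

Both functions need exactly one pointer to find the object, so there is no extra argument on the member-function side that the C version avoids.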
So in general you won't pay extra for this in C++ compared to C. Virtual functions may have a (small) penalty, as the program has to look up the proper function to call, but that's not the case we're considering here.
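For contrast, the virtual-function case mentioned above might look like the following sketch. Shape, Triangle, count_sides and dispatch_works are hypothetical names. The comments describe how typical implementations behave (a lookup through a per-class table); an optimizer may still resolve the call statically when it can see the concrete type.

```cpp
#include <cassert>

// Hypothetical hierarchy for illustration only.
struct Shape {
    virtual ~Shape() = default;

    // Virtual: the function to run is chosen from the object's dynamic type,
    // typically via a table lookup at run time.
    virtual int sides() const { return 0; }

    // Non-virtual: the target is known at compile time, like a plain function.
    int dimensions() const { return 2; }
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

// Only sees a Shape reference, so the call to sides() is dispatched dynamically.
int count_sides(const Shape& s) { return s.sides(); }

// Checks that dynamic dispatch picks the override and that the
// non-virtual member behaves the same on the derived object.
bool dispatch_works() {
    Triangle t;
    Shape plain;
    assert(count_sides(plain) == 0);
    return count_sides(t) == 3 && t.dimensions() == 2;
}
```

The possible extra cost here comes from the indirect call, not from this itself; both sides() and dimensions() receive the same object pointer.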
If you don't turn on optimizations in your compiler, then yeah, sure, there might be a penalty involved, but... why would you compare non-optimized code?
#include <iostream>
#include <stdint.h>
#include <limits.h>
struct Dummy {
uint32_t counter;
Dummy(): counter(0) {}
void do_something() {
counter++;
}
};
uint32_t counter = 0;
void do_something() { counter++; }
int main(int argc, char **argv) {
Dummy dummy;
if (argc == 1) {
for (int i = 0; i < INT_MAX - 1; i++) {
for (int j = 0; j < 1; j++) {
do_something();
}
}
} else {
for (int i = 0; i < INT_MAX - 1; i++) {
for (int j = 0; j < 1; j++) {
dummy.do_something();
}
}
counter = dummy.counter;
}
std::cout << counter << std::endl;
return 0;
}
Average of 10 runs on gcc version 4.3.5 (Debian 4.3.5-4), 64bit, without any flags:
with global counter: 0m15.062s
with dummy object: 0m21.259s
If I modify the code like this as Lyth suggested:
#include <iostream>
#include <stdint.h>
#include <limits.h>
uint32_t counter = 0;
struct Dummy {
void do_something() {
counter++;
}
};
void do_something() { counter++; }
int main(int argc, char **argv) {
Dummy dummy;
if (argc == 1) {
for (int i = 0; i < INT_MAX; i++) {
do_something();
}
} else {
for (int i = 0; i < INT_MAX; i++) {
dummy.do_something();
}
}
std::cout << counter << std::endl;
return 0;
}
Then, strangely,
with global counter: 0m12.062s
with dummy object: 0m11.860s