Suppose you have a class that has private members which are accessed a lot in a program (such as in a loop which has to be fast). Imagine I have defined something like this:
class Foo
{
public:
Foo(unsigned set)
: vari(set)
{}
const unsigned& read_vari() const { return vari; }
private:
unsigned vari;
};
The reason I would like to do it this way is because, once the class is created, "vari" shouldn't be changed anymore. Thus, to minimize bug occurrence, "it seemed like a good idea at the time".
However, if I now need to call this function millions of times, I was wondering if there is an overhead and a slowdown instead of simply using:
struct Foo
{
unsigned vari;
};
So, was my first impule right in using a class, to avoid anyone mistakenly changing the value of the variable after it has been set by the constructor? Also, does this introduce a "penalty" in the form of a function call overhead. (Assuming I use optimization flags in the compiler, such as -O2 in GCC)?
They should come out to be the same. Remember that frustrating time you tried to use the operator[]
on a vector and gdb
just replied optimized out
? This is what will happen here. The compiler will not create a function call here but it will rather access the variable directly.
Let's have a look at the following code
struct foo{
int x;
int& get_x(){
return x;
}
};
int direct(foo& f){
return f.x;
}
int fnc(foo& f){
return f.get_x();
}
Which was compiled with g++ test.cpp -o test.s -S -O2
. The -S
flag tells the compiler to "Stop after the stage of compilation proper; do not assemble (quote from the g++ manpage)." This is what the compiler gives us:
_Z6directR3foo:
.LFB1026:
.cfi_startproc
movl (%rdi), %eax
ret
and
_Z3fncR3foo:
.LFB1027:
.cfi_startproc
movl (%rdi), %eax
ret
as you can see, no function call was made in the second case and they are both the same. Meaning there is no performance overhead in using the accessor method.
bonus: what happens if optimizations are turned off? same code, here are the results:
_Z6directR3foo:
.LFB1022:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
movl (%rax), %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
and
_Z3fncR3foo:
.LFB1023:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
movq %rax, %rdi
call _ZN3foo5get_xEv #<<<call to foo.get_x()
movl (%rax), %eax
leave
.cfi_def_cfa 7, 8
ret
As you can see without optimizations, the sturct is faster than the accessor, but who ships code without optimizations?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With