Arrow dereferencing p->m is syntactic sugar for (*p).m, which appears like it might involve two separate memory lookup operations--one to find the object on the heap and the second to then locate the member field offset.
This made me question whether there is any performance difference between these two code snippets. Assume classA has 30+ disparate fields of various types which need to be accessed in various orders (not necessarily consecutively or contiguously):
Version 1:
void func(classA* ptr)
{
    std::string s = ptr->field1;
    int i = ptr->field2;
    float f = ptr->field3;
    // etc...
}
Version 2:
void func(classA* ptr)
{
    classA &a = *ptr;
    std::string s = a.field1;
    int i = a.field2;
    float f = a.field3;
    // etc...
}
So my question is whether or not there is a difference in performance (even if very slight) between these two versions, or if the compiler is smart enough to make them equivalent (even if the different field accesses are interrupted by many lines of other code in between them, which I did not show here).
Arrow dereferencing p->m is syntactic sugar for (*p).m
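That isn't generally true, since a class type can overload operator-> (as smart pointers and iterators do), but it is true for a raw pointer, which is the limited context you are asking about.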
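To illustrate where the "syntactic sugar" rule holds and where it does not, here is a minimal sketch. Widget, Handle, and the three helper functions are hypothetical names invented for this example; they do not come from the question.

```cpp
#include <string>

// Hypothetical type, for illustration only.
struct Widget {
    std::string name;
};

// Hypothetical pointer-like wrapper that overloads operator-> but not operator*.
struct Handle {
    Widget* target;
    Widget* operator->() const { return target; }
};

// Raw pointer: p->name means exactly (*p).name.
std::string name_via_arrow(Widget* p) { return p->name; }
std::string name_via_star(Widget* p) { return (*p).name; }

// Class type: h->name means (h.operator->())->name.
// Writing (*h).name would not compile, because Handle defines no operator*.
std::string name_via_handle(const Handle& h) { return h->name; }
```

For the raw classA* in the question, only the first case applies, so p->m and (*p).m mean the same thing there.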
which appears like it might involve two separate memory lookup operations--one to find the object on the heap and the second to then locate the member field offset.
Not at all. The two operations are one to read the parameter or local variable holding the pointer and a second to access the member at a fixed offset from it. But any reasonable optimizer would keep the pointer in a register in the code you showed, so there is no extra access.
But your alternate version also has a local pointer, so no difference anyway (at least in the direction you're asking about):
classA &a = *ptr;
Assuming the whole function is not inlined, or the compiler for some other reason doesn't know exactly where ptr points, the reference a must be implemented as a pointer. Then either the compiler can deduce that it is safe for a to be an alias of *ptr, so there is NO difference, or the compiler must make a an alias of *copy_of_ptr, so the version using the reference is slower (not faster, as you seem to have expected) by the cost of copying ptr.
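One way to check this on your own compiler is to build both versions and compare the generated assembly (for example with -O2 -S on GCC or Clang). The sketch below is only a reduced stand-in: this classA has three fields rather than 30+, and the functions return a value (an addition not in the question) so the optimizer cannot discard the reads entirely.

```cpp
#include <string>

// Hypothetical stand-in for the question's classA, reduced to three fields.
struct classA {
    std::string field1;
    int field2;
    float field3;
};

// Version 1: every access goes through the pointer.
int version1(classA* ptr) {
    std::string s = ptr->field1;
    int i = ptr->field2;
    float f = ptr->field3;
    return static_cast<int>(s.size()) + i + static_cast<int>(f);
}

// Version 2: bind a reference once, then access through it.
int version2(classA* ptr) {
    classA& a = *ptr;
    std::string s = a.field1;
    int i = a.field2;
    float f = a.field3;
    return static_cast<int>(s.size()) + i + static_cast<int>(f);
}
```

With nothing between the accesses that could modify ptr, an optimizing compiler would be expected to emit the same instructions for both functions, but the assembly from your own toolchain is the authoritative answer.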
even if the different field accesses are interrupted by many lines of other code in between them, which I did not show here
That moves you toward the interesting case. If that intervening code could change ptr, then obviously the two versions behave differently. But what if a human can see that the intervening code can't change ptr while a compiler can't? Then the two versions are semantically equal, but the compiler doesn't know that, and it may generate slower code for the version you tried to hand-optimize by creating a reference.
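The behavioral difference can be made concrete with a small sketch. To keep it short, it assumes the pointer is a global (named current here) rather than a function parameter, and intervening_code is a hypothetical stand-in for the code not shown in the question; all names are invented for the example.

```cpp
// Hypothetical type and objects, for illustration only.
struct Node { int value; };

Node first{1};
Node second{2};
Node* current = &first;  // assumed global pointer, unlike the question's parameter

// Stand-in for "many lines of other code"; here it retargets the pointer.
void intervening_code() { current = &second; }

// Re-reads the pointer on each access, so it sees the retargeting.
int read_through_pointer() {
    int before = current->value;
    intervening_code();
    int after = current->value;  // now reads second.value
    return before * 10 + after;
}

// Binds the reference once, so later accesses still refer to first.
int read_through_reference() {
    Node& a = *current;
    int before = a.value;
    intervening_code();
    int after = a.value;         // still reads first.value
    return before * 10 + after;
}
```

When the pointer really cannot change, the two forms agree, and the remaining question is only whether the compiler can prove that for itself.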