Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance difference between multiple pointer dereferences vs references

Tags:

c++

This was an interview question asked from me a few months back:

Which of the following functions will execute faster, Foo1 or Foo2?

void Foo(SomeObjectArray** array, unsigned int size)
{
    for (int i = 0; i < size; i++)
    {
        if (((*array) + i) != NULL)
        {
            ((*array) + i)->Operation1();
            ((*array) + i)->Operation2();
            ((*array) + i)->Operation3();
            ((*array) + i)->Operation4();
            ((*array) + i)->Operation5();
            ((*array) + i)->Operation6();
    }
}

void Foo(SomeObjectArray** array, unsigned int size)
{
    for (int i = 0; i < size; i++)
    {
        if (*((*array) + i) != NULL)
        {
            Object& obj = *((*array) + i);
            obj.Operation1();
            obj.Operation2();
            obj.Operation3();
            obj.Operation4();
            obj.Operation5();
            obj.Operation6();
        }
    }
}

Note that this is from memory so I don't remember the code exactly, but the general idea is the same. One function is using the pointer while the other uses a reference (it may have a pointer to an array like in the above code, but I don't remember exactly). I said I am not sure, and will have to profile the code to find out, but if I had to guess Foo2 'may' be faster. They were not impressed...

This nagged me a couple of times here and there when I ran across code similar to this (or wrote it) and would wonder what I should do in such a case.

I know that...

  1. This is a micro optimization
  2. The compiler will most likely optimize it

EDIT: I have changed the code slightly so that now it is checking for a NULL pointer.

like image 586
Samaursa Avatar asked Apr 28 '11 16:04

Samaursa


People also ask

Which is faster pointer or reference?

It's much faster and memory-efficient to copy a pointer than to copy many of the things a pointer is likely to point to. A reference is stored in as many bytes as required to hold an address on the computer. This often makes reference much smaller than the things they refer to.

Is pointer dereferencing expensive?

Dereferencing, when translated into machine code, can mean different things depending on what you do with the dereferenced object. Accessing a single member of a class through a pointer is typically cheap.

Are references safer than pointers?

In C++, Reference variables are safer than pointers because reference variables must be initialized and they cannot be changed to refer to something else once they are initialized.

Is pass by value efficient?

In short: It is almost always more efficient to pass native types (int, float, double) by value than by reference. Not only because a pointer is - in most cases - bigger or as big as the native datatype, but also because it is much harder for the optimizer to optimize reference parameters than value parameters.


3 Answers

I think this is a pretty interesting question, and I saw a lot of speculation about what a compiler might do, but I wanted to take a closer look and find out for sure. So I took e.James' program and ran it through GCC to get the assembly. I should say that I don't really know assembly, so somebody correct me if I'm wrong, but I think we can reasonably infer what's going on. :)

Compiling with -O0 (no optimization)

For Foo1, we see that the array offset is calculated before each function call:

movl    8(%ebp), %eax
movl    (%eax), %edx
movl    -4(%ebp), %eax
leal    (%edx,%eax), %eax
movl    %eax, (%esp)
call    __ZN10SomeObject10Operation1Ev

That's for all six method calls, just using different method names. Foo2 has a little bit of setup code to get the reference

movl    8(%ebp), %eax
movl    (%eax), %edx
movl    -4(%ebp), %eax
leal    (%edx,%eax), %eax
movl    %eax, -8(%ebp)

And then six of these, which looks like just a stack pointer push and a function call:

movl    -8(%ebp), %eax
movl    %eax, (%esp)
call    __ZN10SomeObject10Operation1Ev

Pretty much what we would expect with no optimization. The output was

Foo1: 18472
Foo2: 17684

Compiling with -O1 (minimal optimization)

Foo1 is a little more efficient, but still adding up the array offset each time:

movl    %esi, %eax
addl    (%ebx), %eax
movl    %eax, (%esp)
call    __ZN10SomeObject10Operation1Ev

Foo2 looks saves the value of ebx (addl (%edi), %ebx), and then makes these calls:

movl    %ebx, (%esp)
call    __ZN10SomeObject10Operation1Ev

The timings here were

Foo1: 4979
Foo2: 4977

Compiling with -O2 (moderate optimization)

When compiling with -O2, GCC just got rid of the whole thing and each call to Foo1 or Foo2 just resulted in adding 594 to dummy (99 increments * 6 calls = 594 increments):

imull   $594, %eax, %eax
addl    %eax, _dummy

There were no calls the the object's methods, although those methods remained in the code. As we might expect, the times here were

Foo1: 1
Foo2: 0

I think this tells us that Foo2 is a little faster without optimizations, but really it's a moot point because once it starts optimizing, the compiler is just moving around a couple of longs between the stack and the registers.

like image 96
Steve Blackwell Avatar answered Oct 15 '22 22:10

Steve Blackwell


Strictly speaking and with no optimizations at all I'd say Foo2 is faster cause Foo1 has to calculate the indirection ptr each time, but thats not gonna happen anywhere.

I'd say the compiler will optimize it and leave it the same.
Looks like there is plenty of room for the compiler, i and array don't change for the whole block on each iteration, so it could optimize it putting the pointer into a register, exactly the same it would do for the reference.

like image 43
Arkaitz Jimenez Avatar answered Oct 15 '22 20:10

Arkaitz Jimenez


Likely the compiler will optimize them to be the same, given the common sub-expressions on each line. No guarantees though.

There aren't any rational conclusions you can reach by arm-chair reasoning like this, with today's compilers and processors. The only way to know is to try it and time it. If someone didn't make that clear in their interview answer, it would be an automatic fail from me.

like image 23
Mark Ransom Avatar answered Oct 15 '22 21:10

Mark Ransom