optimization of access to members in c++

Question

I'm running into an inconsistent optimization behavior with different compilers for the following code:

class tester
{
public:
    tester(int* arr_, int sz_)
        : arr(arr_), sz(sz_)
    {}

    int doadd()
    {
        sm = 0;
        for (int n = 0; n < 1000; ++n) 
        {
            for (int i = 0; i < sz; ++i)
            {
                sm += arr[i];
            }
        }
        return sm;
    }
protected:
    int* arr;
    int sz;
    int sm;
};

The doadd function simulates some intensive access to members (ignore the overflows in addition for this question). Compared with similar code implemented as a function:

int arradd(int* arr, int sz)
{
    int sm = 0;
    for (int n = 0; n < 1000; ++n) 
    {
        for (int i = 0; i < sz; ++i)
        {
            sm += arr[i];
        }
    }
    return sm;
}

The doadd method runs about 1.5 times slower than the arradd function when compiled in Release mode with Visual C++ 2008. When I modify the doadd method to be as follows (aliasing all members with locals):

int doadd()
{
    int mysm = 0;
    int* myarr = arr;
    int mysz = sz;
    for (int n = 0; n < 1000; ++n) 
    {
        for (int i = 0; i < mysz; ++i)
        {
            mysm += myarr[i];
        }
    }
    sm = mysm;
    return sm;
}

Runtimes become roughly the same. Am I right in concluding that this is a missing optimization by the Visual C++ compiler? g++ seems to do it better and run both the member function and the normal function at the same speed when compiling with -O2 or -O3.

The benchmarking is done by invoking the doadd member and arradd function on some sufficiently large array (a few millions of integers in size).

EDIT: Some fine-grained testing shows that the main culprit is the sm member. Replacing all others by local versions still makes the runtime long, but once I replace sm by mysm the runtime becomes equal to the function version.

alt text

Resolution

Dissapointed with the answers (sorry guys), I shaked off my laziness and dove into the disassembly listings for this code. My answer below summarizes the findings. In short: it has nothing to do with aliasing, it has all to do with loop unrolling, and with some strange heuristics MSVC applies when deciding which loop to unroll.

Paul R · Accepted Answer

It may be an aliasing issue - the compiler can not know that the instance variable sm will never be pointed at by arr, so it has to treat sm as if it were effectively volatile, and save it on every iteration. You could make sm a different type to test this hypothesis. Or just use a temporary local sum (which will get cached in a register) and assign it to sm at the end.

optimization of access to members in c++

Tags:

c++

optimization

g++

visual-c++

Resolution

Eli Bendersky

1 Answers

Paul R

Recent Activity

Donate For Us

optimization of access to members in c++

Tags:

c++

optimization

g++

visual-c++

Resolution

Eli Bendersky

1 Answers

Paul R

Related questions

Recent Activity

Donate For Us