What can I assume about C/C++ compiler optimisations?

I would like to know how to avoid wasting my time and risking typos by re-hashing source code when I'm integrating legacy code, library code or sample code into my own codebase.

If I give a simple example, based on an image processing scenario, you might see what I mean.

It's actually not unusual to find I'm integrating a code snippet like this:

for (unsigned int y = 0; y < uHeight; y++)
{
    for (unsigned int x = 0; x < uWidth; x++)
    {
        // do something with this pixel ....
        uPixel = pPixels[y * uStride + x];
    }
}

Over time, I've become accustomed to doing things like moving unnecessary calculations out of the inner loop and maybe changing the postfix increments to prefix ...

for (unsigned int y = 0; y < uHeight; ++y)
{
    unsigned int uRowOffset = y * uStride;
    for (unsigned int x = 0; x < uWidth; ++x)
    {
        // do something with this pixel ....
        uPixel = pPixels[uRowOffset + x];
    }
}

Or, I might use pointer arithmetic, either by row ...

for (unsigned int y = 0; y < uHeight; ++y)
{
    unsigned char *pRow = pPixels + (y * uStride);
    for (unsigned int x = 0; x < uWidth; ++x)
    {
        // do something with this pixel ....
        uPixel = pRow[x];
    }
}

... or by row and column, so I end up with something like this:

unsigned char *pRow = pPixels;
for (unsigned int y = 0; y < uHeight; ++y)
{
    unsigned char *pPixel = pRow;
    for (unsigned int x = 0; x < uWidth; ++x)
    {
        // do something with this pixel ....
        uPixel = *pPixel++;
    }

    // next row
    pRow += uStride;
}

Now, when I write from scratch, I'll habitually apply my own "optimisations" but I'm aware that the compiler will also be doing things like:

  • Moving code from inside loops to outside loops
  • Changing postfix increments to prefix
  • Lots of other stuff that I have no idea about

Every time I mess with a piece of working, tested code in this way, I not only cost myself some time but also run the risk of introducing bugs through finger trouble or whatever (the above examples are simplified). I'm aware of "premature optimisation" and of other ways to improve performance, such as designing better algorithms. In the situations above, though, I'm creating building blocks that will be used in larger pipelined apps, where I can't predict what the non-functional requirements might be, so I just want the code to be as fast and tight as is reasonable within time limits (I mean the time I spend tweaking the code).
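As an illustration of limiting that risk, here is a minimal sketch (not part of the code being integrated) that keeps the original indexed loop as a reference and checks a hand-tweaked pointer version against it. The names sumIndexed, sumPointer, makeTestImage and checkEquivalent are hypothetical, and summing the pixels merely stands in for "do something with this pixel".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference version: the snippet as originally received (index arithmetic).
unsigned long sumIndexed(const unsigned char *pPixels, unsigned int uWidth,
                         unsigned int uHeight, unsigned int uStride)
{
    unsigned long uSum = 0;
    for (unsigned int y = 0; y < uHeight; y++)
        for (unsigned int x = 0; x < uWidth; x++)
            uSum += pPixels[y * uStride + x];
    return uSum;
}

// Hand-tweaked version: pointer arithmetic by row and column.
unsigned long sumPointer(const unsigned char *pPixels, unsigned int uWidth,
                         unsigned int uHeight, unsigned int uStride)
{
    unsigned long uSum = 0;
    const unsigned char *pRow = pPixels;
    for (unsigned int y = 0; y < uHeight; ++y)
    {
        const unsigned char *pPixel = pRow;
        for (unsigned int x = 0; x < uWidth; ++x)
            uSum += *pPixel++;
        pRow += uStride; // next row
    }
    return uSum;
}

// Build a test image of uHeight rows, each uStride bytes, with varied values.
std::vector<unsigned char> makeTestImage(unsigned int uHeight, unsigned int uStride)
{
    std::vector<unsigned char> pixels(static_cast<std::size_t>(uHeight) * uStride);
    for (std::size_t i = 0; i < pixels.size(); ++i)
        pixels[i] = static_cast<unsigned char>((i * 31u + 7u) & 0xFFu);
    return pixels;
}

// Assert that both versions agree on the same test image.
void checkEquivalent(unsigned int uWidth, unsigned int uHeight, unsigned int uStride)
{
    assert(uHeight > 0 && uStride > 0 && uStride >= uWidth);
    std::vector<unsigned char> img = makeTestImage(uHeight, uStride);
    assert(sumIndexed(&img[0], uWidth, uHeight, uStride) ==
           sumPointer(&img[0], uWidth, uHeight, uStride));
}
```

Calling checkEquivalent(5, 4, 8) exercises a padded image where the stride exceeds the width, which is the case most likely to expose a slip in the row arithmetic.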

So, my question is: where can I find out which compiler optimisations are commonly supported by "modern" compilers? I'm using a mixture of Visual Studio 2008 and 2012, but would be interested to know if there are differences with alternatives, e.g. Intel's C/C++ Compiler. Can anyone offer some insight and/or point me at a useful web link, book or other reference?

EDIT
Just to clarify my question

  • The optimisations I showed above were simple examples, not a complete list. I know that it's pointless (from a performance point of view) to make those specific changes because the compiler will do it anyway.
  • I'm specifically looking for information about what optimisations are provided by the compilers I'm using.

Roger Rowland asked Mar 23 '13 08:03


1 Answer

I would expect most of the optimizations that you include as examples to be a waste of time. A good optimizing compiler should be able to do all of this for you.

I can offer three suggestions by way of practical advice:

  1. Profile your code in the context of a real application processing real data. If you can't, come up with some synthetic tests that you think would closely mimic the final system.
  2. Only optimize code that you have demonstrated through profiling to be a bottleneck.
  3. If you are convinced that a piece of code needs optimization, don't just assume that factoring an invariant expression out of a loop will improve performance. Always benchmark, optionally looking at the generated assembly to gain further insight.
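As a rough sketch of the "always benchmark" advice, the following compares the question's indexed loop with the hand-hoisted variant. It assumes a C++11 compiler (for `<chrono>` and lambdas, so Visual Studio 2012 rather than 2008). The names sumNaive, sumHoisted, timeMs, Timings and compareVariants are hypothetical, and summing pixels stands in for the real per-pixel work.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Variant A: index arithmetic recomputed in the inner loop.
unsigned long sumNaive(const unsigned char *pPixels, unsigned int uWidth,
                       unsigned int uHeight, unsigned int uStride)
{
    unsigned long uSum = 0;
    for (unsigned int y = 0; y < uHeight; y++)
        for (unsigned int x = 0; x < uWidth; x++)
            uSum += pPixels[y * uStride + x];
    return uSum;
}

// Variant B: row offset hoisted out of the inner loop by hand.
unsigned long sumHoisted(const unsigned char *pPixels, unsigned int uWidth,
                         unsigned int uHeight, unsigned int uStride)
{
    unsigned long uSum = 0;
    for (unsigned int y = 0; y < uHeight; ++y)
    {
        unsigned int uRowOffset = y * uStride;
        for (unsigned int x = 0; x < uWidth; ++x)
            uSum += pPixels[uRowOffset + x];
    }
    return uSum;
}

// Time `iterations` calls of `fn` and return the elapsed milliseconds.
// Results are accumulated into a volatile sink so they are observably used;
// a clever optimizer may still hoist the call, so check the generated
// assembly if the timings look implausibly small.
template <typename Fn>
double timeMs(Fn fn, int iterations, volatile unsigned long &sink)
{
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        sink = sink + fn();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

struct Timings
{
    double naiveMs;
    double hoistedMs;
    bool sameResult;
};

// Run both variants over a synthetic all-ones image and report timings.
Timings compareVariants(unsigned int uWidth, unsigned int uHeight,
                        unsigned int uStride, int iterations)
{
    assert(uStride >= uWidth);
    std::vector<unsigned char> pixels(static_cast<std::size_t>(uHeight) * uStride, 1);
    const unsigned char *p = pixels.data();
    volatile unsigned long sink = 0;

    Timings t;
    t.naiveMs = timeMs([=] { return sumNaive(p, uWidth, uHeight, uStride); },
                       iterations, sink);
    t.hoistedMs = timeMs([=] { return sumHoisted(p, uWidth, uHeight, uStride); },
                         iterations, sink);
    t.sameResult = sumNaive(p, uWidth, uHeight, uStride) ==
                   sumHoisted(p, uWidth, uHeight, uStride);
    return t;
}
```

A call such as compareVariants(1920, 1080, 1920, 100) in an optimized build gives two numbers to compare. Timings from a synthetic buffer like this are only indicative, so real data in the real application remains the better test.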

The above advice applies to any optimization. However, the last point is particularly relevant to low-level optimizations. These are a bit of a black art because many architectural details are involved: memory hierarchy and bandwidth, instruction pipelining, branch prediction, the use of SIMD instructions, etc.

I think it's better to rely on the compiler writer having a good knowledge of the target architecture than to try and outsmart them.

From time to time you will find through profiling that you need to optimize things by hand. However, these instances will be fairly rare, which will allow you to spend a good deal of energy on things that will actually make a difference.

In the meantime, focus on writing correct and maintainable code.

NPE answered Nov 20 '22 23:11