I have an algorithm that does some computations on the elements of an array, and I'd like to reuse the input buffer by writing the results back into it. In terms of the data traversal pattern, it looks almost exactly like this (the only other things happening in that for-loop are increments to some pointers and counting variables):
int *inputData = /* input data is here */;
for (int i = 0; i < some_value; ++i)
{
    int result = do_some_computations(*inputData);
    *inputData = result;
    ++inputData;
}
Now the interesting part: inputData contains about six million elements. If I comment out the write to the inputData array, so the algorithm looks basically like this:
int *inputData = /* input data is here */;
for (int i = 0; i < some_value; ++i)
{
    int result = do_some_computations(*inputData);
    // *inputData = result;
    ++inputData;
}
then the algorithm, averaged over ~100 measurements, takes about 7 milliseconds. With the write left in, it takes about 55 milliseconds. Writing "*inputData = do_some_computations(*inputData);" instead makes no difference, and neither does writing to a separate outputBuffer.
This is bad. The performance of this algorithm is absolutely critical to the requirements of the program. I was very happy with 7ms, however I am very unhappy with 55 ms.
Why does this single write-back cause such a large overhead, and how can I fix it?
Your code is being optimised to almost nothing in the non-write-back version: without the store, the compiler can see that the loop's result is never used and eliminate most of the work. To show this, assume a 5 GHz single-core CPU:

7 ms = 35,000,000 cycles
35,000,000 cycles / 6,000,000 items ≈ 5.8 cycles per item = not a lot of work being done

For the slow version:

55 ms = 275,000,000 cycles
275,000,000 cycles / 6,000,000 items ≈ 45.8 cycles per item = far more work per item
If you want to verify this, look at the assembly output from the compiler (for example, compile with gcc/clang's -S flag, or paste the loop into Compiler Explorer) and check whether the call to do_some_computations survives in the version without the store.