Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do you correctly calculate a running average of large sets of numbers with 32-bit floating point?

I'm writing a path tracer and have to collect an average over a large number of samples per pixel. I get significant visual differences between a 1024-samples run and a 16384-samples run; the 16384-samples run is darker. I conjecture that this is because the 16384-samples image runs into floating-point precision errors. I average the color values by dividing each value by 16384, then adding them together.

Is there a way to average a large, difficult to compute set of numbers with a known magnitude, while minimizing rounding error? Preferably without requiring non-constant memory, and definitely without discarding any samples?

like image 577
FeepingCreature Avatar asked Sep 09 '11 12:09

FeepingCreature


2 Answers

You probably want the Kahan summation algorithm. This is a simple and effective way of minimising cumulative rounding errors when summing a large number of points with finite precision floating point arithmetic.

like image 180
Paul R Avatar answered Oct 31 '22 19:10

Paul R


Since you're dividing by a power of 2 and your numbers are not ultra-small, this step shouldn't have any influence on the accuracy of the result. You're just subtracting 14 from the exponent.

What counts is the number of bits in your samples.

Floats give you 24 bits of precision. If you have 2^14 = 16384 samples, then when you add them all up, you'll gradually lose precision until at some point anyhing after the 24-14=10th bit gets lost. In other words: at that point you're only keeping about 3 decimal digits.

Would it be possible to use an int as an accumulator, or even a uint? That way you'll keep 8 extra bits, twice as much as the difference between 1024 and 16384 samples.

There is a second, entirely different option. I don't know what the range of your samples is, but if they are around the same size, you can subtract an approximate average from each value, average the differences, and add the approximate average back on at the end.

How much you gain by this method depends on how good your initial approximation of the average is, and how close the values are to the average. So I'd say it's probably less reliable in your situation.

like image 31
Jeffrey Sax Avatar answered Oct 31 '22 21:10

Jeffrey Sax