Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Maintaining Floating Point Accuracy with a Running Average

I need to calculate the mean-squared error of a 16-bit operation for an arbitrary number of data points (upwards of 100 million). I decided to go with a running average so I wouldn't have to worry about overflow from adding a large number of squared errors. At 100 million samples I had problems with floating point precision (inaccurate results) so I moved to double.

Here is my code

int iDifference = getIdeal() - getValue();

m_iCycles++;


// calculate the running MSE as

// http://en.wikipedia.org/wiki/Moving_average

// MSE(i + 1) = MSE(i) + (E^2 - MSE(i))/(i + 1)

m_dMSE = m_dMSE + ((pow((double)iDifference,2) - m_dMSE) / (double)m_iCycles);

Is there a better way to implement this to maintain accuracy? I considered normalizing the MSE to one and simply keeping a sum with a final division on completion to calculate the average.

like image 250
Jason George Avatar asked Jan 27 '11 19:01

Jason George


2 Answers

You might want to look at the Kahan Summation Algorithm - it's not exactly what you need here but it solves a very similar problem and you may be able to adapt it to your needs.

like image 80
Paul R Avatar answered Oct 30 '22 06:10

Paul R


Floating point numbers don't overflow in this kind of situation, they only lose precision. So there are no advantages of a running average over a running total here. The consequence is the same whether the running total or the denominator grows.

To maintain precision in a running total, keep subtotals instead of a single total. Just keep adding to a subtotal until adding one more would cause overflow. Then move on to the next subtotal. Since they are all the same order of magnitude (in base 2), optimal precision may be achieved by converting to floating point and using a pairwise accumulation into one final total.

// first = errors, second = counter
typedef pair< vector< uint32_t >, uint32_t > running_subtotals;

void accumulate_error( uint32_t error, running_subtotals &acc ) {
    ( numeric_limits< uint32_t >::max() - error < acc.first.back()?
        * acc.first.insert( acc.first.end(), 0 ) : acc.first.back() )
        += error; // add error to current subtotal, or new one if needed
    ++ acc.second; // increment counter
}

double get_average_error( running_subtotals const &total ) {
    vector< double > acc( total.first.begin(), total.first.end() );
    while ( acc.size() != 1 ) {
        if ( acc.size() % 2 ) acc.push_back( 0 );
        for ( size_t index = 0; index < acc.size() / 2; ++ index ) {
            acc[ index ] = acc[ index * 2 ] + acc[ index * 2 + 1 ];
        }
        acc.resize( acc.size() / 2 );
    }
    return acc.front() / total.second;
}
like image 36
Potatoswatter Avatar answered Oct 30 '22 05:10

Potatoswatter