
Best strategy to compute average with high precision

I was comparing two algorithms computing the average of random numbers.

  • The first algorithm sums all the numbers and divides by the item count at the end
  • The second algorithm recomputes the average on every iteration, reusing the previous result each time a new value is received

I suppose there's nothing revolutionary here, and I'm not a mathematician so I can't put a name on those two algorithms.

Here is my code:

#include <cstddef>   // size_t
#include <cstdlib>   // std::rand, RAND_MAX
#include <iomanip>   // std::setprecision
#include <iostream>

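// Algorithm 1: accumulate the sum and divide once, when the average is requested.
class Average1
{
public:
    Average1() : total( 0 ), count( 0 ) {}

    void add( double value )
    {
        total += value;
        count++;
    }

    // Note: yields NaN if called before any value was added (0.0/0).
    double average() const
    {
        return total/count;
    }

private:
    double total;
    size_t count;
};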

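// Algorithm 2: keep a running average and update it with every new value.
class Average2
{
public:
    Average2() : av( 0 ), count( 0 ) {}

    void add( double value )
    {
        // Rebuild the previous sum (av*count), add the new value, divide again.
        av = (av*count + value)/(count+1);
        count++;
    }

    double average() const
    {
        return av;
    }

private:
    double av;
    size_t count;
};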

void compare()
{
    Average1 av1;
    Average2 av2;
    double temp;
    for ( size_t i = 0; i != 100000000; ++i )
    {
        temp = static_cast<double>(std::rand()) / static_cast<double>(RAND_MAX);
        av1.add( temp );
        av2.add( temp );
    }

    std::cout << std::setprecision(20) << av1.average() << std::endl;
    std::cout << std::setprecision(20) << av2.average() << std::endl;
}

int main()
{
    compare();
    return 0;
}

Output is:

0.50001084285722707801
0.50001084285744978875

The difference is presumably due to the limited precision of the double type.
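One rough way to check that assumption is to express the gap between the two printed values in units of machine epsilon. The sketch below is only illustrative; the helper name `relativeDifferenceInEpsilons` is hypothetical and not part of the code above.

```cpp
#include <cmath>
#include <limits>

// Relative difference between two doubles, expressed in multiples of
// machine epsilon (about 2.2e-16 for an IEEE 754 double).
// Hypothetical helper, for illustration only. Returns 0 for equal inputs.
double relativeDifferenceInEpsilons( double a, double b )
{
    if ( a == b )
        return 0.0;

    const double eps = std::numeric_limits<double>::epsilon();
    const double scale = std::fmax( std::fabs( a ), std::fabs( b ) );
    return std::fabs( a - b ) / scale / eps;
}
```

Applied to the two results above, this gives a gap of roughly 2000 epsilons, which is small compared with the 100000000 rounded additions each result went through. Note also that a double carries only about 15 to 17 significant decimal digits, so the last few of the 20 printed digits carry no information.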

In the end, which one is the better method? Which one gives the true mathematical average (or the closest to it)?

jpo38 asked Dec 08 '22 22:12


1 Answer

If you really want high precision:

  • consider arbitrary-precision arithmetic (e.g. with GMP)
  • consider the Kahan summation algorithm (beware of compiler issues: aggressive floating-point optimizations such as -ffast-math can optimize the compensation away)
  • consider Shewchuk's algorithm (which is available in Python as math.fsum)

Edit: the Python docs for math.fsum also link to this overview of approaches.

sascha answered Dec 19 '22 20:12