Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do you calculate the average of a set of circular data? [closed]

People also ask

How do you find the average of a set?

How to Calculate Average. The average of a set of numbers is simply the sum of the numbers divided by the total number of values in the set. For example, suppose we want the average of 24 , 55 , 17 , 87 and 100 . Simply find the sum of the numbers: 24 + 55 + 17 + 87 + 100 = 283 and divide by 5 to get 56.6 .

How do you find the new average of a set of data?

To find the arithmetic mean of a data set, all you need to do is add up all the numbers in the data set and then divide the sum by the total number of values.


Compute unit vectors from the angles and take the angle of their average.


This question is examined in detail in the book: "Statistics On Spheres", Geoffrey S. Watson, University of Arkansas Lecture Notes in the Mathematical Sciences, 1983 John Wiley & Sons, Inc. as mentioned at http://catless.ncl.ac.uk/Risks/7.44.html#subj4 by Bruce Karsh.

A good way to estimate an average angle, A, from a set of angle measurements a[i] 0<=i

                   sum_i_from_1_to_N sin(a[i])
a = arctangent ---------------------------
                   sum_i_from_1_to_N cos(a[i])

The method given by starblue is computationally equivalent, but his reasons are clearer and probably programmatically more efficient, and also work well in the zero case, so kudos to him.

The subject is now explored in more detail on Wikipedia, and with other uses, like fractional parts.


I see the problem - for example, if you have a 45' angle and a 315' angle, the "natural" average would be 180', but the value you want is actually 0'.

I think Starblue is onto something. Just calculate the (x, y) cartesian coordinates for each angle, and add those resulting vectors together. The angular offset of the final vector should be your required result.

x = y = 0
foreach angle {
    x += cos(angle)
    y += sin(angle)
}
average_angle = atan2(y, x)

I'm ignoring for now that a compass heading starts at north, and goes clockwise, whereas "normal" cartesian coordinates start with zero along the X axis, and then go anti-clockwise. The maths should work out the same way regardless.


FOR THE SPECIAL CASE OF TWO ANGLES:

The answer ( (a + b) mod 360 ) / 2 is WRONG. For angles 350 and 2, the closest point is 356, not 176.

The unit vector and trig solutions may be too expensive.

What I've got from a little tinkering is:

diff = ( ( a - b + 180 + 360 ) mod 360 ) - 180
angle = (360 + b + ( diff / 2 ) ) mod 360
  • 0, 180 -> 90 (two answers for this: this equation takes the clockwise answer from a)
  • 180, 0 -> 270 (see above)
  • 180, 1 -> 90.5
  • 1, 180 -> 90.5
  • 20, 350 -> 5
  • 350, 20 -> 5 (all following examples reverse properly too)
  • 10, 20 -> 15
  • 350, 2 -> 356
  • 359, 0 -> 359.5
  • 180, 180 -> 180

ackb is right that these vector based solutions cannot be considered true averages of angles, they are only an average of the unit vector counterparts. However, ackb's suggested solution does not appear to mathematically sound.

The following is a solution that is mathematically derived from the goal of minimising (angle[i] - avgAngle)^2 (where the difference is corrected if necessary), which makes it a true arithmetic mean of the angles.

First, we need to look at exactly which cases the difference between angles is different to the difference between their normal number counterparts. Consider angles x and y, if y >= x - 180 and y <= x + 180, then we can use the difference (x-y) directly. Otherwise, if the first condition is not met then we must use (y+360) in the calculation instead of y. Corresponding, if the second condition is not met then we must use (y-360) instead of y. Since the equation of the curve we are minimising only changes at the points where these inequalities change from true to false or vice versa, we can separate the full [0,360) range into a set of segments, separated by these points. Then, we only need to find the minimum of each of these segments, and then the minimum of each segment's minimum, which is the average.

Here's an image demonstrating where the problems occur in calculating angle differences. If x lies in the gray area then there will be a problem.

Angle comparisons

To minimise a variable, depending on the curve, we can take the derivative of what we want to minimise and then we find the turning point (which is where the derivative = 0).

Here we will apply the idea of minimise the squared difference to derive the common arithmetic mean formula: sum(a[i])/n. The curve y = sum((a[i]-x)^2) can be minimised in this way:

y = sum((a[i]-x)^2)
= sum(a[i]^2 - 2*a[i]*x + x^2)
= sum(a[i]^2) - 2*x*sum(a[i]) + n*x^2

dy\dx = -2*sum(a[i]) + 2*n*x

for dy/dx = 0:
-2*sum(a[i]) + 2*n*x = 0
-> n*x = sum(a[i])
-> x = sum(a[i])/n

Now applying it to curves with our adjusted differences:

b = subset of a where the correct (angular) difference a[i]-x c = subset of a where the correct (angular) difference (a[i]-360)-x cn = size of c d = subset of a where the correct (angular) difference (a[i]+360)-x dn = size of d

y = sum((b[i]-x)^2) + sum(((c[i]-360)-b)^2) + sum(((d[i]+360)-c)^2)
= sum(b[i]^2 - 2*b[i]*x + x^2)
  + sum((c[i]-360)^2 - 2*(c[i]-360)*x + x^2)
  + sum((d[i]+360)^2 - 2*(d[i]+360)*x + x^2)
= sum(b[i]^2) - 2*x*sum(b[i])
  + sum((c[i]-360)^2) - 2*x*(sum(c[i]) - 360*cn)
  + sum((d[i]+360)^2) - 2*x*(sum(d[i]) + 360*dn)
  + n*x^2
= sum(b[i]^2) + sum((c[i]-360)^2) + sum((d[i]+360)^2)
  - 2*x*(sum(b[i]) + sum(c[i]) + sum(d[i]))
  - 2*x*(360*dn - 360*cn)
  + n*x^2
= sum(b[i]^2) + sum((c[i]-360)^2) + sum((d[i]+360)^2)
  - 2*x*sum(x[i])
  - 2*x*360*(dn - cn)
  + n*x^2

dy/dx = 2*n*x - 2*sum(x[i]) - 2*360*(dn - cn)

for dy/dx = 0:
2*n*x - 2*sum(x[i]) - 2*360*(dn - cn) = 0
n*x = sum(x[i]) + 360*(dn - cn)
x = (sum(x[i]) + 360*(dn - cn))/n

This alone is not quite enough to get the minimum, while it works for normal values, that has an unbounded set, so the result will definitely lie within set's range and is therefore valid. We need the minimum within a range (defined by the segment). If the minimum is less than our segment's lower bound then the minimum of that segment must be at the lower bound (because quadratic curves only have 1 turning point) and if the minimum is greater than our segment's upper bound then the segment's minimum is at the upper bound. After we have the minimum for each segment, we simply find the one that has the lowest value for what we're minimising (sum((b[i]-x)^2) + sum(((c[i]-360)-b)^2) + sum(((d[i]+360)-c)^2)).

Here is an image to the curve, which shows how it changes at the points where x=(a[i]+180)%360. The data set is in question is {65,92,230,320,250}.

Curve

Here is an implementation of the algorithm in Java, including some optimisations, its complexity is O(nlogn). It can be reduced to O(n) if you replace the comparison based sort with a non comparison based sort, such as radix sort.

static double varnc(double _mean, int _n, double _sumX, double _sumSqrX)
{
    return _mean*(_n*_mean - 2*_sumX) + _sumSqrX;
}
//with lower correction
static double varlc(double _mean, int _n, double _sumX, double _sumSqrX, int _nc, double _sumC)
{
    return _mean*(_n*_mean - 2*_sumX) + _sumSqrX
            + 2*360*_sumC + _nc*(-2*360*_mean + 360*360);
}
//with upper correction
static double varuc(double _mean, int _n, double _sumX, double _sumSqrX, int _nc, double _sumC)
{
    return _mean*(_n*_mean - 2*_sumX) + _sumSqrX
            - 2*360*_sumC + _nc*(2*360*_mean + 360*360);
}

static double[] averageAngles(double[] _angles)
{
    double sumAngles;
    double sumSqrAngles;

    double[] lowerAngles;
    double[] upperAngles;

    {
        List<Double> lowerAngles_ = new LinkedList<Double>();
        List<Double> upperAngles_ = new LinkedList<Double>();

        sumAngles = 0;
        sumSqrAngles = 0;
        for(double angle : _angles)
        {
            sumAngles += angle;
            sumSqrAngles += angle*angle;
            if(angle < 180)
                lowerAngles_.add(angle);
            else if(angle > 180)
                upperAngles_.add(angle);
        }


        Collections.sort(lowerAngles_);
        Collections.sort(upperAngles_,Collections.reverseOrder());


        lowerAngles = new double[lowerAngles_.size()];
        Iterator<Double> lowerAnglesIter = lowerAngles_.iterator();
        for(int i = 0; i < lowerAngles_.size(); i++)
            lowerAngles[i] = lowerAnglesIter.next();

        upperAngles = new double[upperAngles_.size()];
        Iterator<Double> upperAnglesIter = upperAngles_.iterator();
        for(int i = 0; i < upperAngles_.size(); i++)
            upperAngles[i] = upperAnglesIter.next();
    }

    List<Double> averageAngles = new LinkedList<Double>();
    averageAngles.add(180d);
    double variance = varnc(180,_angles.length,sumAngles,sumSqrAngles);

    double lowerBound = 180;
    double sumLC = 0;
    for(int i = 0; i < lowerAngles.length; i++)
    {
        //get average for a segment based on minimum
        double testAverageAngle = (sumAngles + 360*i)/_angles.length;
        //minimum is outside segment range (therefore not directly relevant)
        //since it is greater than lowerAngles[i], the minimum for the segment
        //must lie on the boundary lowerAngles[i]
        if(testAverageAngle > lowerAngles[i]+180)
            testAverageAngle = lowerAngles[i];

        if(testAverageAngle > lowerBound)
        {
            double testVariance = varlc(testAverageAngle,_angles.length,sumAngles,sumSqrAngles,i,sumLC);

            if(testVariance < variance)
            {
                averageAngles.clear();
                averageAngles.add(testAverageAngle);
                variance = testVariance;
            }
            else if(testVariance == variance)
                averageAngles.add(testAverageAngle);
        }

        lowerBound = lowerAngles[i];
        sumLC += lowerAngles[i];
    }
    //Test last segment
    {
        //get average for a segment based on minimum
        double testAverageAngle = (sumAngles + 360*lowerAngles.length)/_angles.length;
        //minimum is inside segment range
        //we will test average 0 (360) later
        if(testAverageAngle < 360 && testAverageAngle > lowerBound)
        {
            double testVariance = varlc(testAverageAngle,_angles.length,sumAngles,sumSqrAngles,lowerAngles.length,sumLC);

            if(testVariance < variance)
            {
                averageAngles.clear();
                averageAngles.add(testAverageAngle);
                variance = testVariance;
            }
            else if(testVariance == variance)
                averageAngles.add(testAverageAngle);
        }
    }


    double upperBound = 180;
    double sumUC = 0;
    for(int i = 0; i < upperAngles.length; i++)
    {
        //get average for a segment based on minimum
        double testAverageAngle = (sumAngles - 360*i)/_angles.length;
        //minimum is outside segment range (therefore not directly relevant)
        //since it is greater than lowerAngles[i], the minimum for the segment
        //must lie on the boundary lowerAngles[i]
        if(testAverageAngle < upperAngles[i]-180)
            testAverageAngle = upperAngles[i];

        if(testAverageAngle < upperBound)
        {
            double testVariance = varuc(testAverageAngle,_angles.length,sumAngles,sumSqrAngles,i,sumUC);

            if(testVariance < variance)
            {
                averageAngles.clear();
                averageAngles.add(testAverageAngle);
                variance = testVariance;
            }
            else if(testVariance == variance)
                averageAngles.add(testAverageAngle);
        }

        upperBound = upperAngles[i];
        sumUC += upperBound;
    }
    //Test last segment
    {
        //get average for a segment based on minimum
        double testAverageAngle = (sumAngles - 360*upperAngles.length)/_angles.length;
        //minimum is inside segment range
        //we test average 0 (360) now           
        if(testAverageAngle < 0)
            testAverageAngle = 0;

        if(testAverageAngle < upperBound)
        {
            double testVariance = varuc(testAverageAngle,_angles.length,sumAngles,sumSqrAngles,upperAngles.length,sumUC);

            if(testVariance < variance)
            {
                averageAngles.clear();
                averageAngles.add(testAverageAngle);
                variance = testVariance;
            }
            else if(testVariance == variance)
                averageAngles.add(testAverageAngle);
        }
    }


    double[] averageAngles_ = new double[averageAngles.size()];
    Iterator<Double> averageAnglesIter = averageAngles.iterator();
    for(int i = 0; i < averageAngles_.length; i++)
        averageAngles_[i] = averageAnglesIter.next();


    return averageAngles_;
}

The arithmetic mean of a set of angles may not agree with your intuitive idea of what the average should be. For example, the arithmetic mean of the set {179,179,0,181,181} is 216 (and 144). The answer you immediately think of is probably 180, however it is well known that the arithmetic mean is heavily affected by edge values. You should also remember that angles are not vectors, as appealing as that may seem when dealing with angles sometimes.

This algorithm does of course also apply to all quantities that obey modular arithmetic (with minimal adjustment), such as the time of day.

I would also like to stress that even though this is a true average of angles, unlike the vector solutions, that does not necessarily mean it is the solution you should be using, the average of the corresponding unit vectors may well be the value you actually should to be using.


You have to define average more accurately. For the specific case of two angles, I can think of two different scenarios:

  1. The "true" average, i.e. (a + b) / 2 % 360.
  2. The angle that points "between" the two others while staying in the same semicircle, e.g. for 355 and 5, this would be 0, not 180. To do this, you need to check if the difference between the two angles is larger than 180 or not. If so, increment the smaller angle by 360 before using the above formula.

I don't see how the second alternative can be generalized for the case of more than two angles, though.