Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to efficiently calculate a moving Standard Deviation

Tags:

Below you can see my C# method to calculate Bollinger Bands for each point (moving average, up band, down band).

As you can see this method uses 2 for loops to calculate the moving standard deviation using the moving average. It used to contain an additional loop to calculate the moving average over the last n periods. This one I could remove by adding the new point value to total_average at the beginning of the loop and removing the i - n point value at the end of the loop.

My question now is basically: Can I remove the remaining inner loop in a similar way I managed with the moving average?

    public static void AddBollingerBands(SortedList<DateTime, Dictionary<string, double>> data, int period, int factor)     {         double total_average = 0;          for (int i = 0; i < data.Count(); i++)         {             total_average += data.Values[i]["close"];              if (i >= period - 1)             {                 double total_bollinger = 0;                 double average = total_average / period;                  for (int x = i; x > (i - period); x--)                 {                     total_bollinger += Math.Pow(data.Values[x]["close"] - average, 2);                 }                  double stdev = Math.Sqrt(total_bollinger / period);                  data.Values[i]["bollinger_average"] = average;                 data.Values[i]["bollinger_top"] = average + factor * stdev;                 data.Values[i]["bollinger_bottom"] = average - factor * stdev;                  total_average -= data.Values[i - period + 1]["close"];             }         }     } 
like image 532
Christiaan Wevers Avatar asked Jan 31 '13 21:01

Christiaan Wevers


People also ask

What is a rolling standard deviation?

Moving Standard Deviation is a statistical measurement of market volatility. It makes no predictions of market direction, but it may serve as a confirming indicator. You specify the number of periods to use, and the study computes the standard deviation of prices from the moving average of the prices.

How do you calculate moving data?

Calculating Speed, Time, & Data. Calculate the transfer speed by dividing the amount of data by the transfer time. Plug the amount of data (A) and transfer time (T) to solve for the rate, or speed (S), into the equation S = A ÷ T. For example, you might have transferred 25 MB in 2 minutes.


2 Answers

The problem with approaches that calculate the sum of squares is that it and the square of sums can get quite large, and the calculation of their difference may introduce a very large error, so let's think of something better. For why this is needed, see the Wikipedia article on Algorithms for computing variance and John Cook on Theoretical explanation for numerical results)

First, instead of calculating the stddev let's focus on the variance. Once we have the variance, stddev is just the square root of the variance.

Suppose the data are in an array called x; rolling an n-sized window by one can be thought of as removing the value of x[0] and adding the value of x[n]. Let's denote the averages of x[0]..x[n-1] and x[1]..x[n] by µ and µ’ respectively. The difference between the variances of x[0]..x[n-1] and x[1]..x[n] is, after canceling out some terms and applying (a²-b²) = (a+b)(a-b):

Var[x[1],..,x[n]] - Var[x[0],..,x[n-1]]  = (\sum_1^n x[i]² - n µ’²)/(n-1) - (\sum_0^{n-1} x[i]² - n µ²)/(n-1) = (x[n]² - x[0]² - n(µ’² - µ²))/(n-1)  = (x[n]-µ’ + x[0]-µ)(x[n]-x[0])/(n-1) 

Therefore the variance is perturbed by something that doesn't require you to maintain the sum of squares, which is better for numerical accuracy.

You can calculate the mean and variance once in the beginning with a proper algorithm (Welford's method). After that, every time you have to replace a value in the window x[0] by another x[n] you update the average and variance like this:

new_Avg = Avg + (x[n]-x[0])/n new_Var = Var + (x[n]-new_Avg + x[0]-Avg)(x[n] - x[0])/(n-1) new_StdDev = sqrt(new_Var) 
like image 66
Joni Avatar answered Sep 29 '22 20:09

Joni


The answer is yes, you can. In the mid-80's I developed just such an algorithm (probably not original) in FORTRAN for a process monitoring and control application. Unfortunately, that was over 25 years ago and I do not remember the exact formulas, but the technique was an extension of the one for moving averages, with second order calculations instead of just linear ones.


After looking at your code some, I am think that I can suss out how I did it back then. Notice how your inner loop is making a Sum of Squares?:

            for (int x = i; x > (i - period); x--)             {                 total_bollinger += Math.Pow(data.Values[x]["close"] - average, 2);             } 

in much the same way that your average must have originally had a Sum of Values? The only two differences are the order (its power 2 instead of 1) and that you are subtracting the average each value before you square it. Now that might look inseparable, but in fact they can be separated:

SUM(i=1; n){ (v[i] - k)^2 } 

is

SUM(i=1..n){v[i]^2 -2*v[i]*k + k^2} 

which becomes

SUM(i=1..n){v[i]^2 -2*v[i]*k} + k^2*n 

which is

SUM(i=1..n){v[i]^2} + SUM(i=1..n){-2*v[i]*k} + k^2*n 

which is also

SUM(i=1..n){v[i]^2} + SUM(i=1..n){-2*v[i]}*k + k^2*n 

Now the first term is just a Sum of Squares, you handle that in the same way that you do the sum of Values for the average. The last term (k^2*n) is just the average squared times the period. Since you divide the result by the period anyway, you can just add the new average squared without the extra loop.

Finally, in the second term (SUM(-2*v[i]) * k), since SUM(v[i]) = total = k*n you can then change it into this:

-2 * k * k * n 

or just -2*k^2*n, which is -2 times the average squared, once the period (n) is divided out again. So the final combined formula is:

SUM(i=1..n){v[i]^2} - n*k^2 

or

SUM(i=1..n){values[i]^2} - period*(average^2) 

(be sure to check the validity of this, since I am deriving it off the top of my head)

And incorporating into your code should look something like this:

public static void AddBollingerBands(ref SortedList<DateTime, Dictionary<string, double>> data, int period, int factor) {     double total_average = 0;     double total_squares = 0;      for (int i = 0; i < data.Count(); i++)     {         total_average += data.Values[i]["close"];         total_squares += Math.Pow(data.Values[i]["close"], 2);          if (i >= period - 1)         {             double total_bollinger = 0;             double average = total_average / period;              double stdev = Math.Sqrt((total_squares - Math.Pow(total_average,2)/period) / period);             data.Values[i]["bollinger_average"] = average;             data.Values[i]["bollinger_top"] = average + factor * stdev;             data.Values[i]["bollinger_bottom"] = average - factor * stdev;              total_average -= data.Values[i - period + 1]["close"];             total_squares -= Math.Pow(data.Values[i - period + 1]["close"], 2);         }     } } 
like image 28
RBarryYoung Avatar answered Sep 29 '22 18:09

RBarryYoung