Smoothing values over time: moving average or something better?

I'm coding something at the moment where I'm taking a bunch of values over time from a hardware compass. This compass is very accurate and updates very often, with the result that if it jiggles slightly, I end up with the odd value that's wildly inconsistent with its neighbours. I want to smooth those values out.

Having done some reading around, it would appear that what I want is a high-pass filter, a low-pass filter or a moving average. Moving average I can get down with, just keep a history of the last 5 values or whatever, and use the average of those values downstream in my code where I was once just using the most recent value.

That should, I think, smooth out those jiggles nicely, but it strikes me that it's probably quite inefficient, and this is probably one of those Known Problems to Proper Programmers to which there's a really neat Clever Math solution.

I am, however, one of those awful self-taught programmers without a shred of formal education in anything even vaguely related to CompSci or Math. Reading around a bit suggests that this may be a high or low pass filter, but I can't find anything that explains in terms comprehensible to a hack like me what the effect of these algorithms would be on an array of values, let alone how the math works. The answer given here, for instance, technically does answer my question, but only in terms comprehensible to those who would probably already know how to solve the problem.

It would be a very lovely and clever person indeed who could explain the sort of problem this is, and how the solutions work, in terms understandable to an Arts graduate.

Henry Cooke asked Sep 21 '10 13:09



1 Answer

If you are trying to remove the occasional odd value, a low-pass filter is the best of the three options that you have identified. A low-pass filter lets slow changes through, such as those caused by rotating a compass by hand, while rejecting fast changes, such as those caused by bumps in the road.
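To make that concrete, here is a minimal sketch of one common low-pass filter, exponential smoothing, in Python. It is only an illustration under assumptions: the function name `low_pass` and the `alpha` value are made up for this example, and it treats readings as plain numbers, so it ignores the wrap-around from 359 to 0 degrees that real compass headings have.

```python
def low_pass(samples, alpha=0.2):
    """Exponentially smooth a sequence of readings.

    alpha is a hypothetical tuning value in (0, 1]: smaller values
    smooth more heavily, and alpha == 1 passes the input through.
    """
    smoothed = []
    previous = None
    for x in samples:
        if previous is None:
            previous = x  # seed the filter with the first reading
        else:
            # move a fraction alpha of the way toward the new reading
            previous = previous + alpha * (x - previous)
        smoothed.append(previous)
    return smoothed


readings = [10.0, 10.0, 10.0, 50.0, 10.0, 10.0]
result = low_pass(readings, alpha=0.2)
```

With these made-up readings, the spike of 50 only pulls the output up to about 18, and the output then drifts back toward 10. The glitch is attenuated rather than removed, so some of it still leaks into the following values.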

A moving average will probably not be sufficient, since a single "blip" in your data will affect several subsequent values; how many depends on the size of your moving average window.
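A small Python sketch shows how far one blip spreads. The function name `moving_average`, the window of 5 and the sample data are all assumptions chosen for illustration, not anything from the question.

```python
from collections import deque


def moving_average(samples, window=5):
    """Return the running mean over the last `window` samples."""
    history = deque(maxlen=window)  # oldest sample drops off automatically
    averages = []
    for x in samples:
        history.append(x)
        averages.append(sum(history) / len(history))
    return averages


# One glitch of 60 in an otherwise steady signal of 10
readings = [10.0] * 5 + [60.0] + [10.0] * 6
averaged = moving_average(readings, window=5)

# Count how many output values were pulled away from 10 by the glitch
affected = sum(1 for a in averaged if a != 10.0)
```

Here the single bad reading shifts five consecutive outputs (one per window position it occupies) from 10 to 20 before it finally drops out of the history.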

If the odd values are easily detected, you may even be better off with a glitch-removal algorithm that completely ignores them:

    if (abs(thisValue - averageOfLast10Values) > someThreshold) {
        thisValue = averageOfLast10Values;
    }

Here is a quick graph to illustrate:

[Figure: graph comparison]

The first graph is the input signal, with one unpleasant glitch. The second graph shows the effect of a 10-sample moving average. The final graph is a combination of the 10-sample average and the simple glitch detection algorithm shown above. When the glitch is detected, the 10-sample average is used instead of the actual value.
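For anyone who wants to try the glitch-detection idea, here is a rough, runnable Python sketch of it. The names `reject_glitches`, `window` and `threshold` are hypothetical, and the threshold of 20 is an arbitrary value that would need tuning. One assumption worth flagging: the sketch stores the cleaned value in the history, so a glitch never pollutes the average, but a genuine sudden jump larger than the threshold would also be rejected indefinitely.

```python
from collections import deque


def reject_glitches(samples, window=10, threshold=20.0):
    """Replace readings that stray too far from the recent average.

    window and threshold are hypothetical tuning values.
    """
    history = deque(maxlen=window)  # the last `window` accepted values
    cleaned = []
    for x in samples:
        if history:
            average = sum(history) / len(history)
            if abs(x - average) > threshold:
                x = average  # treat as a glitch: use the average instead
        history.append(x)  # assumption: store the cleaned value, not the raw one
        cleaned.append(x)
    return cleaned


readings = [10.0, 11.0, 10.0, 9.0, 10.0, 80.0, 10.0, 11.0]
cleaned = reject_glitches(readings)
```

In this made-up data the reading of 80 is replaced by the average of the five values before it, and every other reading passes through unchanged.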

e.James answered Sep 21 '22 21:09