
Smoothing of time-series data without smoothing out peak values in R

I have a three-month time series recorded every 5 minutes, and the data is pretty noisy.
I have already tried some moving-average (MA) methods. They work fine and the resulting curve is fairly smooth, but the peaks are almost smoothed out.

So my question is:

Is there any method to get rid of all this noise in the graph but preserve the peak values?

I have also read something about Kalman filtering, but I am not sure how it works or whether it is suitable for my problem.

I tried the following code:

library(zoo)  # provides rollapply
smooth <- rollapply(PCM4[, 3], width = 10, FUN = mean, align = "center", fill = NA)

I also tried different window widths, which made the resulting data smoother but also reduced the peak values, which is not what I want.

data set:

DateTime            h     v     Q      T
2014-12-18 11:45:00 0.112 0.515 17.141 15.4
2014-12-18 11:50:00 0.113 0.511 17.007 15.5
2014-12-18 11:55:00 0.114 0.518 17.480 15.5

unsmoothed plot: [image]

smoothed plot (width=10): [image]

As you can see, the second plot is quite distorted: the first peak, for example, is at about 250 L/s instead of 500 L/s. The reason is that each point is computed as a rolling mean, which averages the peak together with its lower neighbours.
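To illustrate the effect with made-up numbers (not the real data set), here is a small sketch of how a centered moving average lowers a short peak more and more as the window grows. It uses base R's `stats::filter` instead of `zoo::rollapply` only to stay self-contained; the helper name `moving_average` and the synthetic signal are assumptions for the example.

```r
# Made-up signal (not the real data): baseline 100 with a 6-sample peak at 500
x <- rep(100, 60)
x[30:35] <- 500

# Centered moving average using base R's stats::filter.
# Hypothetical helper; the post itself uses zoo::rollapply.
moving_average <- function(x, width) {
  as.numeric(stats::filter(x, rep(1 / width, width), sides = 2))
}

# Highest smoothed value for three window widths
peak_w3  <- max(moving_average(x, 3),  na.rm = TRUE)  # window fits inside the peak
peak_w10 <- max(moving_average(x, 10), na.rm = TRUE)  # 6 peak + 4 baseline samples
peak_w20 <- max(moving_average(x, 20), na.rm = TRUE)  # 6 peak + 14 baseline samples

print(c(w3 = peak_w3, w10 = peak_w10, w20 = peak_w20))
```

As soon as the window is wider than the peak, the maximum is a weighted mix of peak and baseline, for example (6 * 500 + 4 * 100) / 10 = 340 for width 10.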

But the question remains: is there a better solution that fits my needs?

Foerbian asked Nov 01 '22 03:11



1 Answer

Is there any method to get rid of all this noise in the graph but preserve the peak values?

The challenge here is that you have not really said what is noise and what is signal. Normally, a wildly different ("peak") value would be classified as noise. When people say filtering, they are usually thinking of low-pass filtering (removing high frequency noise and keeping general trends). A sudden peak is going to be noise by that definition.

A Kalman Filter would give you a tool to use if you had a mathematical understanding of your system and its noise. In the KF's "predict" step you would have a mathematical model which would produce an expected value against which you would test your measurement. If you could predict peaks (either their value, or even just when they exist) a KF could help you.
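As a rough sketch of the predict/update cycle, here is a one-dimensional Kalman filter in base R. It assumes the simplest possible model, a random walk ("the next value is the same as the current one"), which is an assumption made for illustration and not a model of the asker's system. The function name `kalman_local_level` and the parameters `q` (process noise variance) and `r` (measurement noise variance) are hypothetical names chosen for this example.

```r
# Minimal 1-D Kalman filter with a random-walk model (illustrative assumption).
# q: process noise variance, r: measurement noise variance.
kalman_local_level <- function(y, q, r, x0 = y[1], p0 = r) {
  n <- length(y)
  xhat <- numeric(n)
  x <- x0  # current state estimate
  p <- p0  # current estimate variance
  for (i in seq_len(n)) {
    # Predict: the model expects no change, so only the uncertainty grows
    p <- p + q
    # Update: blend the prediction with the measurement via the Kalman gain
    k <- p / (p + r)
    x <- x + k * (y[i] - x)
    p <- (1 - k) * p
    xhat[i] <- x
  }
  xhat
}

# Made-up signal: flat at 0 with a single spike of 10 at position 11
y <- c(rep(0, 10), 10, rep(0, 10))
trusting  <- kalman_local_level(y, q = 100,  r = 1)  # follows the spike closely
sceptical <- kalman_local_level(y, q = 0.01, r = 1)  # treats the spike as noise
```

With this generic model the filter cannot tell a real peak from noise; the ratio of `q` to `r` only sets how much every measurement is trusted. Preserving peaks would need a model that actually predicts them, which is the point made above.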

An approach that might help is the "1 Euro" filter (http://www.lifl.fr/~casiez/1euro/). The core idea is that gross movements (your sudden peaks) are likely to be essentially true, while periods of low movement are noisy and should be averaged down. That filter opens up its bandwidth suddenly whenever there's a big movement, and then gradually clamps it down. It was designed for tracking human movements without reflecting the noise from the measurements.
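The idea can be sketched in a few lines of base R. This is a minimal version written from my understanding of the filter's published description, not a reference implementation: the function name `one_euro` is hypothetical, the parameter names `mincutoff`, `beta` and `dcutoff` are assumed to match the original, and the default values are placeholders that would need tuning. `rate` is the sampling rate; with `rate = 1` the cutoffs are expressed in cycles per sample.

```r
# Sketch of a "1 Euro"-style adaptive low-pass filter (assumptions noted above).
one_euro <- function(x, rate = 1, mincutoff = 0.05, beta = 0.01, dcutoff = 1) {
  n <- length(x)
  if (n == 0) return(numeric(0))
  # Smoothing factor of a first-order low-pass filter for a given cutoff
  alpha <- function(cutoff) {
    tau <- 1 / (2 * pi * cutoff)
    1 / (1 + tau * rate)
  }
  out <- numeric(n)
  out[1] <- x[1]
  dx_hat <- 0  # smoothed estimate of the signal's rate of change
  for (i in seq_len(n)[-1]) {
    # Rate of change relative to the previous filtered value
    dx <- (x[i] - out[i - 1]) * rate
    dx_hat <- dx_hat + alpha(dcutoff) * (dx - dx_hat)
    # Fast movement raises the cutoff, so the filter smooths less
    cutoff <- mincutoff + beta * abs(dx_hat)
    a <- alpha(cutoff)
    out[i] <- a * x[i] + (1 - a) * out[i - 1]
  }
  out
}

# Made-up signal: noisy baseline around 100 with a short peak at 500
set.seed(1)
x <- 100 + rnorm(200, sd = 5)
x[100:105] <- 500
filtered <- one_euro(x)
```

A larger `beta` lets the filter react faster to sudden changes (better peak preservation, more noise near the peaks), while a smaller `mincutoff` smooths the quiet stretches more heavily.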

Ben Jackson answered Nov 09 '22 15:11
