
Time Series Breakout/Change/Disturbance Detection in R: strucchange, changepoint, BreakoutDetection, bfast, and more

I would like this to become a signpost for the various time series breakout/change/disturbance detection methods in R. My question: what are the motivations behind, and the differences between, the approaches taken by each of the following packages? That is, when does it make more sense to use one approach over another, what are the similarities and differences, and so on?

Packages in question:

  • strucchange (example here)
  • changepoint (example here)
  • BreakoutDetection (link includes simple example)
  • qcc's Control Charts (tutorial here)
  • bfast
  • Perhaps (?) to a lesser extent: AnomalyDetection and mvoutlier

I am hoping for targeted answers, perhaps a paragraph for each method. It is easy to slap each of these across a time series, but that can come at the cost of abusing or violating assumptions. There are resources that provide guidelines for supervised/unsupervised ML techniques. I (and surely others) would appreciate some guideposts/pointers for this area of time-series analysis.

JasonAizkalns asked Mar 23 '15 17:03



1 Answer

Two very different motivations have led to time-series analysis:

  1. Industrial quality control and outlier detection: detecting deviations from a stable noise level.
  2. Scientific understanding of trends, where the trends and their determinants are of central importance.

Of course, the two are to a large extent two sides of the same coin, and outlier detection can be important for cleaning a time series before trend analysis. I will nevertheless use this distinction as a guiding thread to explain the diversity of packages R offers for studying time series.

In quality control, the stability of the mean and of the standard deviation is of major importance, as exemplified by the history of one of the first statistical efforts to maintain industrial quality, the control chart. In this respect, qcc is a reference implementation of the most classical quality control diagrams: Shewhart, cusum and EWMA charts.
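To make the control-chart idea concrete, here is a minimal base R sketch of a Shewhart rule and an EWMA chart. It does not use qcc itself (which provides these charts through functions such as qcc(), cusum() and ewma()). The simulated data, the 50-point calibration window, the 3-sigma limits and lambda = 0.2 are all illustrative assumptions.

```r
# Simulated process: in control for 80 points, then the mean shifts up by 1.5 sd.
# (Illustrative data only.)
set.seed(1)
x <- c(rnorm(80, mean = 10, sd = 1), rnorm(40, mean = 11.5, sd = 1))

# Estimate the in-control mean and sd from an assumed calibration window.
calib <- x[1:50]
mu <- mean(calib)
sigma <- sd(calib)

# Shewhart rule: flag single observations beyond mu +/- 3 sigma.
shewhart_flag <- abs(x - mu) > 3 * sigma

# EWMA: z_t = lambda * x_t + (1 - lambda) * z_{t-1}, started at mu.
lambda <- 0.2
z <- numeric(length(x))
prev <- mu
for (t in seq_along(x)) {
  prev <- lambda * x[t] + (1 - lambda) * prev
  z[t] <- prev
}

# Time-varying half-width of the EWMA control limits.
t_idx <- seq_along(x)
half_width <- 3 * sigma *
  sqrt(lambda / (2 - lambda) * (1 - (1 - lambda)^(2 * t_idx)))
ewma_flag <- abs(z - mu) > half_width

# Index of the first EWMA signal (NA if the chart never signals).
first_ewma_signal <- which(ewma_flag)[1]
```

The Shewhart rule reacts to single large deviations, whereas the EWMA accumulates evidence over time and is therefore better suited to small, persistent shifts in the mean.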

The old but still active mvoutlier and the more recent AnomalyDetection focus on outlier detection. mvoutlier mainly uses the Mahalanobis distance and can work with two-dimensional datasets (rasters) and even multi-dimensional datasets using the algorithm of Filzmoser, Maronna, and Werner (2007). AnomalyDetection uses time series decomposition to identify both local anomalies (outliers) and global anomalies (variations not explained by seasonal patterns).
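As a rough illustration of distance-based outlier detection, the base R sketch below flags points whose squared Mahalanobis distance exceeds a chi-square quantile. It is a simplification: it uses the classical mean and covariance from base R's mahalanobis(), not the robust estimators mvoutlier relies on. The simulated data and the 0.975 quantile are assumptions made for the example.

```r
# Two correlated variables plus one point that goes against the correlation.
# (Illustrative data only.)
set.seed(42)
n <- 200
x1 <- rnorm(n)
x2 <- 0.8 * x1 + 0.6 * rnorm(n)
X <- rbind(cbind(x1, x2), c(4, -4))   # row n + 1 is the planted outlier

# Classical (non-robust) location and scatter estimates.
center <- colMeans(X)
S <- cov(X)

# mahalanobis() returns SQUARED distances.
d2 <- mahalanobis(X, center, S)

# Under multivariate normality, d2 is approximately chi-square with p degrees
# of freedom; the 0.975 quantile is an assumed, conventional cutoff.
cutoff <- qchisq(0.975, df = ncol(X))
outliers <- which(d2 > cutoff)
```

Neither coordinate of the planted point is extreme on its own scale; it stands out because it contradicts the positive correlation, which is exactly what the Mahalanobis distance accounts for.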

Like AnomalyDetection, BreakoutDetection was open-sourced by Twitter in 2014. It aims to detect breakouts in time series, that is, groups of anomalies, using non-parametric statistics. The detection of breakouts comes very close to the detection of trends and the understanding of patterns. In a similar vein, the bcpa package focuses on the analysis of irregularly sampled time series, particularly to identify behavioral changes in animal movement.
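To show why a non-parametric approach helps separate a breakout from an isolated anomaly, here is a small base R sketch using a rank-based, Pettitt-type statistic. This is a swapped-in technique chosen for brevity, not the algorithm BreakoutDetection implements, and the toy series is an assumption.

```r
# Deterministic toy series: the level shifts after observation 25, and
# observation 10 is a single extreme spike (an anomaly, not a breakout).
x <- c(rep(1, 25), rep(3, 25)) + 0.2 * cos(1:50)
x[10] <- 50

n <- length(x)

# Pettitt-type statistic: U[k] = sum over i <= k < j of sign(x[i] - x[j]).
# It depends only on the ordering of the values, so one huge spike counts
# no more than any other observation.
s <- sign(outer(x, x, "-"))
U <- sapply(1:(n - 1), function(k) sum(s[1:k, (k + 1):n]))

# Estimated change location: last index of the first regime.
k_hat <- which.max(abs(U))
```

Here k_hat lands on the level shift at observation 25 despite the spike at observation 10, which a criterion based on raw means or variances would be pulled toward.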

Shifting definitively to the detection of changes in trends, changepoint implements multiple (simple) frequentist and non-parametric methods to detect single or multiple breaks in time series. strucchange makes it possible to fit, plot, and test trend changes using regression models. Finally, bfast builds on strucchange to analyze raster (e.g. satellite image) time series and handles missing data.
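The idea shared by these packages can be sketched in base R for the simplest case: a single break in the mean, found by least squares and then re-expressed as a regression with a level-shift dummy. This is only a toy version under assumed data. In practice, functions such as cpt.mean() in changepoint and breakpoints() in strucchange handle multiple breaks, penalties and inference.

```r
# Deterministic toy series: level 0 for 30 points, then level 5, plus a
# small wiggle. (Illustrative data only.)
x <- c(rep(0, 30), rep(5, 20)) + 0.1 * sin(1:50)

# Cost of a segment: residual sum of squares around its own mean.
rss <- function(v) sum((v - mean(v))^2)

n <- length(x)
candidates <- 2:(n - 2)   # keep at least 2 points in each segment

# Total cost of splitting after position k, for every candidate k.
cost <- sapply(candidates, function(k) rss(x[1:k]) + rss(x[(k + 1):n]))
k_hat <- candidates[which.min(cost)]   # last index of the first segment

# Regression view of the same break: a dummy that switches on after k_hat.
after <- as.integer(seq_len(n) > k_hat)
fit <- lm(x ~ after)
shift <- unname(coef(fit)["after"])    # estimated size of the level shift
```

The scan over candidate splits is the changepoint-style view of the problem, and the dummy-variable regression is the strucchange-style view. With a single break in the mean they give the same answer.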

cmbarbu answered Nov 10 '22 00:11
