
Real-time anomaly detection

I would like to do anomaly detection in R on a real-time stream of sensor data. I would like to explore using either Twitter's AnomalyDetection package or anomalous.

I am trying to think of the most efficient way to do this, as some online sources suggest R is not suitable for real-time anomaly detection. See https://anomaly.io/anomaly-detection-twitter-r. Should I use the stream package to implement my own data stream source? If I do so, is there any "rule-of-thumb" as to how much data I should stream in order to have a sufficient amount of data (perhaps that is what I need to experiment with)? Is there any way of doing the anomaly detection in-database rather than in-application to speed things up?

Ilana L asked Nov 19 '15 05:11



1 Answer

My experience is that if you want real-time anomaly detection, you need to apply an online learning algorithm (rather than a batch one), ideally running on each sample as it is collected or generated. To do so, you would need to modify the existing open-source packages to run in online mode and adapt the model parameters for each sample that is processed. I'm not aware of an open-source package that does this, though. For example, if you're computing a very simple anomaly detector based on the normal distribution, all you need to do is update the mean and variance of each metric with each sample that arrives. If you want the model to be adaptive, you'll need to add a forgetting factor (e.g., exponential forgetting) to control the "memory" of the mean and variance. Another algorithm that lends itself to online learning is Holt-Winters. There are several R implementations of it, though you would still have to make it run in online mode for it to be real time.
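
To make the simple normal-distribution case concrete, here is a minimal sketch of an online detector with exponential forgetting. It is written in Python only for illustration; the update rules are plain arithmetic and should port directly to R. The class name `EwmaAnomalyDetector` and the parameters `alpha`, `threshold`, and `warmup` are hypothetical names chosen for this example, not part of any existing package, and the defaults are arbitrary starting points rather than recommended values.

```python
import math


class EwmaAnomalyDetector:
    """Online Gaussian anomaly detector with exponential forgetting (illustrative sketch)."""

    def __init__(self, alpha=0.05, threshold=3.0, warmup=10):
        self.alpha = alpha          # forgetting factor: larger means shorter memory
        self.threshold = threshold  # flag samples more than this many std devs from the mean
        self.warmup = warmup        # number of samples to observe before flagging anything
        self.n = 0
        self.mean = 0.0
        self.var = 0.0

    def update(self, x):
        """Score x against the current model, then fold x into the model."""
        self.n += 1
        if self.n == 1:
            # The first sample just initialises the mean.
            self.mean = x
            return False

        std = math.sqrt(self.var)
        is_anomaly = (
            self.n > self.warmup
            and std > 0
            and abs(x - self.mean) > self.threshold * std
        )

        # Exponentially weighted updates of the running mean and variance.
        diff = x - self.mean
        incr = self.alpha * diff
        self.mean += incr
        self.var = (1 - self.alpha) * (self.var + diff * incr)
        return is_anomaly


if __name__ == "__main__":
    detector = EwmaAnomalyDetector()
    # Made-up readings: a steady signal followed by one spike.
    readings = [9.9 if i % 2 == 0 else 10.1 for i in range(50)] + [20.0]
    for i, value in enumerate(readings):
        if detector.update(value):
            print(f"sample {i}: {value} flagged as anomalous")
```

Each call to `update` costs constant time and memory, which is what makes this usable per sample. Note that in this sketch a flagged sample is still folded into the model; whether to skip or down-weight anomalous samples during the update is a design choice that depends on the data.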

I gave a talk on this topic at the Big Data, Analytics & Applied Machine Learning - Israeli Innovation Conference last May. The video is at: https://www.youtube.com/watch?v=SrOM2z6h_RQ (DISCLAIMER: I am the chief data scientist for Anodot, a commercial company doing real time anomaly detection).

Ira answered Sep 21 '22 00:09
