Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I use Theanets LSTM RNN's on my time series data?

I have a simple dataframe consisting of one column. In that column are 10320 observations (numerical). I'm simulating Time-Series data by inserting the data into a plot with a window of 200 observations each. Here is the code for plotting.

import matplotlib.pyplot as plt
from IPython import display
fig_size = plt.rcParams["figure.figsize"]
import time
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
fig, axes = plt.subplots(1,1, figsize=(19,5))
df = dframe.set_index(arange(0,len(dframe)))
std = dframe[0].std() * 6
window = 200
iterations = int(len(dframe)/window)
i = 0
dframe = dframe.set_index(arange(0,len(dframe)))
while i< iterations:
    frm = window*i
    if i == iterations:
        to = len(dframe)
    else:
        to = frm+window
    df = dframe[frm : to]
    if len(df) > 100:
        df = df.set_index(arange(0,len(df)))
        plt.gca().cla() 
        plt.plot(df.index, df[0])
        plt.axhline(y=std, xmin=0, xmax=len(df[0]),c='gray',linestyle='--',lw = 2, hold=None)
        plt.axhline(y=-std , xmin=0, xmax=len(df[0]),c='gray',linestyle='--', lw = 2, hold=None)
        plt.ylim(min(dframe[0])- 0.5 , max(dframe[0]) )
        plt.xlim(-50,window+50)
        display.clear_output(wait=True)
        display.display(plt.gcf()) 
        canvas = FigureCanvas(fig)
        canvas.print_figure('fig.png', dpi=72, bbox_inches='tight')
    i += 1
plt.close()

This simulates a flow of real-time data and visualizes it. What I want is to apply theanets RNN LSTM to the data to detect anomalies unsupervised. Because I am doing it unsupervised I don't think that I need to split my data into training and test sets. I haven't found much of anything that makes sense to me so far and have been googling for about 2 hours. Just hoping that you guys may be able to help. I want to put the prediction output of the RNN on the graph as well and define a threshold that, if the error is too large, the values will be identified as anomalous. If you need more information please comment and let me know. Thank you!

like image 313
Ravaal Avatar asked Mar 17 '16 16:03

Ravaal


1 Answers

READING

  1. Like neurons, LSTM networks are build of interconnected LSTM Blocks whose training is done via BackPropogation Through Time.
  2. Classical anomaly detection using time series required prediction of time series output in future (at one or more points) and finding error on these points with true values. Prediction Error above a threshold will reflect and amomly

SOLUTION

Having said this

  1. You've to train network so you need training sets and test sets both
  2. Use N inputs to predict M outputs (decide upon N and M with experimentation - values for which training error is low)
  3. Scroll a window of (N+M) elements in input data and use this data array of (N+M) items also termed as frame to train or test network.
  4. Typically we use 90% of starting series for training and 10% for testing.

This scheme will fail as if training is not proper there will be false prediction errors which are not-anomaly. So make sure to provide enough training, and most important shuffle training frames and consider all variations.

like image 191
SACn Avatar answered Nov 15 '22 12:11

SACn