I am trying to build an LSTM autoencoder to predict time series data. Since I am new to Python, I have mistakes in the decoding part. I tried to build it following the examples here and in Keras, but I could not understand the difference between the given examples at all. The code that I have right now is shown below.
Question 1: How do I choose batch_size and input_dim when each sample has 2000 values?
Question 2: How do I get this LSTM autoencoder working (the model and the prediction)? This is just the model, but how do I predict with it, say, starting from sample 10 through to the end of the data?
My data has 1500 samples in total. I would go with 10 time steps (or more if that is better), and each sample has 2000 values. If you need more information, I can add it later.
import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, RepeatVector

trainX = np.reshape(data, (1500, 10, 2000))
# Parameters
timesteps = 10
input_dim = 2000
units = 100  # number of units chosen arbitrarily
batch_size = 2000
epochs = 20
# Model
inpE = Input((timesteps, input_dim))
outE = LSTM(units=units, return_sequences=False)(inpE)
encoder = Model(inpE, outE)
inpD = RepeatVector(timesteps)(outE)
outD = LSTM(input_dim, return_sequences=True)(inpD)
# A standalone decoder Model would need its own Input layer, so only the full autoencoder is built here
autoencoder = Model(inpE, outD)
autoencoder.compile(loss='mean_squared_error',
                    optimizer='rmsprop',
                    metrics=['accuracy'])
autoencoder.fit(trainX, trainX,
                batch_size=batch_size,
                epochs=epochs)
encoderPredictions = encoder.predict(trainX)
LSTM-based recurrent neural networks are probably the most powerful approach to learning from sequential data, and time series are only one special case of such data.
An LSTM Autoencoder is an implementation of an autoencoder for sequence data using an Encoder-Decoder LSTM architecture. Once fit, the encoder part of the model can be used to encode or compress sequence data that in turn may be used in data visualizations or as a feature vector input to a supervised learning model.
Using an LSTM, time series forecasting models can predict future values from previous, sequential data. This can improve the accuracy of demand forecasts, which in turn supports better business decisions.
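To make "predict future values from previous data" concrete, here is a minimal NumPy sketch of how a series could be cut into input windows and next-step targets. The helper name `make_forecast_windows` and the small demo array are assumptions for illustration; they are not part of the question or the answer.

```python
import numpy as np


def make_forecast_windows(series, timesteps):
    """Split a (n_samples, n_features) array into input windows and next-step targets.

    X[i] holds samples i .. i+timesteps-1, and y[i] holds sample i+timesteps.
    (Hypothetical helper, not taken from the original post.)
    """
    series = np.asarray(series)
    n_windows = len(series) - timesteps
    if n_windows <= 0:
        raise ValueError("series must be longer than timesteps")
    X = np.stack([series[i:i + timesteps] for i in range(n_windows)])
    y = series[timesteps:]
    return X, y


# Small stand-in for the real data: 20 samples with 3 values each (assumed sizes)
demo = np.arange(60, dtype=float).reshape(20, 3)
X, y = make_forecast_windows(demo, timesteps=10)
print(X.shape, y.shape)  # (10, 10, 3) (10, 3)
```

With 1500 samples and 10 time steps, the same function would yield 1490 windows, and the first target would be the sample at zero-based index 10, which matches "predicting from sample 10 on". Note that a model trained on such (X, y) pairs is a forecaster rather than an autoencoder, so its output layer would have to produce a single sample of size input_dim instead of a reconstructed sequence.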
The autoencoder has also proved to be an effective and efficient method for extracting spatial patterns through unsupervised learning, for example in the prediction and susceptibility assessment of landslide areas.
The LSTM model that I use is this one:
def get_model(n_dimensions):
    # n_dimensions is the size of the encoded (compressed) vector
    inputs = Input(shape=(timesteps, input_dim))
    encoded = LSTM(n_dimensions, return_sequences=False, name="encoder")(inputs)
    decoded = RepeatVector(timesteps)(encoded)
    decoded = LSTM(input_dim, return_sequences=True, name='decoder')(decoded)
    autoencoder = Model(inputs, decoded)
    encoder = Model(inputs, encoded)
    return autoencoder, encoder

autoencoder, encoder = get_model(n_dimensions)
autoencoder.compile(optimizer='rmsprop', loss='mse',
                    metrics=['acc', 'cosine_proximity'])
history = autoencoder.fit(x, x, batch_size=100, epochs=100)
encoded = encoder.predict(x)
It works with the data that I have: x is of size (3000, 180, 40), that is, 3000 samples, timesteps=180 and input_dim=40.
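The same (samples, timesteps, input_dim) layout applies to the data in the question. The reshape there asks for 1500 × 10 × 2000 values, ten times more than the 1500 × 2000 that are available, so it cannot succeed as written. One possible fix, sketched below under the assumption that consecutive samples belong to one continuous series, is to cut the data into non-overlapping sequences of 10 time steps. The zero-filled array is only a placeholder for the real measurements.

```python
import math

import numpy as np

n_samples, timesteps, input_dim = 1500, 10, 2000

# Placeholder for the real measurements: 1500 samples with 2000 values each
data = np.zeros((n_samples, input_dim), dtype=np.float32)

# Cut the samples into consecutive, non-overlapping sequences of 10 time steps
trainX = data.reshape(-1, timesteps, input_dim)
print(trainX.shape)  # (150, 10, 2000)

# batch_size counts sequences per gradient update, not values per sample
steps_per_epoch = {b: math.ceil(len(trainX) / b) for b in (16, 32, 2000)}
print(steps_per_epoch)  # {16: 10, 32: 5, 2000: 1}
```

So input_dim is fixed by the data (2000 values per sample), while batch_size is a free training parameter. With only 150 sequences, batch_size=2000 puts everything into a single batch, which means one weight update per epoch; a smaller value such as 16 or 32 is a common starting point, though the best choice depends on the data.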