 

LSTM Autoencoder on timeseries

I'm currently trying to implement an LSTM autoencoder to compress transaction timeseries (the Berka dataset) into a smaller encoded vector. Each series I'm working with is the cumulative balance of a single account over time.

I decided to use Keras, and I tried to create a simple autoencoder following this tutorial, but the model doesn't work.

My code is this:

import keras
from keras import Input, Model
from keras.layers import Lambda, LSTM, RepeatVector
from matplotlib import pyplot as plt
from scipy import io
from sklearn.preprocessing import MinMaxScaler
import numpy as np

class ResultPlotter(keras.callbacks.Callback):
    """Plot four random input series against their reconstructions after each epoch."""
    def on_epoch_end(self, epoch, logs=None):
        plt.subplots(2, 2, figsize=(10, 3))
        indexes = np.random.randint(datapoints, size=4)
        for i in range(4):
            plt.subplot(2, 2, i + 1)
            # Overlay the original series and its reconstruction
            plt.plot(sparse_balances[indexes[i]])
            result = sequence_autoencoder.predict(sparse_balances[indexes[i]:indexes[i] + 1])
            plt.plot(result.T)
            plt.xticks([])
            plt.yticks([])
        plt.tight_layout()
        plt.show()

result_plotter = ResultPlotter()

# Load the balance matrix: one row per account, one column per timestep
sparse_balances = io.mmread("my_path_to_sparse_balances.mtx")
sparse_balances = sparse_balances.todense()
# Scale every value into [0, 1]
scaler = MinMaxScaler(feature_range=(0, 1))
sparse_balances = scaler.fit_transform(sparse_balances)

N = sparse_balances.shape[0]  # number of series
D = sparse_balances.shape[1]  # series length


batch_num = 32    # training batch size
timesteps = 500   # length of each input series
latent_dim = 32   # size of the encoded vector
datapoints = N

# Encoder: expand (batch, timesteps) to (batch, timesteps, 1), then compress
# the whole sequence into a single latent vector
model_inputs = Input(shape=(timesteps,))
inputs = Lambda(lambda x: keras.backend.expand_dims(x, -1))(model_inputs)
encoded = LSTM(latent_dim)(inputs)

# Decoder: repeat the latent vector at every timestep, let an LSTM
# reconstruct the series, then drop the trailing feature dimension
decoded = RepeatVector(timesteps)(encoded)
decoded = LSTM(1, return_sequences=True)(decoded)
decoded = Lambda(lambda x: keras.backend.squeeze(x, -1))(decoded)

sequence_autoencoder = Model(model_inputs, decoded)
encoder = Model(model_inputs, encoded)

earlyStopping = keras.callbacks.EarlyStopping(monitor='loss', patience=5, verbose=0, mode='auto')

sequence_autoencoder.compile(loss='mean_squared_error', optimizer='adam')

sequence_autoencoder.fit(sparse_balances[:datapoints], sparse_balances[:datapoints],
                         batch_size=batch_num, epochs=100,
                         callbacks=[earlyStopping, result_plotter])

I'm not including the code that generates sparse_balances.mtx, to keep everything clear; feel free to ask for it and I will post it.

The problem is that the autoencoder seems to get stuck on predicting a single line, instead of returning outputs that closely follow the trend of the input, and after extensive research I still haven't found a solution. I did some experiments using a dense layer as the latent-to-output part of the model, and it returns much better results.
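
For reference, this is roughly what I mean by the LSTM->Dense variant: the encoder stays the same, but a single dense layer maps the latent vector straight back to the full series. A minimal sketch, reusing timesteps and latent_dim from the code above (the sigmoid is just my choice, since the data is scaled to [0, 1]):

from keras.layers import Dense

dense_model_inputs = Input(shape=(timesteps,))
dense_inputs = Lambda(lambda x: keras.backend.expand_dims(x, -1))(dense_model_inputs)
dense_encoded = LSTM(latent_dim)(dense_inputs)
# Decoder: one dense layer from the latent vector back to the full
# timeseries, instead of RepeatVector + LSTM
dense_decoded = Dense(timesteps, activation='sigmoid')(dense_encoded)

lstm_dense_autoencoder = Model(dense_model_inputs, dense_decoded)
lstm_dense_autoencoder.compile(loss='mean_squared_error', optimizer='adam')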

The question then is: given that the LSTM->Dense and Dense->Dense autoencoders give decent results, while Dense->LSTM and LSTM->LSTM both produce the same bad predictions, is the problem in my model, in the concept, or elsewhere?

Every comment is much appreciated, thanks.

asked Feb 06 '18 by HitLuca



1 Answer

The problem was that my dataset is too niche to be easily autoencoded by LSTMs. I'm currently writing my master's thesis on the topic of transaction generation, and I analyzed this problem in detail. If you are not working with this dataset in particular, I suggest trying some synthetic time-related data, such as sine waves or sawtooth waves, as the model should be able to work correctly on those. If it still doesn't work, you probably have some bugs in your code.
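
For example, something like this generates a quick sanity-check dataset in the same (n_samples, timesteps) shape and [0, 1] range the model above expects. A rough sketch; the sample count, frequencies and phases are arbitrary:

import numpy as np
from scipy import signal

# Each row is one synthetic "timeseries": alternating sine and
# sawtooth waves with random frequency and phase
n_samples, timesteps = 1000, 500
t = np.linspace(0, 1, timesteps)
waves = np.empty((n_samples, timesteps))
for i in range(n_samples):
    freq = np.random.uniform(1, 5)
    phase = np.random.uniform(0, 2 * np.pi)
    if i % 2 == 0:
        waves[i] = np.sin(2 * np.pi * freq * t + phase)
    else:
        waves[i] = signal.sawtooth(2 * np.pi * freq * t + phase)

# Rescale from [-1, 1] to [0, 1], matching the MinMaxScaler output above
waves = (waves + 1) / 2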

answered Sep 24 '22 by HitLuca