 

Autofilter for Time Series in Python/Keras using Conv1d

It may look like a lot of code, but most of it is comments or formatting to make it more readable.

Given:
If I define my variable of interest, "sequence", as follows:

# define input sequence
import numpy as np

np.random.seed(988)

# make the numbers 0 to 9
sequence = np.arange(0, 10, dtype=np.float16)

#shuffle the numbers
sequence = sequence[np.random.permutation(len(sequence))]

#augment the sequence with itself
sequence = np.tile(sequence,[15]).flatten().transpose()

#scale for Relu
sequence = (sequence - sequence.min()) / (sequence.max()-sequence.min())

sequence

# reshape input into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))
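Running the pipeline above end to end confirms the final shape: one sample of 150 timesteps with a single feature, built from only 10 distinct values:

```python
import numpy as np

# replicate the question's pipeline
np.random.seed(988)
sequence = np.arange(0, 10, dtype=np.float16)
sequence = sequence[np.random.permutation(len(sequence))]
sequence = np.tile(sequence, [15]).flatten().transpose()
sequence = (sequence - sequence.min()) / (sequence.max() - sequence.min())

# reshape into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))

print(sequence.shape)  # (1, 150, 1)
```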

Question:
How do I use the conv1d in an autoencoder in Keras to estimate this sequence with a reasonable level of accuracy?

If conv1d is not appropriate for this problem, can you tell me what the more appropriate layer-type for the encoder-decoder might be?

More information:
Points about the data:

  • it is a repeating sequence of 10 distinct values
  • a single lag of 10 steps should perfectly predict the sequence
  • a lookup table of 10 entries would be enough for a "predict the next value given this one" model

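The lag-10 claim above can be checked directly on the tiled sequence:

```python
import numpy as np

# rebuild the question's tiled sequence (before scaling/reshaping)
np.random.seed(988)
base = np.arange(0, 10, dtype=np.float16)
base = base[np.random.permutation(len(base))]
seq = np.tile(base, 15)

# every value equals the value 10 steps earlier, so a lag of 10
# predicts the sequence perfectly
assert np.array_equal(seq[10:], seq[:-10])
```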
I had tried other layer types in the encoder and decoder portions (LSTM, Dense, multilayer Dense) to predict, and they kept hitting a "wall" at an mse of about 0.0833, which is the variance of a uniform distribution over [0, 1]. To me, a good autoencoder on a problem this simple should be at least 99.9% accurate, so an mse substantially below 1%.
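The 0.0833 figure (1/12) is what a constant predictor would score on continuous uniform [0, 1] data. A quick check of the constant-mean baseline on this particular discrete sequence, recomputed in float64 for precision, shows its own "do-nothing" floor is about 0.102:

```python
import numpy as np

# rebuild the question's sequence in float64 (float16 loses precision here)
base = np.arange(0, 10, dtype=np.float64)
seq = np.tile(base, 15)
seq = (seq - seq.min()) / (seq.max() - seq.min())

# MSE of always predicting the mean equals the variance of the data
baseline_mse = np.mean((seq - seq.mean()) ** 2)
print(baseline_mse)   # ~0.1019 for this discrete sequence
print(1.0 / 12.0)     # ~0.0833, the continuous-uniform variance
```

So a model stuck near 0.0833 is doing only marginally better than predicting the mean.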

I haven't been able to get Conv1D to work because I am messing up the inputs. There seem to be no really good examples of how to make it work, and I am new enough to this overall architecture that the fix isn't apparent to me.

Links:

  • https://ramhiser.com/post/2018-05-14-autoencoders-with-keras/
  • https://machinelearningmastery.com/lstm-autoencoders/
EngrStudent asked Mar 03 '23

1 Answer

By creating a dataset of 1,000 samples using your method, I was able to get a pretty good autoencoder model using Conv1D:
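The answer does not show how the 1,000 samples were built. A minimal NumPy sketch, assuming each sample is an independently shuffled copy of the 10 values tiled 16 times — giving a per-sample length of 160, divisible by 4 so that the two MaxPooling/UpSampling pairs below restore the input length exactly (the specific count and tiling are assumptions, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(988)
n_samples = 1000
n_tiles = 16          # 10 values x 16 tiles = 160 timesteps, divisible by 4

samples = []
for _ in range(n_samples):
    base = rng.permutation(np.arange(10, dtype=np.float64))
    seq = np.tile(base, n_tiles)
    seq = (seq - seq.min()) / (seq.max() - seq.min())  # scale to [0, 1]
    samples.append(seq)

# shape [samples, timesteps, features] = (1000, 160, 1)
X = np.stack(samples)[..., np.newaxis]
```

With a dataset built this way, `n_in = X.shape[1]` would be 160.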

from tensorflow.keras.layers import Conv1D, Input, MaxPooling1D, UpSampling1D
from tensorflow.keras.models import Model

LEN_SEQ = 10

# n_in is the number of timesteps per sample; it should be divisible by 4 so
# the two MaxPooling1D/UpSampling1D pairs reproduce the input length exactly
x = Input(shape=(n_in, 1), name="input")
h = Conv1D(filters=50, kernel_size=LEN_SEQ, activation="relu", padding='same', name='Conv1')(x)
h = MaxPooling1D(pool_size=2, name='Maxpool1')(h)
h = Conv1D(filters=150, kernel_size=LEN_SEQ, activation="relu", padding='same', name='Conv2')(h)
h = MaxPooling1D(pool_size=2,  name="Maxpool2")(h)
y = Conv1D(filters=150, kernel_size=LEN_SEQ, activation="relu", padding='same', name='conv-decode1')(h)
y = UpSampling1D(size=2, name='upsampling1')(y)
y = Conv1D(filters=50, kernel_size=LEN_SEQ, activation="relu", padding='same', name='conv-decode2')(y)
y = UpSampling1D(size=2, name='upsampling2')(y)
y = Conv1D(filters=1, kernel_size=LEN_SEQ, activation="relu", padding='same', name='conv-decode3')(y)

AutoEncoder = Model(inputs=x, outputs=y, name='AutoEncoder')

AutoEncoder.compile(optimizer='adadelta', loss='mse')

AutoEncoder.fit(sequence, sequence, batch_size=32, epochs=50)

Last epoch output :

Epoch 50/50
1000/1000 [==============================] - 4s 4ms/step - loss: 0.0104

Test on new data :

array([[[0.5557],
        [0.8887],
        [0.778 ],
        [0.    ],
        [0.4443],
        [1.    ],
        [0.3333],
        [0.2222],
        [0.1111],
        [0.6665],
        [...]

Predictions :

array([[[0.56822747],
        [0.8906583 ],
        [0.89267206],
        [0.        ],
        [0.5023574 ],
        [1.0665314 ],
        [0.37099048],
        [0.28558862],
        [0.05782872],
        [0.6886021 ],
        [...]
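As a rough check against the question's accuracy target, the MSE over just the ten value pairs shown above works out to roughly 0.003, well below the 0.0833 "wall" and under 1%:

```python
import numpy as np

# the ten test values and predictions quoted above
actual = np.array([0.5557, 0.8887, 0.778, 0.0, 0.4443,
                   1.0, 0.3333, 0.2222, 0.1111, 0.6665])
predicted = np.array([0.56822747, 0.8906583, 0.89267206, 0.0, 0.5023574,
                      1.0665314, 0.37099048, 0.28558862, 0.05782872, 0.6886021])

mse = np.mean((actual - predicted) ** 2)
print(float(mse))  # ~0.003
```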

Some rounding problems, but pretty close!

Is this what you were looking for?

Thibault Bacqueyrisses answered Mar 12 '23