I'm trying to update a Keras LSTM to deal with concept drift. To do so, I'm following the approach proposed in this paper [1], in which an anomaly score is computed and used to update the network weights. In the paper, the anomaly score is computed with the L2 norm and the model weights are then updated. As stated in the paper:
RNN Update: The anomaly score a_t is then used to update the network W_{t-1} to obtain W_t using backpropagation through time (BPTT):
W_t = W_{t-1} − η ∇a_t(W_{t-1}), where η is the learning rate.
I'm trying to update the LSTM network weights, and although I have seen some improvement in the model's performance when forecasting multi-step-ahead, multi-sensor data, I'm not sure whether the improvement comes from actually handling concept drift or simply from refitting the model on the newest data.
Here is an example model:
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(n_neurons, input_shape=(n_seq, n_features)))
model.add(tf.keras.layers.Dense(n_pred_seq * n_features))
model.add(tf.keras.layers.Reshape((n_pred_seq, n_features)))
model.compile(optimizer='adam', loss='mse')
And here is how I'm updating the model:
from math import sqrt
from sklearn.metrics import mean_squared_error

y_pred = model.predict_on_batch(x_batch)
up_y = data_y[i:i+1]  # newest target window, keeping the batch dimension for fit()
a_score = sqrt(mean_squared_error(data_y[i,].flatten(), y_pred[0, :].flatten()))  # anomaly score (RMSE)
w = model.layers[0].get_weights()  # only get weights for the LSTM layer
for l in range(len(w)):
    w[l] = w[l] - (w[l] * 0.001 * a_score)  # 0.001 = learning rate
model.layers[0].set_weights(w)
model.fit(x_batch, up_y, epochs=1, verbose=1)
model.reset_states()
I'm wondering whether this is the correct way to update the LSTM network, and how BPTT is applied after the weights are updated.
P.S.: I have also seen other methods for detecting concept drift, such as ADWIN from the skmultiflow package, but I found this one especially interesting because it also deals with anomalies: the model is updated slightly when new data with concept drift arrives, and the update is almost ignored when anomalous data arrives.
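For context, this is roughly how I understand ADWIN would be used on a stream of prediction errors (a minimal sketch assuming skmultiflow's drift_detection API; the error values below are placeholders, not real data):

import numpy as np
from skmultiflow.drift_detection import ADWIN

adwin = ADWIN(delta=0.002)  # delta controls the detector's sensitivity

# Placeholder error stream; in practice this would be the per-window
# prediction error of the LSTM.
prediction_errors = np.abs(np.random.randn(1000))

for i, error in enumerate(prediction_errors):
    adwin.add_element(error)
    if adwin.detected_change():
        print(f"Possible concept drift at index {i}; the model could be refitted here")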
[1] Saurav, S., Malhotra, P., TV, V., Gugulothu, N., Vig, L., Agarwal, P., & Shroff, G. (2018, January). Online Anomaly Detection with Concept Drift Adaptation using Recurrent Neural Networks. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data (pp. 78-87). ACM.
I personally think it's a valid method. Whether updating the network weights this way makes sense depends on your use case, but doing it the way you describe is fine.
Maybe another way to do it is to implement your own loss function and embed the anti-drift parameter into it, but it might be a little complicated.
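Something along these lines, for example (just a rough sketch; the anomaly_weight variable and the way it is updated are my own assumption, not from the paper; model and a_score refer to your code above):

import tensorflow as tf

# Non-trainable variable holding the current anomaly weight; it would be
# updated from outside (e.g. from the latest anomaly score) before each fit.
anomaly_weight = tf.Variable(1.0, trainable=False)

def drift_aware_mse(y_true, y_pred):
    # Plain MSE scaled by the current anomaly weight, so anomalous
    # batches contribute less to the weight update.
    mse = tf.reduce_mean(tf.square(y_true - y_pred))
    return anomaly_weight * mse

model.compile(optimizer='adam', loss=drift_aware_mse)

# Before each incremental fit, e.g.: anomaly_weight.assign(1.0 / (1.0 + a_score))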
Regarding BPTT, I think it's applied as normal, but with different "starting points": the weights you've just updated.
Looking at the second block of your code, I believe you are not calculating the gradient properly. Specifically, the gradient update w[l] = w[l] - (w[l]*0.001*a_score) seems to be wrong to me.
Here you are multiplying the weights by the anomaly score. However, the original update equation, W_t = W_{t-1} − η ∇a_t(W_{t-1}), means computing the gradient of the loss a_t with respect to W_{t-1}; it does not mean multiplying a_t by W_{t-1}.
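As a rough sketch of what that gradient step looks like with plain SGD (variable names follow your code above; Keras would normally do this for you through its optimizer):

import tensorflow as tf

eta = 0.001  # learning rate

with tf.GradientTape() as tape:
    y_pred = model(x_batch, training=True)
    # The loss plays the role of a_t in the update equation.
    loss = tf.reduce_mean(tf.square(up_y - y_pred))

grads = tape.gradient(loss, model.trainable_variables)
for var, grad in zip(model.trainable_variables, grads):
    var.assign_sub(eta * grad)  # W_t = W_{t-1} - eta * grad of a_t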
To apply the online update correctly, you just need to sample your stream sequentially and call model.fit() as usual.
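For instance, something like this (a minimal sketch; data_x/data_y follow the naming in your question):

import numpy as np

for i in range(len(data_x)):
    x_batch = data_x[i:i+1]  # newest input window, keeping the batch dimension
    y_batch = data_y[i:i+1]  # corresponding target window

    # Optional: monitor the anomaly score, but do not fold it into the weights.
    y_pred = model.predict_on_batch(x_batch)
    a_score = float(np.sqrt(np.mean((y_batch - y_pred) ** 2)))

    # One gradient step on the newest data; Keras runs BPTT internally.
    model.fit(x_batch, y_batch, epochs=1, batch_size=1, verbose=0)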
Hope this helps.