
Keras Masking for RNN with Varying Time Steps

I'm trying to fit an RNN in Keras using sequences of varying length. My data is in a NumPy array with format (sample, time, feature) = (20631, max_time, 24), where max_time is determined at run time as the number of time steps in the longest sample. I've padded the beginning of every other time series with zeros so that all samples have max_time steps.
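For illustration, front-padding of this kind can be sketched in plain Python as below. This is only a sketch under assumptions: the helper name pad_front and the toy data are hypothetical and are not the code used for the real (20631, max_time, 24) array. Keras also ships a pad_sequences utility whose default padding mode is 'pre'.

```python
def pad_front(sequences, n_features, pad_value=0.0):
    """Left-pad variable-length sequences so they all have max_time steps.

    Hypothetical helper for illustration only.
    """
    max_time = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        n_missing = max_time - len(seq)
        # One all-zero row per missing time step, placed before the real data
        padding = [[pad_value] * n_features for _ in range(n_missing)]
        padded.append(padding + [list(step) for step in seq])
    return padded, max_time


# Two toy samples with 2 features each: one has 3 time steps, the other 1.
samples = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[7.0, 8.0]],
]
padded, max_time = pad_front(samples, n_features=2)
```

The longest sample is returned unchanged, and every shorter one gains leading all-zero rows, which is what Masking(mask_value=0.) is meant to skip.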

I've initially defined my model like so...

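from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense, Activation
from keras.optimizers import RMSprop

model = Sequential()
# Mask time steps whose 24 features are all 0 (the front padding)
model.add(Masking(mask_value=0., input_shape=(max_time, 24)))
model.add(LSTM(100, input_dim=24))
model.add(Dense(2))
# activate and weibull_loglik_discrete are the custom functions shown below
model.add(Activation(activate))
model.compile(loss=weibull_loglik_discrete, optimizer=RMSprop(lr=.01))
model.fit(train_x, train_y, nb_epoch=100, batch_size=1000, verbose=2, validation_data=(test_x, test_y))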

For completeness, here's the code for the loss function:

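from keras import backend as k

def weibull_loglik_discrete(y_true, ab_pred, name=None):
    y_ = y_true[:, 0]   # time value used in the hazard terms
    u_ = y_true[:, 1]   # weight on the event (log-probability) term
    a_ = ab_pred[:, 0]  # predicted Weibull scale
    b_ = ab_pred[:, 1]  # predicted Weibull shape

    # Cumulative hazard (t / a) ** b at y and at y + 1; 1e-35 keeps the base above 0
    hazard0 = k.pow((y_ + 1e-35) / a_, b_)
    hazard1 = k.pow((y_ + 1) / a_, b_)

    # Negative mean log-likelihood
    return -1 * k.mean(u_ * k.log(k.exp(hazard1 - hazard0) - 1.0) - hazard1)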
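As a sanity check on what this loss computes, here is a small plain-Python sketch (no Keras) of the per-sample term. It assumes u_ is an event indicator (1 = event observed, 0 = censored), which is an interpretation and is not stated explicitly above. The function names and the numbers are made up for illustration.

```python
import math


def discrete_weibull_loglik(y, u, a, b):
    """Per-sample log-likelihood, i.e. the term inside k.mean(...) in the loss."""
    hazard0 = ((y + 1e-35) / a) ** b
    hazard1 = ((y + 1.0) / a) ** b
    return u * math.log(math.exp(hazard1 - hazard0) - 1.0) - hazard1


def survival(t, a, b):
    """Weibull survival function S(t) = exp(-(t / a) ** b)."""
    return math.exp(-((t / a) ** b))


# Arbitrary illustrative values
y, a, b = 3.0, 5.0, 1.5

# Assumed u = 1 (event observed): log-probability that the event falls in [y, y + 1)
observed = discrete_weibull_loglik(y, 1.0, a, b)
expected_observed = math.log(survival(y, a, b) - survival(y + 1.0, a, b))

# Assumed u = 0 (censored): log-probability of surviving past y + 1
censored = discrete_weibull_loglik(y, 0.0, a, b)
expected_censored = math.log(survival(y + 1.0, a, b))
```

Under that assumption, the two cases reduce to log(S(y) - S(y + 1)) and log(S(y + 1)) respectively, so the loss is the negative mean of a discretized Weibull log-likelihood.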

And here's the code for the custom activation function:

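def activate(ab):
    # exp and softplus keep both Weibull parameters strictly positive
    a = k.exp(ab[:, 0])
    b = k.softplus(ab[:, 1])

    # Restore the column dimension so the two can be concatenated
    a = k.reshape(a, (k.shape(a)[0], 1))
    b = k.reshape(b, (k.shape(b)[0], 1))

    return k.concatenate((a, b), axis=1)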

When I fit the model and make some test predictions, every sample in the test set gets exactly the same prediction, which seems fishy.

Things get better if I remove the masking layer, which makes me think there's something wrong with the masking layer, but as far as I can tell, I've followed the documentation exactly.

Is there something mis-specified with the masking layer? Am I missing something else?

John Chrysostom asked Feb 20 '17 19:02


2 Answers

The way you implemented masking looks correct. If your data has the shape (samples, timesteps, features), then Masking(mask_value=0., input_shape=(timesteps, features)) makes downstream layers skip every timestep in which all features are equal to 0. See here: keras.io/layers/core/#masking
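As a rough illustration of that rule, here is a plain-Python sketch (not Keras itself) of which timesteps end up masked. The function name timestep_mask and the toy sample are made up for this example. On a NumPy array the same check can be written as np.any(x != mask_value, axis=-1).

```python
def timestep_mask(sample, mask_value=0.0):
    """Return one boolean per timestep: True = kept, False = masked.

    A timestep is masked only when every feature equals mask_value.
    """
    return [any(v != mask_value for v in step) for step in sample]


# One toy sample with 4 timesteps and 3 features, front-padded with zeros
sample = [
    [0.0, 0.0, 0.0],  # padding: all features are 0, so it is masked
    [0.0, 0.0, 0.0],  # padding: masked
    [0.0, 1.5, 0.0],  # real data that merely contains zeros: kept
    [2.0, 0.0, 3.0],  # real data: kept
]
mask = timestep_mask(sample)
```

One thing worth checking on the real data is whether any genuine timestep happens to be all zeros, since it would be masked as well.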

Your model could be too simple, and/or the number of epochs could be insufficient for it to learn to tell your samples apart. Try this model:

model = Sequential()
model.add(Masking(mask_value=0., input_shape=(max_time, 24)))
model.add(LSTM(256, input_dim=24))
model.add(Dense(1024))
model.add(Dense(2))
model.add(Activation(activate))
model.compile(loss=weibull_loglik_discrete, optimizer=RMSprop(lr=.01))
model.fit(train_x, train_y, nb_epoch=100, batch_size=1000, verbose=2, validation_data=(test_x, test_y)) 

If that does not work, try doubling the epochs a few times (e.g. 200, 400) and see if that improves the results.

Robert Valencia answered Nov 11 '22 17:11


I could not validate this without the actual data, but I had a similar experience with an RNN. In my case, normalization solved the issue. Try adding a normalization layer to your model.
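If you would rather normalize as a preprocessing step than with a layer inside the model, one thing to watch with masking is that the padded timesteps stay exactly 0, otherwise Masking(mask_value=0.) no longer recognizes them. Below is a minimal plain-Python sketch of per-feature standardization that skips the padding. The function name standardize_unpadded and the toy batch are hypothetical, and it assumes no real timestep is all zeros.

```python
import math


def standardize_unpadded(batch, pad_value=0.0):
    """Standardize each feature using only non-padded timesteps.

    Padded timesteps (all features == pad_value) are copied through unchanged.
    """
    n_features = len(batch[0][0])
    real_steps = [step for sample in batch for step in sample
                  if any(v != pad_value for v in step)]
    means, stds = [], []
    for j in range(n_features):
        column = [step[j] for step in real_steps]
        mean = sum(column) / len(column)
        var = sum((v - mean) ** 2 for v in column) / len(column)
        means.append(mean)
        stds.append(math.sqrt(var) or 1.0)  # fall back to 1.0 for constant features
    out = []
    for sample in batch:
        out.append([
            list(step) if all(v == pad_value for v in step)
            else [(v - means[j]) / stds[j] for j, v in enumerate(step)]
            for step in sample
        ])
    return out, means, stds


# Toy batch: 2 samples, 3 timesteps, 2 features, front-padded with zeros
batch = [
    [[0.0, 0.0], [0.0, 0.0], [10.0, 100.0]],
    [[0.0, 0.0], [20.0, 300.0], [30.0, 200.0]],
]
scaled, means, stds = standardize_unpadded(batch)
```

The means and standard deviations would normally be computed on the training set only and then reused to scale the test set.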

vagoston answered Nov 11 '22 19:11