GRU/LSTM in Keras with input sequence of varying length

Tags: keras, lstm, recurrent-neural-network

I'm working on a small project to better understand RNNs, in particular LSTM and GRU. I'm not at all an expert, so please bear that in mind.

The problem I'm facing involves data of the following form:

>>> import numpy as np
>>> import pandas as pd
>>> pd.DataFrame([[1, 2, 3],[1, 2, 1], [1, 3, 2],[2, 3, 1],[3, 1, 1],[3, 3, 2],[4, 3, 3]], columns=['person', 'interaction', 'group'])
   person  interaction  group
0       1            2      3
1       1            2      1
2       1            3      2
3       2            3      1
4       3            1      1
5       3            3      2
6       4            3      3

This is just for explanation. We have different persons interacting with different groups in different ways. I've already encoded the various features. The last interaction of a person is always a 3, which means selecting a certain group. In the short example above, person 1 chooses group 2, person 2 chooses group 1, and so on.

My whole data set is much bigger, but I would like to understand the conceptual part first before throwing models at it. The task I would like to learn is: given a sequence of interactions, which group is chosen by the person? More concretely, I would like the output to be a list of all groups (there are 3 groups: 1, 2, 3) sorted by likelihood, with the most likely choice first, followed by the second and third most likely groups. The loss function is therefore a mean reciprocal rank.

I know that in Keras, GRUs/LSTMs can handle input of varying length. So my three questions are:

The input is of the format:

(samples, timesteps, features)

Writing high-level code:

import keras.layers as L
import keras.models as M
model_input = L.Input(shape=(?, None, 2))

timesteps=None should imply the varying length, and 2 is for the features interaction and group. But what about the samples? How do I define the batches?

For the output, I'm a bit puzzled about what it should look like in this example. I think for each last interaction of a person I would like to have a list of length 3. Assuming I've set up the output

model_output = L.LSTM(3, return_sequences=False)

I then want to compile it. Is there a way of using the mean reciprocal rank?

model.compile('adam', '?')

I know the questions are fairly high level, but I would like to understand first the big picture and start to play around. Any help would therefore be appreciated.

asked Apr 02 '19 20:04

math

1 Answer

The concept you've drawn in your question is a pretty good start already. I'll add a few things to make it work, as well as a code example below:

You can specify LSTM(n_hidden, input_shape=(None, 2)) directly, instead of inserting an extra Input layer; the batch dimension is omitted from the shape definition.
Since your model is going to perform some kind of classification (based on time series data), the final layer is what we'd expect from "normal" classification as well: a Dense(num_classes, activation='softmax'). Chaining the LSTM and the Dense layer together will first pass the time series input through the LSTM layer and then feed its output (whose size is determined by the number of hidden units) into the Dense layer. activation='softmax' computes a score for each class (we're going to use one-hot encoding in a data preprocessing step, see the code example below). The class scores are not returned in sorted order, but you can always rank them via np.argsort or pick the top class via np.argmax.
Categorical crossentropy loss is suited for comparing the classification scores, so we'll use that one: model.compile(loss='categorical_crossentropy', optimizer='adam').
Since the number of interactions, i.e. the length of the model input, varies from sample to sample, we'll use a batch size of 1 and feed in one sample at a time.
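
To illustrate the ranking step mentioned above, here is a minimal sketch of turning one softmax output into a list of groups sorted by likelihood. The helper name rank_groups and the example scores are hypothetical (not from a trained model), and it assumes that output column 0 corresponds to group 1:

```python
import numpy as np

def rank_groups(scores):
    """Return group labels (1-based) ordered from most to least likely."""
    scores = np.asarray(scores)
    # argsort sorts ascending, so negate the scores to get descending order;
    # add 1 because column 0 is assumed to correspond to group 1.
    return np.argsort(-scores) + 1

# Hypothetical softmax output for a single person.
example_scores = [0.2, 0.7, 0.1]
ranking = rank_groups(example_scores)
print(ranking)  # [2 1 3]
```

The first entry of the result is the same group that np.argmax would pick (after adding 1).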

The following is a sample implementation based on the above considerations. Note that I modified your sample data a bit, in order to provide more "reasoning" behind group choices. Also, each person needs to perform at least one interaction before choosing a group (i.e. the input sequence cannot be empty); if this is not the case for your data, then introducing an additional no-op interaction (e.g. 0) can help.

import pandas as pd
import tensorflow as tf

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.LSTM(10, input_shape=(None, 2)))  # LSTM for arbitrary length series.
model.add(tf.keras.layers.Dense(3, activation='softmax'))   # Softmax for class probabilities.
model.compile(loss='categorical_crossentropy', optimizer='adam')

# Example interactions:
#   * 1: Likes the group,
#   * 2: Dislikes the group,
#   * 3: Chooses the group.
df = pd.DataFrame([
    [1, 1, 3],
    [1, 1, 3],
    [1, 2, 2],
    [1, 3, 3],
    [2, 2, 1],
    [2, 2, 3],
    [2, 1, 2],
    [2, 3, 2],
    [3, 1, 1],
    [3, 1, 1],
    [3, 1, 1],
    [3, 2, 3],
    [3, 2, 2],
    [3, 3, 1]],
    columns=['person', 'interaction', 'group']
)
data = [person[1][['interaction', 'group']].values for person in df.groupby('person')]
x_train = [x[:-1] for x in data]
y_train = tf.keras.utils.to_categorical([x[-1, 1]-1 for x in data])  # Expects class labels from 0 to n-1 (-> subtract 1).
print(x_train)
print(y_train)

class TrainGenerator(tf.keras.utils.Sequence):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __len__(self):
        return len(self.x)

    def __getitem__(self, index):
        # Need to expand arrays to have batch size 1.
        return self.x[index][None, :, :], self.y[index][None, :]

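# Note: fit_generator is deprecated as of TensorFlow 2.1, where model.fit accepts the same Sequence object.
model.fit_generator(TrainGenerator(x_train, y_train), epochs=1000)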
pred = [model.predict(x[None, :, :]).ravel() for x in x_train]
for p, y in zip(pred, y_train):
    print(p, y)

And the corresponding sample output:

[...]
Epoch 1000/1000
3/3 [==============================] - 0s 40ms/step - loss: 0.0037
[0.00213619 0.00241093 0.9954529 ] [0. 0. 1.]
[0.00123938 0.99718493 0.00157572] [0. 1. 0.]
[9.9632275e-01 7.5039308e-04 2.9268670e-03] [1. 0. 0.]
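
Since the question asks about mean reciprocal rank (MRR), the following is a small sketch of how it could be computed on predictions like the ones printed above. The function name mean_reciprocal_rank is an assumption for illustration. It is meant as an evaluation metric only: ranks are not differentiable, so this is not a drop-in replacement for the cross-entropy training loss.

```python
import numpy as np

def mean_reciprocal_rank(pred, y_true):
    """MRR for one-hot targets: the mean of 1 / rank of the true class."""
    pred = np.asarray(pred, dtype=float)
    true_idx = np.argmax(np.asarray(y_true), axis=1)
    # Score the model assigned to the true class of each sample.
    true_scores = pred[np.arange(len(pred)), true_idx]
    # Rank = 1 + number of classes scored strictly higher than the true class.
    ranks = 1 + np.sum(pred > true_scores[:, None], axis=1)
    return float(np.mean(1.0 / ranks))

# Predictions and targets copied from the sample output above.
pred = [[0.00213619, 0.00241093, 0.9954529],
        [0.00123938, 0.99718493, 0.00157572],
        [9.9632275e-01, 7.5039308e-04, 2.9268670e-03]]
y_true = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
print(mean_reciprocal_rank(pred, y_true))  # 1.0
```

An MRR of 1.0 means the true group is ranked first for every person; a sample whose true group is ranked second would contribute 0.5 instead.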

Using custom generator expressions: According to the documentation, we can use any generator to yield the data. The generator is expected to yield batches of the data and loop over the whole data set indefinitely. When using tf.keras.utils.Sequence we do not need to specify the parameter steps_per_epoch, as this will default to len(train_generator). When using a custom generator, however, we need to provide this parameter as well:

import itertools as it

model.fit_generator(((x_train[i % len(x_train)][None, :, :],
                      y_train[i % len(y_train)][None, :]) for i in it.count()),
                    epochs=1000,
                    steps_per_epoch=len(x_train))
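
The generator expression above can also be written as a named generator function, which may be easier to read and to test on its own. This is only a sketch: the name single_sample_batches and the toy arrays are hypothetical, and it checks shapes with NumPy without involving Keras:

```python
import itertools
import numpy as np

def single_sample_batches(xs, ys):
    """Yield (x, y) batches of size 1 forever, cycling through the data."""
    for i in itertools.count():
        j = i % len(xs)
        # [None, ...] adds the leading batch axis:
        # (timesteps, features) -> (1, timesteps, features).
        yield xs[j][None, :, :], ys[j][None, :]

# Hypothetical toy data: two sequences of different length, 2 features each.
xs = [np.zeros((3, 2)), np.zeros((5, 2))]
ys = np.eye(3)[[0, 2]]  # one-hot targets for classes 0 and 2
gen = single_sample_batches(xs, ys)
shapes = [(x.shape, y.shape) for x, y in itertools.islice(gen, 3)]
print(shapes)  # [((1, 3, 2), (1, 3)), ((1, 5, 2), (1, 3)), ((1, 3, 2), (1, 3))]
```

Such a generator could be passed in place of the generator expression, again together with steps_per_epoch=len(xs), since it never terminates on its own.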

answered Oct 04 '22 09:10

a_guest
