How can we define one-to-one, one-to-many, many-to-one, and many-to-many LSTM neural networks in Keras? [duplicate]

I am reading this article (The Unreasonable Effectiveness of Recurrent Neural Networks) and want to understand how to express one-to-one, one-to-many, many-to-one, and many-to-many LSTM neural networks in Keras. I have read a lot about RNNs and understand how LSTM NNs work, in particular vanishing gradients, LSTM cells, their outputs and states, sequence outputs, etc. However, I have trouble expressing all these concepts in Keras.

To start with, I have created the following toy NN using an LSTM layer:

from keras.models import Model
from keras.layers import Input, LSTM
import numpy as np

t1 = Input(shape=(2, 3))
t2 = LSTM(1)(t1)
model = Model(inputs=t1, outputs=t2)

inp = np.array([[[1,2,3],[4,5,6]]])
model.predict(inp)

Output:

array([[ 0.0264638]], dtype=float32)

In my example I have the input shape 2 by 3. As far as I understand, this means that the input is a sequence of 2 vectors, each with 3 features, and hence my input must be a 3D tensor of shape (n_examples, 2, 3). In terms of 'sequences', the input is a sequence of length 2, and each element of the sequence is expressed by 3 features (please correct me if I am wrong). When I call predict it returns a 2-dim tensor with a single scalar.
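A quick check (reusing the same toy model) suggests that return_sequences controls whether the output is a single vector for the whole sequence or one vector per timestep:

t1 = Input(shape=(2, 3))
print(Model(t1, LSTM(1)(t1)).output_shape)                         # (None, 1): one output for the whole sequence
print(Model(t1, LSTM(1, return_sequences=True)(t1)).output_shape)  # (None, 2, 1): one output per timestep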

Q1: Is it one-to-one or another type of LSTM network?

When we say "one/many input and one/many output"

Q2: what do we mean by "one/many input/output"? A "one/many" scalar(s), vector(s), sequence(s)..., one/many what?

Q3: Can someone give a simple working example in Keras for each type of network: 1-1, 1-M, M-1, and M-M?

PS: I ask multiple questions in a single thread since they are very close and related to each other.

asked Sep 02 '18 by fade2black

1 Answer

The distinction between one-to-one, one-to-many, many-to-one, and many-to-many only exists for RNNs / LSTMs, i.e. networks that operate on sequential (temporal) data; CNNs operate on spatial data, where this distinction does not exist. So many always involves multiple timesteps, i.e. a sequence.

The different variants describe the shape of the input and output and their classification. For the input, one means a single input quantity is classified as a closed quantity, and many means a sequence of quantities (i.e. a sequence of images, a sequence of words) is classified as a closed quantity. For the output, one means the output is a scalar (binary classification, i.e. is a bird or is not a bird), 0 or 1; many means the output is a one-hot encoded vector with one dimension per class (multiclass classification, i.e. is a sparrow, is a robin, ...), e.g. for three classes 001, 010, 100.

In the following, images and sequences of images are used as the quantity to be classified; alternatively you could use words and sequences of words (sentences), etc.:

one-to-one: a single image (or word, ...) is classified into a single class (binary classification), i.e. is this a bird or not

one-to-many: a single image (or word, ...) is classified into multiple classes

many-to-one: a sequence of images (or words, ...) is classified into a single class (binary classification of a sequence)

many-to-many: a sequence of images (or words, ...) is classified into multiple classes

cf https://www.quora.com/How-can-I-choose-between-one-to-one-one-to-many-many-to-one-many-to-one-and-many-to-many-in-long-short-term-memory-LSTM
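As a small illustration of the one-hot encoding mentioned above, a sketch using keras.utils.to_categorical:

from keras.utils import to_categorical

# integer class labels 0..2 for three classes, e.g. sparrow / robin / ...
print(to_categorical([0, 1, 2]))
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]]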


one-to-one (Dense output layer with its default linear activation, loss=mean_squared_error):

from numpy import array
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
# prepare sequence
length = 5
seq = array([i/float(length) for i in range(length)])
X = seq.reshape(len(seq), 1, 1)
y = seq.reshape(len(seq), 1)
# define LSTM configuration
n_neurons = length
n_batch = length
n_epoch = 1000
# create LSTM
model = Sequential()
model.add(LSTM(n_neurons, input_shape=(1, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
print(model.summary())
# train LSTM
model.fit(X, y, epochs=n_epoch, batch_size=n_batch, verbose=2)
# evaluate
result = model.predict(X, batch_size=n_batch, verbose=0)
for value in result:
    print('%.1f' % value)

source : https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/


one-to-many uses RepeatVector() to transform a single quantity into a sequence, which is what is needed for multiclass classification:

def test_one_to_many(self):
    params = dict(
        input_dims=[1, 10], activation='tanh',
        return_sequences=False, output_dim=3
    ),
    number_of_times = 4
    model = Sequential()
    # repeat the single 10-dimensional input vector 4 times -> a length-4 sequence
    model.add(RepeatVector(number_of_times, input_shape=(10,)))
    # units= replaces the deprecated output_dim=,
    # recurrent_activation= replaces the deprecated inner_activation=
    model.add(LSTM(units=params[0]['output_dim'],
                   activation=params[0]['activation'],
                   recurrent_activation='sigmoid',
                   return_sequences=True,
                   ))
    # simple_model_eval is a helper from the linked test suite
    relative_error, keras_preds, coreml_preds = simple_model_eval(params, model)
    for i in range(len(relative_error)):
        self.assertLessEqual(relative_error[i], 0.01)

source: https://www.programcreek.com/python/example/89689/keras.layers.RepeatVector

alternative one-to-many (number_of_times, input_shape, and output_size are placeholders to fill in):

model = Sequential()
model.add(RepeatVector(number_of_times, input_shape=input_shape))  # e.g. number_of_times=4, input_shape=(10,)
model.add(LSTM(output_size, return_sequences=True))

source : Many to one and many to many LSTM examples in Keras


many-to-one, binary classification (loss=binary_crossentropy, activation=sigmoid; the fully-connected output layer has dimensionality 1 (Dense(1)) and outputs a single scalar between 0 and 1):

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(5000, 32, input_length=500))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])        
print(model.summary())
model.fit(X_train, y_train, epochs=3, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
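X_train / y_train are assumed above to be padded integer sequences with binary labels; a minimal sketch to obtain such data, e.g. from the IMDB dataset shipped with Keras (matching Embedding(5000, 32, input_length=500)):

from keras.datasets import imdb
from keras.preprocessing.sequence import pad_sequences

# keep the 5000 most frequent words, pad / truncate every review to 500 timesteps
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=5000)
X_train = pad_sequences(X_train, maxlen=500)
X_test = pad_sequences(X_test, maxlen=500)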

many-to-many, multiclass classification (loss=sparse_categorical_crossentropy, activation=softmax; sparse_categorical_crossentropy expects integer class labels as targets, whereas categorical_crossentropy would require one-hot encoded targets; the fully-connected output layer has dimensionality 7 (Dense(7)) and outputs a 7-dimensional vector of class probabilities, one per class):

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(5000, 32, input_length=500))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(7, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.summary()
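A minimal smoke test with random placeholder data (shapes matching the model above), illustrating that the integer labels go in directly, without one-hot encoding:

import numpy as np

# 10 placeholder sequences of 500 word indices, integer labels 0..6
X = np.random.randint(0, 5000, size=(10, 500))
y = np.random.randint(0, 7, size=(10,))
model.fit(X, y, epochs=1, batch_size=2)
print(model.predict(X).shape)  # (10, 7): one probability per class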

cf Keras LSTM multiclass classification

Alternative many-to-many using a TimeDistributed layer; cf https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/ for a description:

from numpy import array
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import TimeDistributed
from keras.layers import LSTM
# prepare sequence
length = 5
seq = array([i/float(length) for i in range(length)])
X = seq.reshape(1, length, 1)
y = seq.reshape(1, length, 1)
# define LSTM configuration
n_neurons = length
n_batch = 1
n_epoch = 1000
# create LSTM
model = Sequential()
model.add(LSTM(n_neurons, input_shape=(length, 1), return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(loss='mean_squared_error', optimizer='adam')
print(model.summary())
# train LSTM
model.fit(X, y, epochs=n_epoch, batch_size=n_batch, verbose=2)
# evaluate
result = model.predict(X, batch_size=n_batch, verbose=0)
for value in result[0,:,0]:
    print('%.1f' % value)

source : https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/
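The difference TimeDistributed makes is visible in the output shapes; a minimal sketch (toy shapes assumed): without it the Dense layer only sees the last LSTM output (many-to-one), while with return_sequences=True plus TimeDistributed the same Dense weights are applied at every timestep (many-to-many):

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

seq2one = Sequential([LSTM(5, input_shape=(5, 1)), Dense(1)])
seq2seq = Sequential([LSTM(5, input_shape=(5, 1), return_sequences=True),
                      TimeDistributed(Dense(1))])
print(seq2one.output_shape)  # (None, 1)
print(seq2seq.output_shape)  # (None, 5, 1)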

answered Nov 01 '22 by ralf htp