I am working on a generative chatbot based on seq2seq in Keras. I used code from this site: https://machinelearningmastery.com/develop-encoder-decoder-model-sequence-sequence-prediction-keras/
My models looks like this:
# define training encoder
encoder_inputs = Input(shape=(None, n_input))
encoder = LSTM(n_units, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]
# define training decoder
decoder_inputs = Input(shape=(None, n_output))
decoder_lstm = LSTM(n_units, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(n_output, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
# define inference encoder
encoder_model = Model(encoder_inputs, encoder_states)
# define inference decoder
decoder_state_input_h = Input(shape=(n_units,))
decoder_state_input_c = Input(shape=(n_units,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs [decoder_outputs] + decoder_states)
This neural network is designed to work with one hot encoded vectors, and input to this network seems for example like this:
[[[0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.]]
[[0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.]]]
How can I rebuild these models to work with words? I would like to use word embedding layer, but I have no idea how to connect embedding layer to these models.
My input should be [[1,5,6,7,4], [4,5,7,5,4], [7,5,4,2,1]]
where int numbers are representations of words.
I tried everything but I'm still getting errors. Can you help me, please?
Sequence-to-Sequence (Seq2Seq) modelling is about training the models that can convert sequences from one domain to sequences of another domain, for example, English to French. This Seq2Seq modelling is performed by the LSTM encoder and decoder. We can guess this process from the below illustration.
A Sequence to Sequence network, or seq2seq network, or Encoder Decoder network, is a model consisting of two RNNs called the encoder and decoder. The encoder reads an input sequence and outputs a single vector, and the decoder reads that vector to produce an output sequence.
A Seq2Seq model is a model that takes a sequence of items (words, letters, time series, etc) and outputs another sequence of items. In the case of Neural Machine Translation, the input is a series of words, and the output is the translated series of words.
I finally done it. Here is the code:
Shared_Embedding = Embedding(output_dim=embedding, input_dim=vocab_size, name="Embedding")
encoder_inputs = Input(shape=(sentenceLength,), name="Encoder_input")
encoder = LSTM(n_units, return_state=True, name='Encoder_lstm')
word_embedding_context = Shared_Embedding(encoder_inputs)
encoder_outputs, state_h, state_c = encoder(word_embedding_context)
encoder_states = [state_h, state_c]
decoder_lstm = LSTM(n_units, return_sequences=True, return_state=True, name="Decoder_lstm")
decoder_inputs = Input(shape=(sentenceLength,), name="Decoder_input")
word_embedding_answer = Shared_Embedding(decoder_inputs)
decoder_outputs, _, _ = decoder_lstm(word_embedding_answer, initial_state=encoder_states)
decoder_dense = Dense(vocab_size, activation='softmax', name="Dense_layer")
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
encoder_model = Model(encoder_inputs, encoder_states)
decoder_state_input_h = Input(shape=(n_units,), name="H_state_input")
decoder_state_input_c = Input(shape=(n_units,), name="C_state_input")
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(word_embedding_answer, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)
"model" is training model encoder_model and decoder_model are inference models
Below in the FAQ section of this example, they provide an example on how to use embeddings with seq2seq. I'm currently figuring out the inference step myself. I'll post here when i get it. https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With