
Can I use next layer's output as current layer's input by Keras?


In text generation tasks, we usually use the model's last output as the current input to generate the next word. More generally, I want to build a neural network that treats the next layer's final hidden state as the current layer's input, just like the following (what confuses me is the decoder part):

[figure: encoder-decoder architecture]

I have read the Keras documentation but haven't found any function to achieve this.

Can I achieve this structure with Keras? How?

asked Mar 05 '17 by aweight

2 Answers

What you are asking about is an autoencoder; you can find similar structures in Keras.

But there are certain details that you should figure out on your own, including the padding strategy and the preprocessing of your input and output data. Your model cannot take a dynamic input size, so you need a fixed length for inputs and outputs. I don't know what you mean by the arrows that join in one circle, but I guess you can take a look at the merge layers in Keras (adding, concatenating, etc.).
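For instance, a minimal sketch of the merge layers; the input shapes and layer sizes here are illustrative, not taken from the question:

    from keras.layers import Input, Dense, Add, Concatenate
    from keras.models import Model

    # Two branches with an illustrative fixed size of 10 features each
    a = Input(shape=(10,))
    b = Input(shape=(10,))

    summed = Add()([a, b])          # element-wise sum, output shape (10,)
    joined = Concatenate()([a, b])  # concatenation, output shape (20,)

    out = Dense(1, activation='sigmoid')(joined)
    model = Model(inputs=[a, b], outputs=out)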

You probably need four Sequential models and one final model that represents the combined structure.

One more thing: the decoder part of the LSTM (the language model) is not dynamic by design. In your model definition, you introduce fixed inputs and outputs for it. If you then prepare the training data correctly, you don't need anything dynamic. During testing, you can predict each decoded word in a loop: run the model once to predict the next output step, then run it again for the next time step, and so on.
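For instance, a minimal sketch of that fixed training setup plus a step-by-step prediction loop, assuming a word-level encoder-decoder; the vocabulary size, dimensions, start token, and output length are made up for illustration:

    import numpy as np
    from keras.layers import Input, LSTM, Dense, Embedding
    from keras.models import Model

    vocab_size, embed_dim, hidden_dim = 5000, 64, 128  # illustrative sizes
    max_decode_len = 20                                # fixed output length

    # Training model: the encoder's final states initialize the decoder
    enc_in = Input(shape=(None,))
    enc_emb = Embedding(vocab_size, embed_dim)(enc_in)
    _, state_h, state_c = LSTM(hidden_dim, return_state=True)(enc_emb)

    dec_in = Input(shape=(None,))
    dec_emb = Embedding(vocab_size, embed_dim)
    dec_lstm = LSTM(hidden_dim, return_sequences=True, return_state=True)
    dec_dense = Dense(vocab_size, activation='softmax')

    dec_out, _, _ = dec_lstm(dec_emb(dec_in), initial_state=[state_h, state_c])
    train_model = Model([enc_in, dec_in], dec_dense(dec_out))

    # Inference: run the decoder one step at a time, feeding each
    # predicted word back in as the next input
    encoder_model = Model(enc_in, [state_h, state_c])
    h_in, c_in = Input(shape=(hidden_dim,)), Input(shape=(hidden_dim,))
    step_out, h_out, c_out = dec_lstm(dec_emb(dec_in), initial_state=[h_in, c_in])
    decoder_model = Model([dec_in, h_in, c_in], [dec_dense(step_out), h_out, c_out])

    def decode(input_seq, start_token=1):  # start_token is illustrative
        h, c = encoder_model.predict(input_seq)
        word, result = np.array([[start_token]]), []
        for _ in range(max_decode_len):
            probs, h, c = decoder_model.predict([word, h, c])
            next_word = int(probs[0, -1].argmax())
            result.append(next_word)
            word = np.array([[next_word]])  # feed the prediction back in
        return result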

answered Sep 22 '22 by Mehdi


The structure you have shown is a custom structure, so Keras doesn't provide any class or wrapper to build it directly. But yes, you can build this kind of structure in Keras.

It looks like you need an LSTM model running in the backward direction. I didn't understand the other part, which probably amounts to incorporating the previous sentence embedding as input at the next time step of the LSTM unit.
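If the backward direction is all that is needed, the Keras LSTM layer already supports it via the go_backwards flag; a minimal sketch with illustrative shapes:

    from keras.layers import Input, LSTM
    from keras.models import Model

    x = Input(shape=(30, 16))           # 30 time steps, 16 features (illustrative)
    h = LSTM(32, go_backwards=True)(x)  # processes the sequence in reverse order
    model = Model(x, h)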

I would rather encourage you to work with simple language modeling with an LSTM first; a minimal sketch follows the example link below. Then you can tweak the architecture later to build the one depicted in the figure.

Example:

  • Text generation with LSTM in Keras
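
As promised above, a minimal language-modeling sketch; the vocabulary size, window length, and layer sizes are illustrative, and the model predicts the next token from a fixed-length window:

    import numpy as np
    from keras.layers import LSTM, Dense, Embedding
    from keras.models import Sequential

    vocab_size, seq_len = 100, 40  # illustrative: 100 tokens, windows of 40

    model = Sequential()
    model.add(Embedding(vocab_size, 32, input_length=seq_len))
    model.add(LSTM(128))
    model.add(Dense(vocab_size, activation='softmax'))
    model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')

    # Training data would be (window, next-token) pairs sliced from a corpus.
    # Generation then feeds each sampled token back into the window:
    def generate(model, seed_ids, n_tokens=100):
        seq = list(seed_ids)  # assumed to hold at least seq_len token ids
        for _ in range(n_tokens):
            probs = model.predict(np.array([seq[-seq_len:]]))[0].astype('float64')
            seq.append(int(np.random.choice(vocab_size, p=probs / probs.sum())))
        return seq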
answered Sep 24 '22 by Wasi Ahmad