 

Input to an LSTM network in TensorFlow

I have a time series of length t, (x_0, ..., x_t), where each x_i is a d-dimensional vector, i.e. x_i = (x_i^(0), x_i^(1), ..., x_i^(d)). Thus my input X is of shape [batch_size, d].

The input for the TensorFlow LSTM should be of size [batch_size, hidden_size]. My question is: how should I feed my time series to the LSTM? One possible solution I thought of is to have an additional weight matrix W of size [d, hidden_size] and to feed the LSTM with X*W + B.
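
For concreteness, here is a rough sketch of what I have in mind for a single time step (the shapes below are just placeholder values, and I am assuming the standard tf.placeholder / tf.Variable / tf.matmul ops):

    import tensorflow as tf

    # Placeholder shapes for illustration only.
    batch_size, d, hidden_size = 32, 8, 64

    # One time step of the series: a batch of d-dimensional vectors.
    X = tf.placeholder(tf.float32, [batch_size, d])

    # The extra projection parameters I am proposing.
    W = tf.Variable(tf.truncated_normal([d, hidden_size], stddev=0.1))
    B = tf.Variable(tf.zeros([hidden_size]))

    # Project each x_i up to the LSTM's input size: X*W + B -> [batch_size, hidden_size]
    lstm_input = tf.matmul(X, W) + B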

Is this correct, or should I feed something else to the network?

Thanks

asked Jan 28 '16 by ofer-a


1 Answer

Your intuition is correct; what you need (and what you have described) is an embedding to translate your input vector to the dimension of your LSTM's input. There are three primary ways that I know of to accomplish that.

  • You could do this manually with an additional weight matrix W and bias vector b, as you described (a rough sketch follows this list).
  • You could create the weight matrix and bias vector automatically using the linear() function from TensorFlow's rnn_cell.py library, then pass the output of that linear layer as the input of your LSTM when you create it via the rnn_decoder() function in TensorFlow's seq2seq.py library or otherwise.
  • Or you could have TensorFlow create this embedding and hook it up to the inputs of your LSTM automatically, by creating the LSTM via the embedding_rnn_decoder() function at line 141 of the same seq2seq library. (If you trace through the code for this function without any optional arguments, you'll see that it simply creates a linear embedding layer for the input as well as the LSTM and hooks them together.)
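
As a rough illustration of the first (manual) option, here is a sketch assuming a TF 1.x-style API; batch_size, seq_len, d and hidden_size are placeholder values, not anything from your setup:

    import tensorflow as tf

    # Placeholder shapes for illustration only.
    batch_size, seq_len, d, hidden_size = 32, 10, 8, 64

    # The full series: [batch, time, d].
    inputs = tf.placeholder(tf.float32, [batch_size, seq_len, d])

    # Manual embedding: project each d-dimensional step to hidden_size.
    W = tf.get_variable("W", [d, hidden_size])
    b = tf.get_variable("b", [hidden_size], initializer=tf.zeros_initializer())

    x_flat = tf.reshape(inputs, [-1, d])          # [batch*time, d]
    proj = tf.matmul(x_flat, W) + b               # [batch*time, hidden_size]
    projected = tf.reshape(proj, [batch_size, seq_len, hidden_size])

    # Feed the embedded sequence to the LSTM.
    cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size)
    outputs, state = tf.nn.dynamic_rnn(cell, projected, dtype=tf.float32)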

Unless you need access to the individual components that you're creating for some reason, I would recommend the third option to keep your code at a high level.

answered Oct 07 '22 by Pender