I've got a question about TensorFlow's LSTM implementation. There are currently several implementations in TF, but I use:
cell = tf.contrib.rnn.BasicLSTMCell(n_units)
Then to get my output I call:
rnn_outputs, rnn_states = tf.nn.dynamic_rnn(cell, x,
                                            initial_state=initial_state,
                                            time_major=False)

x is of shape (batch_size, time_steps, input_length), where:

- batch_size is my batch size
- time_steps is the number of timesteps my RNN will go through
- input_length is the length of one of my input vectors (the vector fed into the network at one specific timestep in one specific batch)

I expect rnn_outputs to be of shape (batch_size, time_steps, n_units, input_length), as I have not specified another output size.
The documentation of tf.nn.dynamic_rnn tells me that the output is of shape (batch_size, input_length, cell.output_size).

The documentation of tf.contrib.rnn.BasicLSTMCell does have a property output_size, which defaults to n_units (the number of LSTM cells I use).
So does each LSTM cell only output a scalar for every given timestep? I would expect it to output a vector of the length of the input vector. That does not seem to be the case from how I understand it right now, so I am confused. Can you tell me whether that's the case, or how I could change it so that each LSTM cell outputs a vector of the size of the input vector?
It outputs a vector of probabilities that we multiply with the previous cell state c^(t-1).

The inputs to an LSTM cell are the cell state from the previous step, c^(t-1), the output of the previous LSTM cell, a^(t-1), and the current input x^(t). Its outputs are the current cell state c^(t) and the cell output a^(t).

Given these inputs, the LSTM cell produces two outputs: a "true" output and a new hidden state. [Figure: the structure of an LSTM cell/module/unit]
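In standard notation (using a for the hidden/output state, as above, and ⊙ for elementwise multiplication), the step described can be written as:

```latex
\begin{aligned}
f^{\langle t\rangle} &= \sigma\!\left(W_f\,[a^{\langle t-1\rangle},\,x^{\langle t\rangle}] + b_f\right) && \text{forget gate} \\
i^{\langle t\rangle} &= \sigma\!\left(W_i\,[a^{\langle t-1\rangle},\,x^{\langle t\rangle}] + b_i\right) && \text{input gate} \\
o^{\langle t\rangle} &= \sigma\!\left(W_o\,[a^{\langle t-1\rangle},\,x^{\langle t\rangle}] + b_o\right) && \text{output gate} \\
\tilde{c}^{\langle t\rangle} &= \tanh\!\left(W_c\,[a^{\langle t-1\rangle},\,x^{\langle t\rangle}] + b_c\right) && \text{candidate cell state} \\
c^{\langle t\rangle} &= f^{\langle t\rangle}\odot c^{\langle t-1\rangle} + i^{\langle t\rangle}\odot \tilde{c}^{\langle t\rangle} \\
a^{\langle t\rangle} &= o^{\langle t\rangle}\odot \tanh\!\left(c^{\langle t\rangle}\right)
\end{aligned}
```

The forget gate f^(t) is the "vector of probabilities" mentioned above: each entry is in (0, 1) and scales the corresponding entry of the previous cell state c^(t-1).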
In deep learning, Recurrent Neural Networks (RNNs) are a family of neural networks that excel at learning from sequential data. A class of RNN that has found practical application is the Long Short-Term Memory (LSTM), because it is robust against the problem of long-term dependencies.
I think the primary confusion is about the terminology of the LSTM cell's argument num_units. Unfortunately it doesn't mean, as the name might suggest, "the number of LSTM cells", which would then equal your number of time steps. It actually corresponds to the number of dimensions in the hidden state (the cell state and the hidden state vector).
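To make this concrete, here is a minimal NumPy sketch of one LSTM layer (an illustration, not TensorFlow's actual implementation; the weight initialization and the concrete sizes are assumptions). The key point is that every gate's weight matrix maps the concatenation [h, x] to n_units dimensions, so the output's last dimension is n_units, regardless of input_length:

```python
import numpy as np

# Illustrative sizes (assumptions for this sketch)
batch_size, time_steps, input_length, n_units = 4, 7, 10, 32

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per gate, each mapping [h, x] -> n_units
rng = np.random.default_rng(0)
W = rng.standard_normal((4, input_length + n_units, n_units)) * 0.1
b = np.zeros((4, n_units))

def lstm_step(x_t, h_prev, c_prev):
    """One timestep: concatenate h and x, then compute the four gates."""
    hx = np.concatenate([h_prev, x_t], axis=1)  # (batch, n_units + input_length)
    i = sigmoid(hx @ W[0] + b[0])               # input gate
    f = sigmoid(hx @ W[1] + b[1])               # forget gate
    o = sigmoid(hx @ W[2] + b[2])               # output gate
    g = np.tanh(hx @ W[3] + b[3])               # candidate cell state
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c                                 # both (batch, n_units)

x = rng.standard_normal((batch_size, time_steps, input_length))
h = np.zeros((batch_size, n_units))
c = np.zeros((batch_size, n_units))
outputs = []
for t in range(time_steps):
    h, c = lstm_step(x[:, t, :], h, c)
    outputs.append(h)
rnn_outputs = np.stack(outputs, axis=1)
print(rnn_outputs.shape)  # (4, 7, 32) == (batch_size, time_steps, n_units)
```

Note that input_length only affects the weight shapes, not the output shape: the hidden state h is always a vector of n_units entries per batch element.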
The call to dynamic_rnn() returns a tensor of shape [batch_size, time_steps, output_size], where (please note this):

output_size = num_units, if num_proj is None in the LSTM cell;
output_size = num_proj, if it is defined.
Now, typically, you will extract the result of the last time step and project it to the size of your output dimensions, either manually with a matmul + biases operation, or by using the num_proj argument of the LSTM cell.
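The manual "last time step + matmul + biases" projection can be sketched as follows (the tensor here is random stand-in data, and n_classes is an assumed output size):

```python
import numpy as np

rng = np.random.default_rng(1)
batch_size, time_steps, num_units, n_classes = 4, 7, 32, 3

# Stand-in for the [batch_size, time_steps, num_units] tensor from dynamic_rnn()
rnn_outputs = rng.standard_normal((batch_size, time_steps, num_units))
last_step = rnn_outputs[:, -1, :]          # (batch_size, num_units)

# Manual matmul + biases projection to the desired output size
W_out = rng.standard_normal((num_units, n_classes)) * 0.1
b_out = np.zeros(n_classes)
logits = last_step @ W_out + b_out         # (batch_size, n_classes)
print(logits.shape)  # (4, 3)
```

Using num_proj instead would apply an analogous projection inside the cell at every time step, so the returned tensor's last dimension would already be num_proj.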
I have been through the same confusion and had to look really deep to get it cleared. Hope this answer clears some of it.