I've checked the source code for both functions, and it seems that LSTM() makes the LSTM network in general, while LSTMCell() only returns one cell.
However, in most cases people only use one LSTM cell in their program. Does this mean that when you have only one LSTM cell (e.g. in a simple Seq2Seq model), calling LSTMCell() and LSTM() would make no difference?
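LSTMCell takes ONE input x_t, i.e. a single time step. To process a sequence you have to write the loop over time steps yourself, carrying the state from one step to the next. LSTM takes a whole SEQUENCE of inputs x_1, x_2, …, x_T and runs that loop internally, so you do not need to write a loop for one pass over the sequence (and therefore for one pass of backprop through time).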
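To make the "one step versus whole sequence" distinction concrete, here is a minimal framework-free sketch. It is not Keras or PyTorch code: `lstm_cell_step`, `lstm_sequence`, and the weight names are hypothetical, the LSTM has a single unit with scalar input, and the weights are arbitrary untrained values chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, w):
    """One time step of a single-unit LSTM (scalar input, scalar state)."""
    i = sigmoid(w["wi"] * x_t + w["ui"] * h_prev + w["bi"])    # input gate
    f = sigmoid(w["wf"] * x_t + w["uf"] * h_prev + w["bf"])    # forget gate
    o = sigmoid(w["wo"] * x_t + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x_t + w["ug"] * h_prev + w["bg"])  # candidate value
    c_t = f * c_prev + i * g        # new cell state
    h_t = o * math.tanh(c_t)        # new hidden state (the step's output)
    return h_t, c_t

def lstm_sequence(xs, w):
    """Sequence-level behaviour: loop the cell over x_1..x_T internally."""
    h, c = 0.0, 0.0
    outputs = []
    for x_t in xs:
        h, c = lstm_cell_step(x_t, h, c, w)
        outputs.append(h)
    return outputs, (h, c)

# Arbitrary illustrative weights (assumption: not trained, all set to 0.5).
weights = {k: 0.5 for k in ("wi", "ui", "bi", "wf", "uf", "bf",
                            "wo", "uo", "bo", "wg", "ug", "bg")}
sequence = [1.0, -0.5, 0.25]

# Cell-style usage: the caller writes the loop and threads the state through.
h, c = 0.0, 0.0
manual_outputs = []
for x_t in sequence:
    h, c = lstm_cell_step(x_t, h, c, weights)
    manual_outputs.append(h)

# Layer-style usage: one call consumes the whole sequence.
layer_outputs, (h_last, c_last) = lstm_sequence(sequence, weights)
```

Both usages perform the same computation; the only difference is who owns the loop over time steps.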
A common LSTM unit is composed of a cell, an input gate, an output gate and a forget gate. The cell remembers values over arbitrary time intervals and the three gates regulate the flow of information into and out of the cell.
LSTM is a recurrent layer.
LSTMCell is an object (which happens to be a layer too) used by the LSTM layer that contains the calculation logic for one step.
A recurrent layer contains a cell object. The cell contains the core code for the calculations of each step, while the recurrent layer commands the cell and performs the actual recurrent calculations.
Usually, people use LSTM layers in their code. Or they use RNN layers containing LSTMCell.
Both things are almost the same. An LSTM layer is an RNN layer using an LSTMCell, as you can check in the source code.
About the number of cells:
Although its name suggests that LSTMCell is a single cell, it is actually an object that manages all the units (what we might think of as the individual cells). In the same source code, you can see that the units argument is passed when creating the LSTMCell instance.
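The split between cell and layer, and the role of the units argument, can be sketched without any framework. This is an illustration under stated assumptions, not the Keras implementation: `ToyLSTMCell` and `ToyRecurrentLayer` are hypothetical names, the weight layout (one matrix per gate over the concatenated input and previous hidden state) is a simplification, and the weights are random.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class ToyLSTMCell:
    """Holds the weights for ALL `units` and computes one time step."""

    def __init__(self, units, input_dim, seed=0):
        rng = random.Random(seed)
        self.units = units
        n_in = input_dim + units  # each gate sees [x_t, h_prev] concatenated
        # One (units x n_in) matrix and one bias vector per gate:
        # i = input, f = forget, g = candidate, o = output.
        self.w = {gate: [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                         for _ in range(units)] for gate in "ifgo"}
        self.b = {gate: [0.0] * units for gate in "ifgo"}

    def _pre(self, gate, z):
        # Pre-activation of one gate for every unit at once.
        return [sum(wij * zj for wij, zj in zip(row, z)) + bj
                for row, bj in zip(self.w[gate], self.b[gate])]

    def step(self, x_t, state):
        h_prev, c_prev = state
        z = list(x_t) + list(h_prev)
        i = [sigmoid(v) for v in self._pre("i", z)]
        f = [sigmoid(v) for v in self._pre("f", z)]
        g = [math.tanh(v) for v in self._pre("g", z)]
        o = [sigmoid(v) for v in self._pre("o", z)]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h, (h, c)

class ToyRecurrentLayer:
    """Owns a cell and drives it across the time axis."""

    def __init__(self, cell):
        self.cell = cell

    def __call__(self, sequence):
        # Zero initial hidden and cell state, one entry per unit.
        state = ([0.0] * self.cell.units, [0.0] * self.cell.units)
        output = None
        for x_t in sequence:
            output, state = self.cell.step(x_t, state)
        return output  # hidden state after the last time step

cell = ToyLSTMCell(units=4, input_dim=3)   # one "cell" object, four units
layer = ToyRecurrentLayer(cell)
sequence = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4], [0.5, 0.5, -0.2]]  # T=3, 3 features
final_h = layer(sequence)
```

Here a single cell object produces a hidden state with one entry per unit, which is why one LSTMCell is usually enough: the width of the layer comes from units, not from the number of cell objects.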