
What's the difference between LSTM() and LSTMCell()?

I've checked the source code for both functions, and it seems that LSTM() makes the LSTM network in general, while LSTMCell() only returns one cell.

However, in most cases people only use one LSTM cell in their program. Does this mean that when you have only one LSTM cell (e.g. in a simple Seq2Seq model), calling LSTMCell() or LSTM() would make no difference?

narutatsuri asked Jan 10 '18 12:01




1 Answer

  • LSTM is a recurrent layer
  • LSTMCell is an object (which happens to be a layer too) used by the LSTM layer that contains the calculation logic for one step.

A recurrent layer contains a cell object. The cell contains the core code for the calculations of each step, while the recurrent layer commands the cell and performs the actual recurrent calculations.
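To make the split between cell and layer concrete, here is a minimal pure-Python sketch. The names ToyLSTMCell and ToyRNN are hypothetical and are not Keras classes; the weights are random and the gate layout is just the textbook LSTM step, so treat this as an illustration of the division of labour rather than as the library's implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ToyLSTMCell:
    """Hypothetical cell: holds the weights and computes ONE time step."""

    def __init__(self, input_dim, units, seed=0):
        rng = random.Random(seed)
        self.units = units
        n_in = input_dim + units  # current input concatenated with previous h
        # One weight row per gate (input, forget, candidate, output) and per unit.
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                  for _ in range(4 * units)]
        self.b = [0.0] * (4 * units)

    def step(self, x_t, state):
        h_prev, c_prev = state
        z_in = list(x_t) + list(h_prev)
        z = [sum(w * v for w, v in zip(row, z_in)) + b
             for row, b in zip(self.w, self.b)]
        u = self.units
        i = [sigmoid(v) for v in z[0:u]]            # input gate
        f = [sigmoid(v) for v in z[u:2 * u]]        # forget gate
        g = [math.tanh(v) for v in z[2 * u:3 * u]]  # candidate values
        o = [sigmoid(v) for v in z[3 * u:4 * u]]    # output gate
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h, (h, c)


class ToyRNN:
    """Hypothetical recurrent layer: owns a cell and loops it over time."""

    def __init__(self, cell):
        self.cell = cell

    def __call__(self, sequence):
        zeros = [0.0] * self.cell.units
        state = (list(zeros), list(zeros))
        output = None
        for x_t in sequence:  # the recurrence lives here, not in the cell
            output, state = self.cell.step(x_t, state)
        return output  # output of the last time step


cell = ToyLSTMCell(input_dim=2, units=3)
layer = ToyRNN(cell)
sequence = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
last_output = layer(sequence)
```

The cell never sees the whole sequence; only the layer knows about time steps and carries the state from one step to the next.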

Usually, people use LSTM layers in their code, or they use RNN layers containing an LSTMCell.

The two are almost the same. An LSTM layer is an RNN layer using an LSTMCell, as you can check in the source code.
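A quick way to check this yourself is the sketch below, assuming TensorFlow 2.x with its bundled Keras. It builds both variants, copies the weights from one to the other, and compares the outputs. The shapes and the tolerance are arbitrary choices for the example.

```python
import numpy as np
import tensorflow as tf

units = 4
# Arbitrary example input: (batch, timesteps, features)
x = np.random.rand(2, 5, 3).astype("float32")

lstm_layer = tf.keras.layers.LSTM(units)
rnn_layer = tf.keras.layers.RNN(tf.keras.layers.LSTMCell(units))

# Call both layers once so that their weights get created.
lstm_layer(x)
rnn_layer(x)

# Give both layers identical weights (kernel, recurrent kernel, bias).
rnn_layer.set_weights(lstm_layer.get_weights())

lstm_out = np.asarray(lstm_layer(x))
rnn_out = np.asarray(rnn_layer(x))

# With identical weights the two layers should agree up to float error.
same = np.allclose(lstm_out, rnn_out, atol=1e-5)
print(lstm_out.shape, same)
```

The practical difference is convenience: LSTM is the ready-made layer, while RNN(LSTMCell(...)) is the form to reach for when you want to swap in or customize the cell.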

About the number of cells:

Although its name suggests that LSTMCell is a single cell, it is actually an object that manages all the units (the "cells" as we usually think of them). In the same source code, you can see that the units argument is used when creating an instance of LSTMCell.
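As a small illustration, again assuming TensorFlow 2.x with its bundled Keras, a single LSTMCell created with units=8 produces an 8-dimensional output, and its input kernel has 4 * units columns (one block per gate). The input shape here is made up for the example.

```python
import numpy as np
import tensorflow as tf

# One cell object, but it manages 8 units.
cell = tf.keras.layers.LSTMCell(units=8)
layer = tf.keras.layers.RNN(cell)

# Arbitrary example input: 1 sequence, 5 time steps, 3 features.
x = np.zeros((1, 5, 3), dtype="float32")
out = np.asarray(layer(x))

# The kernel maps the 3 input features to 4 gates * 8 units.
kernel_shape = tuple(cell.kernel.shape)
print(out.shape, kernel_shape)
```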

Daniel Möller answered Oct 24 '22 18:10