
How does LSTM cell map to layers?


I'm having trouble understanding the exact scope of an LSTM cell, that is, how it maps to a network's layers. From Graves (2014):

It seems to me that in a single-layer network, the layer is the LSTM cell. How does this actually work in a multilayer RNN?

[Figure: Three-layer RNN]


The output of the cell is h_t, with no superindex indicating a specific layer. The same goes for the equations. Does each cell span a single layer? Or does each cell span all three nodes at each time step?

xv70 Avatar asked Jul 20 '17 19:07


1 Answer

Each node named h in Figure 1 represents one LSTM cell. Note that h_{t-1}, h_t and h_{t+1} with the same superindex are the same cell; they are just unrolled in time. Different superindices, however, represent different LSTM cells.

The input to a cell with superindex 2 or 3 is not just the data sample x but also the output of the cell in the layer below.

You are correct: a single-layer RNN consists of one LSTM cell. In the multi-layer case, the input to an intermediate LSTM cell is the output of the LSTM cell in the layer below it. In Figure 1, the data sample x is also fed to that cell alongside the output from below.
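To make the mapping concrete, here is a minimal NumPy sketch of a stacked LSTM unrolled in time. It is an illustration under assumptions, not the network from the paper: it uses a plain LSTM cell without peephole connections, random untrained weights, and omits the output layer. The names `LSTMCell` and `run_stacked_lstm` are made up for this example. The point is only that there is one cell object per layer, reused at every time step, and that layers above the first receive both x_t and the output of the layer below.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class LSTMCell:
    """One layer's LSTM cell; the same weights are reused at every time step."""

    def __init__(self, input_size, hidden_size, rng):
        # A single weight matrix covers all four gates and acts on [input, h_prev].
        self.W = rng.standard_normal((4 * hidden_size, input_size + hidden_size)) * 0.1
        self.b = np.zeros(4 * hidden_size)

    def step(self, inp, h_prev, c_prev):
        z = self.W @ np.concatenate([inp, h_prev]) + self.b
        i, f, o, g = np.split(z, 4)  # input gate, forget gate, output gate, candidate
        c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        return h, c


def run_stacked_lstm(xs, hidden_size=4, num_layers=3, seed=0):
    """xs has shape (T, x_size). Returns the top layer's h_t for every t, plus the cells."""
    rng = np.random.default_rng(seed)
    x_size = xs.shape[1]

    # One cell per layer (one per superindex in the figure).
    cells = []
    for n in range(num_layers):
        # Layers above the first see x_t concatenated with the output from below.
        in_size = x_size if n == 0 else x_size + hidden_size
        cells.append(LSTMCell(in_size, hidden_size, rng))

    # Per-layer state carried across time steps.
    h = [np.zeros(hidden_size) for _ in range(num_layers)]
    c = [np.zeros(hidden_size) for _ in range(num_layers)]

    top_outputs = []
    for x_t in xs:  # unrolling in time: the same cells are applied again
        below = None
        for n, cell in enumerate(cells):
            inp = x_t if n == 0 else np.concatenate([x_t, below])
            h[n], c[n] = cell.step(inp, h[n], c[n])
            below = h[n]
        top_outputs.append(h[-1])
    return np.stack(top_outputs), cells


if __name__ == "__main__":
    outputs, cells = run_stacked_lstm(np.ones((5, 2)))
    print(outputs.shape, len(cells))
```

Here the loop over `xs` corresponds to moving left to right in the figure, and the loop over `cells` corresponds to moving bottom to top within one time step.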

eaksan Avatar answered Oct 04 '22 16:10
