 

LSTM/RNN many to one

I have the dataset below for a chemical process, comprised of 5 consecutive input vectors that produce 1 output. Each input is sampled every minute, while the output is sampled every 5 minutes.

[dataset image]

Since I believe the output depends on the 5 previous input vectors, I decided to look into LSTMs for my design. After a lot of research on what my LSTM architecture should be, I concluded that I should mask part of the output sequence with zeros and only keep the last output. The final architecture for my dataset is below:

[LSTM architecture image]

My question is: What should my 3D input tensor parameters be? E.g. [5, 5, ?]? And also, what should my batch size be? Should it be the number of my samples?

Leb_Broth asked Sep 15 '16


People also ask

Is LSTM one to many?

Many to one (loss is MSE of multiple values): the LSTM predicts one value; this value is concatenated and used to predict the successive value, t times. The loss is the MSE between all the predicted values in the trajectory and their real values. Backpropagation is only done once the whole trajectory has been predicted.

How does RNN work along with LSTM?

Long short-term memory (LSTM) networks are an extension of RNNs that extends the memory. LSTMs are used as the building blocks for the layers of an RNN. LSTMs assign data "weights" which help RNNs either let new information in, forget information, or give it enough importance to impact the output.

How does a one to many RNN work?

One-to-many sequence problems are sequence problems where the input data has one time-step, and the output contains a vector of multiple values or multiple time-steps. Thus, we have a single input and a sequence of outputs. A typical example is image captioning, where the description of an image is generated.

How many gates are there in RNN, LSTM, and GRU? (A) RNN: 1, LSTM: 2, GRU: 3; (B) RNN: 0, LSTM: 3, GRU: 2; (C) RNN: 0, LSTM: 2, GRU: 3; (D) RNN: 1, LSTM: 4, GRU: ?

A GRU has two gates; an LSTM has three gates. GRUs don't possess an internal memory (c_t) that is separate from the exposed hidden state, and they don't have the output gate that is present in LSTMs.
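These gate counts show up directly in parameter counts. As a rough illustration (not from the snippet above): a vanilla RNN cell has 1 weight block, a GRU has 3 (two gates plus the candidate state), and an LSTM has 4 (three gates plus the cell candidate). The sizes below are made up for the example:

```python
def rnn_params(hidden, inputs, blocks):
    """Weights + biases for a recurrent cell made of `blocks` weight blocks,
    each of shape [hidden, hidden + inputs] plus a bias of size [hidden]."""
    return blocks * (hidden * (hidden + inputs) + hidden)

h, x = 10, 5  # hypothetical sizes: 10 hidden units, 5 input features

vanilla = rnn_params(h, x, blocks=1)  # 1 block  -> 160 parameters
gru     = rnn_params(h, x, blocks=3)  # 3 blocks -> 480 parameters
lstm    = rnn_params(h, x, blocks=4)  # 4 blocks -> 640 parameters
```

So for the same hidden size, an LSTM carries roughly 4/3 the parameters of a GRU, which is exactly the three-gates-versus-two-gates difference described above.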


1 Answer

Since you are going for many-to-one sequence modelling, you don't need to pad your output with zeros (it's not needed). The easiest thing would be to perform classification at the last time-step, i.e. after the RNN/LSTM sees the 5th input. The dimension of your 3D input tensor will be [batch_size, sequence_length, input_dimensionality], where sequence_length is 5 in your case (rows 1-5, 7-11, 13-17, etc.), and input_dimensionality is also 5 (i.e. columns A-E). Batch_size depends on the number of examples (and also on how reliable your data is); if you have more than 10,000 examples then a batch size of 30-50 should be okay (read this explanation about choosing the appropriate batch size).
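To make the shapes concrete, here is a minimal NumPy sketch of building that [batch_size, sequence_length, input_dimensionality] tensor, assuming the layout described in the question (5 minute-by-minute input rows per output, 5 feature columns A-E). The array contents here are random placeholders, not real process data:

```python
import numpy as np

n_outputs, seq_len, n_features = 12, 5, 5  # hypothetical: 12 output samples

# One row per minute, 5 feature columns (A-E):
raw_inputs = np.random.rand(n_outputs * seq_len, n_features)
# One target per 5-minute window:
targets = np.random.rand(n_outputs)

# Group every 5 consecutive minute-rows into one sequence:
X = raw_inputs.reshape(n_outputs, seq_len, n_features)
y = targets

print(X.shape)  # (12, 5, 5) = [batch_size, sequence_length, input_dimensionality]
print(y.shape)  # (12,)      = one output per sequence
```

In Keras, for example, a tensor shaped this way would feed a recurrent layer declared with `input_shape=(5, 5)`; with `return_sequences=False` (the default) the layer emits only the last time-step's output, so the zero-masking of intermediate outputs mentioned in the question becomes unnecessary.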

uyaseen answered Sep 24 '22