I'm trying to learn LSTMs. I have taken web courses, read this book (https://machinelearningmastery.com/lstms-with-python/) and a lot of blogs... but I'm completely stuck. My interest is in multivariate LSTMs and I have read everything I can find, but I still can't get it. I don't know if I'm stupid or what it is...
If this exact question and a good answer already exist, then I'm sorry for double posting, but I have looked and haven't found them...
Since I really want to understand the basics, I created a dummy dataset in Excel where every "y" depends on the sum of the two inputs x1 and x2, but also over time. As I understand it, this is a many-to-one scenario. Pseudo code:
x1(t) = sin(A(t))
x2(t) = cos(A(t))
tmp(t) = x1(t) + x2(t) (dummy variable)
y(t) = tmp(t) + tmp(t-1) + tmp(t-2) (i.e. sum over the last three steps)
(Basically I want to predict y(t) given x1 and x2 over three time steps)
This is then exported to a CSV file with the columns x1, x2, y.
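For reference, the same dummy data could be generated in Python roughly like this (the choice of A(t) = 0.1*t below is just an arbitrary assumption; any slowly changing angle works):
import numpy as np
import pandas as pd

n = 275                      # arbitrary number of rows
t = np.arange(n)
A = 0.1 * t                  # assumed angle A(t); my arbitrary choice

x1 = np.sin(A)
x2 = np.cos(A)
tmp = x1 + x2                                # dummy helper variable
# y(t) = tmp(t) + tmp(t-1) + tmp(t-2); the first two rows have no full history
y = tmp + np.roll(tmp, 1) + np.roll(tmp, 2)

df = pd.DataFrame({'x1': x1, 'x2': x2, 'y': y})
df = df.iloc[2:]                             # drop rows without a full 3-step history
df.to_csv('../Data/LSTM_test.csv', sep=';', index=False)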
I have tried to code it up below but obviously it won't work.
I read the data and split it 80/20 into train and test sets as X_train, y_train, X_test, y_test with dimensions (217,2), (217,1), (54,2), (54,1).
What I really haven't gotten a grip on yet is what exactly samples and timesteps are, and how they are used in reshape and input_shape. In many code examples I have looked at, they simply use hard-coded numbers rather than defined variables, which makes it very difficult to understand what is happening, especially if you want to change something. As an example, in one of the courses I took the reshaping was coded like this...
X_train = np.reshape(X_train, (1257, 1, 1))
This doesn't provide much info...
Anyway, when I run the code below it says
ValueError: cannot reshape array of size 434 into shape (217,3,2)
So, I know what causes the error, but not what I need to do to fix it. If I set look_back=1 it works, but that's not what I want.
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
# Load data
data_set = pd.read_csv('../Data/LSTM_test.csv', sep=';')
"""
data loaded have three columns:
col 0, col 1: features (x)
col 2: y
"""
# Train/test and variable split
split = 0.8 # 80% train, 20% test
split_idx = int(data_set.shape[0]*split)
# ...train
X_train = data_set.values[0:split_idx,0:2]
y_train = data_set.values[0:split_idx,2]
# ...test
X_test = data_set.values[split_idx:-1,0:2]
y_test = data_set.values[split_idx:-1,2]
# Model setup
look_back = 3 # as that is how y was generated (i.e. sum last three steps)
num_features = 2 # in this case: 2 features x1, x2
output_dim = 1 # want to predict 1 y value
nb_hidden_neurons = 32 # assume something to start with
nb_epoch = 2 # assume something to start with
# Reshaping
nb_samples = len(X_train) # in this case 217 samples in the training set
X_train_reshaped = np.reshape(X_train,(nb_samples, look_back, num_features))
# Create model
model = Sequential()
model.add(LSTM(nb_hidden_neurons, input_shape=(look_back,num_features)))
model.add(Dense(units=output_dim))
model.compile(optimizer = 'adam', loss = 'mean_squared_error')
model.fit(X_train_reshaped, y_train, batch_size = 32, epochs = nb_epoch)
print(model.summary())
Can anyone please explain what I have done wrong?
As I said, I have read a lot of blogs, questions, tutorials etc but if someone has a particularly good source of info I'd love to check that one up too.
This is one timestep's input and output, and the equations for a time-unrolled representation. The LSTM has an input x(t), which can be the output of a CNN or the input sequence directly. h(t-1) and c(t-1) are the inputs from the previous timestep's LSTM, and o(t) is the output of the LSTM for this timestep.
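For reference, the standard per-timestep LSTM cell equations look like this (usual notation: \sigma is the sigmoid, \odot is element-wise multiplication, and the W, U, b are learned weights and biases; note that here o_t denotes the output gate, while h_t is the per-timestep output that is called o(t) above):
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}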
We can then use the reshape() function on the NumPy array to reshape this one-dimensional array into a three-dimensional array with 1 sample, 10 time steps, and 1 feature at each time step. The reshape() function, when called on an array, takes one argument: a tuple defining the new shape of the array.
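As a small sketch of that sentence (the 10 values are just placeholders):
import numpy as np

data = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
print(data.shape)                    # (10,)  -> one-dimensional
data = data.reshape((1, 10, 1))      # 1 sample, 10 time steps, 1 feature
print(data.shape)                    # (1, 10, 1)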
TimeSteps are ticks of time. It is how long in time each of your samples is. For example, a sample could contain 128 time steps, where each time step could be a 30th of a second for signal processing.
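To make that concrete, a long 1-D recording could be cut into such samples like this (the numbers are just for illustration):
import numpy as np

signal = np.random.randn(100 * 128)          # a long 1-D recording (fake data)
windows = signal.reshape((100, 128, 1))      # 100 samples, 128 time steps, 1 feature
print(windows.shape)                         # (100, 128, 1)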
I also had this question before. On a higher level, in (samples, time steps, features):
- samples: the number of data points, i.e. how many rows there are in your data set
- time steps: the number of times you feed input into the model or LSTM
- features: the number of columns of each sample
For me, a better example to understand it is from NLP. Suppose you have a sentence to process; then the sample is 1, meaning 1 sentence to read. The time steps are the number of words in that sentence: you feed in the sentence word by word before the model has read all the words and gets the whole context of that sentence. The features are the dimension of each word, because in word embeddings like word2vec or GloVe each word is represented by a vector with multiple dimensions.
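A tiny sketch of that NLP example (the sentence length and the 300-dimensional embeddings are just assumed numbers):
import numpy as np

sentence_length = 6      # 6 words in the sentence (time steps)
embedding_dim = 300      # e.g. word2vec/GloVe vector size (features)
sentence = np.random.randn(sentence_length, embedding_dim)      # fake embeddings

batch = sentence.reshape((1, sentence_length, embedding_dim))   # 1 sample (sentence)
print(batch.shape)       # (1, 6, 300)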
The input_shape parameter in Keras is only (time_steps, num_features); for more you can refer to this.
And your problem is that when you reshape data, the product of the new dimensions must equal the number of elements in the original array, and 434 does not equal 217*3*2.
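In your case that means reshape() alone cannot create the look-back windows; you have to build overlapping windows of length look_back first. A sketch using your variable names (make_windows is just a helper name I made up; with 217 training rows and look_back=3 you end up with 215 windows, not 217):
import numpy as np

def make_windows(X, y, look_back):
    # stack overlapping windows of `look_back` rows into (samples, look_back, features)
    Xs, ys = [], []
    for i in range(look_back - 1, len(X)):
        Xs.append(X[i - look_back + 1:i + 1])   # rows t-2, t-1, t
        ys.append(y[i])                          # target at time t
    return np.array(Xs), np.array(ys)

X_train_reshaped, y_train_aligned = make_windows(X_train, y_train, look_back)
print(X_train_reshaped.shape)   # (215, 3, 2) for 217 training rows
print(y_train_aligned.shape)    # (215,)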
When you implement an LSTM, you should be very clear about what the features are and what element you want the model to read at each time step. There is a very similar case here that can surely help you. For example, if you are trying to predict the value at time t using t-1 and t-2, you can either feed in the two values as one element to predict t, where (time_step, num_features) = (1, 2), or you can feed in each value over 2 time steps, where (time_step, num_features) = (2, 1).
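A tiny sketch of those two options (the values are made up):
import numpy as np

v_t1, v_t2 = 0.5, 0.8                      # made-up values for t-1 and t-2

# option 1: one time step with two features -> (samples, 1, 2)
option_1 = np.array([[[v_t1, v_t2]]])
print(option_1.shape)                      # (1, 1, 2)

# option 2: two time steps with one feature each -> (samples, 2, 1)
option_2 = np.array([[[v_t1], [v_t2]]])
print(option_2.shape)                      # (1, 2, 1)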
That's basically how I understand it; I hope that makes it clearer for you.