LSTM Initial state from Dense layer

I am using an LSTM on time series data. I also have features about each time series that are not time dependent. Imagine company stock prices for the series and something like company location for the non-time-series features. This is not the actual use case, but it is the same idea. For this example, let's just predict the next value in the time series.

So a simple example would be:

feature_input = Input(shape=(None, data.training_features.shape[1]))
dense_1 = Dense(4, activation='relu')(feature_input)
dense_2 = Dense(8, activation='relu')(dense_1)

series_input = Input(shape=(None, data.training_series.shape[1]))
lstm = LSTM(8)(series_input, initial_state=dense_2)
out = Dense(1, activation="sigmoid")(lstm)

model = Model(inputs=[feature_input,series_input], outputs=out)
model.compile(loss='mean_squared_error', optimizer='adam', metrics=["mape"])

However, I am not sure how to specify the initial state correctly. I get:

ValueError: An initial_state was passed that is not compatible with `cell.state_size`. Received `state_spec`=[<keras.engine.topology.InputSpec object at 0x11691d518>]; However `cell.state_size` is (8, 8)

which I believe is caused by dense_2 being 3D (including the batch dimension). I tried using Flatten, Permute, and Reshape layers, but I don't believe that is correct. What am I missing, and how can I connect these layers?

asked Jan 12 '18 20:01

modesitt

1 Answer

The first problem is that an LSTM(8) layer expects two initial states, h_0 (the hidden state) and c_0 (the cell state), each of shape (None, 8). That is what "cell.state_size is (8, 8)" means in the error message.
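To see the two states concretely, here is a minimal sketch that asks the layer to return them with return_state=True. The input width of 5, the batch of 2, and the 10 timesteps are arbitrary values chosen for illustration, not taken from the question.

```python
import numpy as np
from keras.layers import Input, LSTM
from keras.models import Model

# Illustrative sizes only: 5 values per timestep, 8 LSTM units
series_input = Input(shape=(None, 5))

# return_state=True makes the layer return its output plus both final states
output, state_h, state_c = LSTM(8, return_state=True)(series_input)
state_model = Model(inputs=series_input, outputs=[output, state_h, state_c])

# Random batch: 2 samples, 10 timesteps, 5 values per timestep
out_val, h_val, c_val = state_model.predict(np.random.rand(2, 10, 5), verbose=0)

# Each state has shape (batch_size, units); initial states must match this
print(h_val.shape, c_val.shape)
```

Any tensors passed as initial_state have to match these two shapes, one for h_0 and one for c_0.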

If you only have one initial state, dense_2, you could switch to a GRU (which requires only h_0). Alternatively, you can transform your feature_input into two initial states.
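A rough sketch of the GRU option, assuming the static features are given as a 2D input of shape (batch_size, n_features). The sizes n_features, n_series, and units are hypothetical placeholders, and the final Dense(1) layer is only there to produce a single prediction.

```python
import numpy as np
from keras.layers import Input, Dense, GRU
from keras.models import Model

# Hypothetical sizes, for illustration only
n_features, n_series, units = 5, 3, 8

# Static features have no time dimension: (batch_size, n_features)
feature_input = Input(shape=(n_features,))
h0 = Dense(units, activation='relu')(feature_input)  # (batch_size, units)

# Time series: (batch_size, timesteps, n_series)
series_input = Input(shape=(None, n_series))

# A GRU has a single state, so the list holds one tensor
gru = GRU(units)(series_input, initial_state=[h0])
gru_out = Dense(1)(gru)

gru_model = Model(inputs=[feature_input, series_input], outputs=gru_out)

# Quick shape check on random data: 2 samples, 10 timesteps
gru_pred = gru_model.predict(
    [np.random.rand(2, n_features), np.random.rand(2, 10, n_series)],
    verbose=0,
)
print(gru_pred.shape)
```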

The second problem is that h_0 and c_0 are of shape (batch_size, 8), but your dense_2 is of shape (batch_size, timesteps, 8). You need to deal with the time dimension before using dense_2 as initial states.

So you can either change your input shape to (data.training_features.shape[1],), so that the features have no time dimension, or take the average over timesteps with GlobalAveragePooling1D.
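If the features really do carry a time dimension, the pooling option might look like the sketch below. The sizes are hypothetical, and using two separate Dense layers on the pooled vector to produce h_0 and c_0 is one possible choice, not the only one.

```python
import numpy as np
from keras.layers import Input, Dense, LSTM, GlobalAveragePooling1D
from keras.models import Model

# Hypothetical sizes, for illustration only
n_features, n_series, units = 5, 3, 8

# Features with a time dimension: (batch_size, timesteps, n_features)
feature_input = Input(shape=(None, n_features))
dense = Dense(8, activation='relu')(feature_input)  # still 3D: (batch, timesteps, 8)

# Average over the time axis to get a 2D tensor: (batch_size, 8)
pooled = GlobalAveragePooling1D()(dense)

# One projection per LSTM state
h0 = Dense(units, activation='relu')(pooled)
c0 = Dense(units, activation='relu')(pooled)

series_input = Input(shape=(None, n_series))
lstm = LSTM(units)(series_input, initial_state=[h0, c0])
pooled_out = Dense(1)(lstm)

pooled_model = Model(inputs=[feature_input, series_input], outputs=pooled_out)

# Shape check: 2 samples, 4 feature timesteps, 10 series timesteps
pooled_pred = pooled_model.predict(
    [np.random.rand(2, 4, n_features), np.random.rand(2, 10, n_series)],
    verbose=0,
)
print(pooled_pred.shape)
```

Averaging discards any ordering in the feature timesteps, which is usually acceptable when the features are effectively constant over time.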

A working example would be:

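from keras.layers import Input, Dense, LSTM
from keras.models import Model

# Static (non-time-dependent) features: shape (batch_size, 5)
feature_input = Input(shape=(5,))
# Branch producing the initial hidden state h_0: shape (batch_size, 8)
dense_1_h = Dense(4, activation='relu')(feature_input)
dense_2_h = Dense(8, activation='relu')(dense_1_h)
# Branch producing the initial cell state c_0: shape (batch_size, 8)
dense_1_c = Dense(4, activation='relu')(feature_input)
dense_2_c = Dense(8, activation='relu')(dense_1_c)

# Time series: shape (batch_size, timesteps, 5)
series_input = Input(shape=(None, 5))
# Pass both states as a list, in the order [h_0, c_0]
lstm = LSTM(8)(series_input, initial_state=[dense_2_h, dense_2_c])
out = Dense(1, activation="sigmoid")(lstm)
model = Model(inputs=[feature_input, series_input], outputs=out)
model.compile(loss='mean_squared_error', optimizer='adam', metrics=["mape"])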
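To show what the two inputs look like at training time, here is a small self-contained sketch that rebuilds the same architecture inside a hypothetical helper, build_model, and fits it on random data for one epoch. The array sizes and the random targets are made up for illustration; real data would replace them.

```python
import numpy as np
from keras.layers import Input, Dense, LSTM
from keras.models import Model


def build_model(n_static, n_series, units=8):
    """Hypothetical helper: same layout as the working example above."""
    feature_input = Input(shape=(n_static,))
    h0 = Dense(units, activation='relu')(Dense(4, activation='relu')(feature_input))
    c0 = Dense(units, activation='relu')(Dense(4, activation='relu')(feature_input))
    series_input = Input(shape=(None, n_series))
    lstm = LSTM(units)(series_input, initial_state=[h0, c0])
    out = Dense(1, activation='sigmoid')(lstm)
    model = Model(inputs=[feature_input, series_input], outputs=out)
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model


# Made-up data: 16 samples, 10 timesteps, 5 values per input
static_x = np.random.rand(16, 5)      # (samples, static features)
series_x = np.random.rand(16, 10, 5)  # (samples, timesteps, series features)
y = np.random.rand(16, 1)             # targets in [0, 1] to suit the sigmoid output

demo_model = build_model(n_static=5, n_series=5)
history = demo_model.fit([static_x, series_x], y, epochs=1, batch_size=8, verbose=0)
demo_pred = demo_model.predict([static_x, series_x], verbose=0)
print(demo_pred.shape)
```

The inputs are passed as a list in the same order as in Model(inputs=[...]): the static features first, then the series.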

answered Nov 05 '22 08:11

Yu-Yang

Tags: python, machine-learning, keras
