I know there are a lot of questions on this topic, but I don't understand why, in my case, both options are possible. My input shape to the LSTM is (10, 24, 2) and my hidden_size is 8.
from keras.models import Sequential
from keras.layers import LSTM, Dropout

hidden_size = 8

model = Sequential()
model.add(LSTM(hidden_size, return_sequences=True, stateful=True,
               batch_input_shape=(10, 24, 2)))
model.add(Dropout(0.1))
Why is it possible to either add this line below:
model.add(TimeDistributed(Dense(2))) # Option 1
or this one:
model.add(Dense(2)) # Option 2
Shouldn't Option 2 lead to an error, because a Dense layer expects two-dimensional input?
TimeDistributed(layer, **kwargs): this wrapper allows you to apply a layer to every temporal slice of an input. The input should be at least 3D, and the dimension at index one will be considered the temporal dimension.
The TimeDistributed layer is very useful for working with time-series data or video frames. It lets you apply the same layer to each temporal slice: instead of having several input "models", we use "one model" applied to each input. A GRU or LSTM can then handle the data in "time".
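To make "applied to every temporal slice" concrete, here is a minimal NumPy sketch (not Keras itself; the toy "video" shape and the flatten-plus-projection step are illustrative assumptions): one shared weight matrix is applied independently to each frame along the time axis, which is exactly what TimeDistributed does with its wrapped layer.

```python
import numpy as np

# Toy "video" batch: (batch, frames, height, width) -- an assumed example shape
batch, frames, h, w = 4, 5, 6, 6
features = 3

rng = np.random.default_rng(42)
video = rng.normal(size=(batch, frames, h, w))
W = rng.normal(size=(h * w, features))  # ONE set of weights shared by all frames

# TimeDistributed(layer) applies the same layer to every temporal slice video[:, t]
per_frame = np.stack(
    [video[:, t].reshape(batch, -1) @ W for t in range(frames)],
    axis=1,
)
assert per_frame.shape == (batch, frames, features)
```

The key point is that W is created once and reused for every frame, so the number of parameters does not grow with the number of timesteps.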
In your case, the two models you define are identical. This is because you use return_sequences=True, which means the Dense layer is applied to every timestep, just like TimeDistributed(Dense). If you switch to return_sequences=False, the two models are no longer identical: the TimeDistributed(Dense) version raises an error, because its input is now 2D, while the plain Dense version does not.
A more thorough explanation of a similar situation is also provided here.