Are there any detailed explanations of how TimeDistributed, stateful and return_sequences work? Do I have to set shuffle=False in both cases? Does it work for sliding windows (1-11, 2-12, 3-13, etc.) or should it be used in batches (1-11, 12-22, 23-33, etc.)?
I'm particularly interested in LSTM layers.
TimeDistributed:
This wrapper does not change how the wrapped layer works. Its purpose is to add an extra "time" dimension (which does not have to represent time). The wrapped layer is applied to each slice of the input tensor along this time dimension.
For instance, if a layer expects an input shape with 3 dimensions, say (batch, length, features), using the TimeDistributed wrapper will make it expect 4 dimensions: (batch, timeDimension, length, features).
The layer will then be "copied" and applied equally to each element in the time dimension.
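A minimal sketch of this (assuming tensorflow.keras; the choice of Conv1D and all shape numbers are just illustrative):

```python
from tensorflow.keras.layers import Input, Conv1D, TimeDistributed
from tensorflow.keras.models import Model

# Conv1D on its own expects (batch, length, features)
inp = Input(shape=(5, 100, 8))                 # (timeDimension=5, length=100, features=8)
out = TimeDistributed(Conv1D(16, 3))(inp)      # the same Conv1D is applied to each of the 5 slices

Model(inp, out).summary()                      # output shape: (None, 5, 98, 16)
```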
With an LSTM layer, it works the same way. Although an LSTM layer already expects a time dimension in its input shape, (batch, timeSteps, features), you can use TimeDistributed to add yet another "time" dimension (which may mean anything, not exactly time) and make this LSTM layer be reused for each element in this new time dimension.
LSTM - expects inputs (batch, timeSteps, features)
TimeDistributed(LSTM()) - expects inputs (batch, superSteps, timeSteps, features)
In any case, the LSTM will only actually perform its recurrent calculations along the timeSteps dimension. The other time dimension just replicates this layer many times.
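A quick shape check as a sketch (assuming tensorflow.keras; unit and step counts are arbitrary):

```python
from tensorflow.keras.layers import Input, LSTM, TimeDistributed
from tensorflow.keras.models import Model

# Plain LSTM: (batch, timeSteps, features)
x = Input(shape=(10, 4))                            # timeSteps=10, features=4
Model(x, LSTM(32)(x)).summary()                     # output: (None, 32)

# TimeDistributed(LSTM): (batch, superSteps, timeSteps, features)
y = Input(shape=(7, 10, 4))                         # superSteps=7
Model(y, TimeDistributed(LSTM(32))(y)).summary()    # output: (None, 7, 32)
```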
TimeDistributed + Dense:
The Dense layer (and maybe a few others) already supports 3D inputs, although the standard is 2D: (batch, inputFeatures).
Using the TimeDistributed or not with Dense layers is optional and the result is the same: if your data is 3D, the Dense layer will be repeated for the second dimension.
This is well explained in the documentation.
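For example, a sketch of that equivalence (assuming tensorflow.keras; shapes are arbitrary):

```python
from tensorflow.keras.layers import Input, Dense, TimeDistributed

x = Input(shape=(10, 4))                 # 3D data: (batch, 10, 4)

a = Dense(3)(x)                          # -> (None, 10, 3)
b = TimeDistributed(Dense(3))(x)         # -> (None, 10, 3), same result

print(a.shape, b.shape)
```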
return_sequences:
With recurrent layers, Keras uses the timeSteps dimension to perform the recurrent steps. For each step, there is naturally an output. You can choose to get the outputs for all steps (return_sequences=True) or only the last output (return_sequences=False).
Consider an input shape like (batch, timeSteps, inputFeatures) and a layer with outputFeatures units:
With return_sequences=True, the output shape is (batch, timeSteps, outputFeatures)
With return_sequences=False, the output shape is (batch, outputFeatures)
In any case, if you use a TimeDistributed wrapper, the superSteps dimension will be present in both the input and the output, unchanged.
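A sketch of these output shapes (assuming tensorflow.keras; numbers are arbitrary):

```python
from tensorflow.keras.layers import Input, LSTM, TimeDistributed

inp = Input(shape=(10, 4))                                      # (batch, timeSteps=10, inputFeatures=4)

seq  = LSTM(32, return_sequences=True)(inp)                     # -> (None, 10, 32)
last = LSTM(32, return_sequences=False)(inp)                    # -> (None, 32)

# With TimeDistributed, the superSteps dimension passes through untouched
inp2 = Input(shape=(7, 10, 4))                                  # superSteps=7
seq2 = TimeDistributed(LSTM(32, return_sequences=True))(inp2)   # -> (None, 7, 10, 32)

print(seq.shape, last.shape, seq2.shape)
```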
stateful=True:
Usually, if you can put all your sequences with all their steps in a single input array, everything is fine and you don't need stateful=True layers.
Keras creates a "state" for each sequence in the batch. The batch dimension is equal to the number of sequences. When keras finishes processing a batch, it automatically resets the states, meaning: we reached the end (last time step) of the sequences, bring new sequences from the first step.
When using stateful=True, these states will not be reset. This means that sending another batch to the model will not be interpreted as a new set of sequences, but as additional steps for the sequences that were processed before. You must then call model.reset_states() manually to tell the model that you've reached the last step of the sequences, or that you are about to start new sequences.
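A sketch of feeding one set of long sequences in chunks (assuming tensorflow.keras 2.x; batch size, step counts and features are arbitrary):

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

# stateful=True requires a fixed batch size: here, 2 sequences per batch
inp = Input(shape=(10, 4), batch_size=2)         # 2 sequences, 10 steps per chunk, 4 features
out = Dense(1)(LSTM(16, stateful=True)(inp))
model = Model(inp, out)
model.compile(optimizer="adam", loss="mse")

chunk_1 = np.random.rand(2, 10, 4)               # steps 1-10 of sequences A and B
chunk_2 = np.random.rand(2, 10, 4)               # steps 11-20 of the same sequences A and B

model.predict(chunk_1)                           # states are kept after this batch...
model.predict(chunk_2)                           # ...so this batch continues the same sequences

model.reset_states()                             # the next batch will be treated as brand new sequences
```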
The only case that needs shuffle=False is this stateful=True case. Each batch contains many sequences, and across batches these sequences must be kept in the same order, so that the states for each sequence don't get mixed up.
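Continuing the sketch above, training follows the same idea: every batch must keep sequence A in row 0 and sequence B in row 1, hence shuffle=False (data here is random just for illustration):

```python
x1, y1 = np.random.rand(2, 10, 4), np.random.rand(2, 1)   # steps 1-10 of A and B, plus targets
x2, y2 = np.random.rand(2, 10, 4), np.random.rand(2, 1)   # steps 11-20 of A and B, same row order

model.fit(x1, y1, batch_size=2, shuffle=False, epochs=1)
model.fit(x2, y2, batch_size=2, shuffle=False, epochs=1)
model.reset_states()                                       # done with these sequences
```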
Stateful layers are good for sequences whose steps cannot all be placed in a single input array, so that you can feed them to the model in parts, batch after batch.
Working with windows:
So far, the only way I could work with windows was by replicating data. The input array should be organized in windows, one sequence per window step. You could optionally take advantage of the TimeDistributed wrapper if you want to keep all window steps as a single batch entry, but you can also make every window an individual sequence.
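A sketch of that data replication (series length and window size are arbitrary):

```python
import numpy as np

# One long series with 15 steps and 1 feature
series = np.arange(15).reshape(15, 1)

# Replicate data into overlapping windows of length 11: steps 1-11, 2-12, 3-13, ...
window = 11
windows = np.stack([series[i:i + window] for i in range(len(series) - window + 1)])

print(windows.shape)   # (5, 11, 1): 5 independent sequences of 11 steps each
# Each window is then fed to the model as its own sequence: (batch, timeSteps, features)
```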
The stateful=True layer won't work with overlapping windows because of the states: if you input steps 1 to 12 in one batch, the next batch will be expected to contain step 13 as its first step to keep the connection.