I am using features of variable-length videos to train a one-layer LSTM. Video lengths range from 10 to 35 frames, and I am using a batch size of 1. I have the following code:
lstm_model = LSTMModel(4096, 4096, 1, 64)
for step, (video_features, label) in enumerate(data_loader):
    # Variable is deprecated since PyTorch 0.4; tensors can be fed to the model directly
    bx = video_features.view(-1, len(video_features), len(video_features[0]))  # examples = 1x12x4096, 1x5x4096
    output = lstm_model(bx)
The LSTM model is:
class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(LSTMModel, self).__init__()
        self.l1 = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                          num_layers=num_layers, batch_first=True)
        self.out = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        r_out, (h_n, h_c) = self.l1(x, None)  # None represents zero initial hidden state
        out = self.out(r_out[:, -1, :])
        return out
I just want to ask: is this the right way to train an LSTM with variable-size input? The code runs and the loss decreases, but I am not sure it is correct, because I have not used LSTMs in PyTorch before.
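As a quick sanity check (a minimal sketch using smaller dimensions than the 4096 above, so it runs fast), you can verify that the model produces one prediction per video regardless of how many frames the video has:

```python
import torch
import torch.nn as nn

class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(LSTMModel, self).__init__()
        self.l1 = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                          num_layers=num_layers, batch_first=True)
        self.out = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        r_out, (h_n, h_c) = self.l1(x, None)  # None -> zero initial hidden/cell state
        return self.out(r_out[:, -1, :])      # classify from the last time step

model = LSTMModel(input_size=16, hidden_size=32, num_layers=1, num_classes=4)
for seq_len in (10, 23, 35):            # variable "video" lengths
    x = torch.randn(1, seq_len, 16)     # batch of 1, as in the question
    y = model(x)
    print(y.shape)                      # torch.Size([1, 4]) every time
```

Because the batch size is 1, each forward pass can have a different sequence length; the output shape depends only on `num_classes`.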
Yes, your code is correct and will always work for a batch size of 1. But if you want to use a batch size other than 1, you'll need to pack your variable-size inputs into a batch with `pack_padded_sequence`, and then unpack them after the LSTM. You can find more details in my answer to a similar question.
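A minimal sketch of that packing approach (the lengths and dimensions here are illustrative, not taken from the question): pad the variable-length sequences into one batch, pack them, run the LSTM, and read each sequence's last *valid* hidden state from `h_n` so the padding never influences the prediction:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# Three "videos" of different lengths, each frame a 16-dim feature vector
seqs = [torch.randn(n, 16) for n in (12, 5, 9)]
lengths = torch.tensor([s.size(0) for s in seqs])

padded = pad_sequence(seqs, batch_first=True)   # shape (3, 12, 16), zero-padded
packed = pack_padded_sequence(padded, lengths,
                              batch_first=True, enforce_sorted=False)

lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
out_layer = nn.Linear(32, 4)

packed_out, (h_n, c_n) = lstm(packed)
# h_n[-1] holds the hidden state at each sequence's true last step,
# already restored to the original batch order
logits = out_layer(h_n[-1])                     # torch.Size([3, 4])
print(logits.shape)
```

Note that with packing you take the final state from `h_n` rather than from `r_out[:, -1, :]`, since for the shorter sequences the last padded positions of the output are not meaningful.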
P.S. You should post such questions to Code Review.