PyTorch - How to deactivate dropout in evaluation mode

This is the model I defined. It is a simple LSTM followed by 2 fully connected layers.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class mylstm(nn.Module):
    def __init__(self,input_dim, output_dim, hidden_dim,linear_dim):
        super(mylstm, self).__init__()
        self.hidden_dim=hidden_dim
        self.lstm=nn.LSTMCell(input_dim,self.hidden_dim)
        self.linear1=nn.Linear(hidden_dim,linear_dim)
        self.linear2=nn.Linear(linear_dim,output_dim)
    def forward(self, input):
        out,_=self.lstm(input)
        out=nn.Dropout(p=0.3)(out)
        out=self.linear1(out)
        out=nn.Dropout(p=0.3)(out)
        out=self.linear2(out)
        return out

x_train and x_val are float DataFrames with shape (4478, 30), while y_train and y_val are float DataFrames with shape (4478, 10).

    x_train.head()
Out[271]: 
       0       1       2       3    ...        26      27      28      29
0  1.6110  1.6100  1.6293  1.6370   ...    1.6870  1.6925  1.6950  1.6905
1  1.6100  1.6293  1.6370  1.6530   ...    1.6925  1.6950  1.6905  1.6960
2  1.6293  1.6370  1.6530  1.6537   ...    1.6950  1.6905  1.6960  1.6930
3  1.6370  1.6530  1.6537  1.6620   ...    1.6905  1.6960  1.6930  1.6955
4  1.6530  1.6537  1.6620  1.6568   ...    1.6960  1.6930  1.6955  1.7040

[5 rows x 30 columns]

x_train.shape
Out[272]: (4478, 30)

After defining the variables and running one backpropagation step, I find that the validation loss is 1.4941:

model=mylstm(30,10,200,100).double()
import numpy as np
from torch import optim
optimizer=optim.RMSprop(model.parameters(), lr=0.001, alpha=0.9)
criterion=nn.L1Loss()
input_=torch.autograd.Variable(torch.from_numpy(np.array(x_train)))
target=torch.autograd.Variable(torch.from_numpy(np.array(y_train)))
input2_=torch.autograd.Variable(torch.from_numpy(np.array(x_val)))
target2=torch.autograd.Variable(torch.from_numpy(np.array(y_val)))
optimizer.zero_grad()
output=model(input_)
loss=criterion(output,target)
loss.backward()
optimizer.step()
moniter=criterion(model(input2_),target2)

moniter
Out[274]: tensor(1.4941, dtype=torch.float64, grad_fn=<L1LossBackward>)

But when I call the forward function again, I get a different number because of the randomness of dropout:

moniter=criterion(model(input2_),target2)
moniter
Out[275]: tensor(1.4943, dtype=torch.float64, grad_fn=<L1LossBackward>)

What should I do to disable all dropout in the prediction phase?

I tried eval():

moniter=criterion(model.eval()(input2_),target2)
moniter
Out[282]: tensor(1.4942, dtype=torch.float64, grad_fn=<L1LossBackward>)

moniter=criterion(model.eval()(input2_),target2)
moniter
Out[283]: tensor(1.4945, dtype=torch.float64, grad_fn=<L1LossBackward>)

I also tried passing an additional parameter p to control dropout:

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
class mylstm(nn.Module):
    def __init__(self,input_dim, output_dim, hidden_dim,linear_dim,p):
        super(mylstm, self).__init__()
        self.hidden_dim=hidden_dim
        self.lstm=nn.LSTMCell(input_dim,self.hidden_dim)
        self.linear1=nn.Linear(hidden_dim,linear_dim)
        self.linear2=nn.Linear(linear_dim,output_dim)
    def forward(self, input,p):
        out,_=self.lstm(input)
        out=nn.Dropout(p=p)(out)
        out=self.linear1(out)
        out=nn.Dropout(p=p)(out)
        out=self.linear2(out)
        return out

model=mylstm(30,10,200,100,0.3).double()

output=model(input_)
loss=criterion(output,target)
loss.backward()
optimizer.step()
moniter=criterion(model(input2_,0),target2)
Traceback (most recent call last):

  File "<ipython-input-286-e49b6fac918b>", line 1, in <module>
    output=model(input_)

  File "D:\Users\shan xu\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)

TypeError: forward() missing 1 required positional argument: 'p'

But neither of them worked.

asked Dec 21 '18 05:12

Tommy Yu

2 Answers

I'm adding this answer because I ran into the same issue while trying to reproduce Deep Bayesian active learning through dropout disagreement. If you need to keep dropout active at prediction time (for example, to bootstrap a set of different predictions for the same test instances), you just need to leave the model in training mode; there is no need to define your own dropout layer.

Since in PyTorch you need to define your own prediction function, you can just add a parameter to it like this:

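import torch

def predict_class(model, test_instance, active_dropout=False):
    # train() keeps dropout active; eval() switches it off
    if active_dropout:
        model.train()
    else:
        model.eval()
    with torch.no_grad():  # no gradients are needed for prediction
        return model(test_instance)  # raw model outputs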
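
To illustrate the "bootstrap a set of different predictions" idea, here is a minimal sketch that runs several stochastic forward passes with dropout left on and summarizes them. TinyNet, mc_dropout_predict and n_samples are hypothetical names made up for this example, and the layer sizes are arbitrary; this is not the procedure from any particular paper.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical model, for illustration only
    def __init__(self, p=0.3):
        super().__init__()
        self.linear1 = nn.Linear(30, 16)
        self.drop = nn.Dropout(p=p)      # registered in __init__
        self.linear2 = nn.Linear(16, 10)

    def forward(self, x):
        return self.linear2(self.drop(torch.relu(self.linear1(x))))

def mc_dropout_predict(model, x, n_samples=20):
    """Run n_samples forward passes with dropout active; return mean and std."""
    was_training = model.training
    model.train()                        # keep dropout active
    with torch.no_grad():                # no gradients needed for prediction
        samples = torch.stack([model(x) for _ in range(n_samples)])
    model.train(was_training)            # restore the previous mode
    return samples.mean(dim=0), samples.std(dim=0)

torch.manual_seed(0)
model = TinyNet()
x = torch.randn(5, 30)                   # random stand-in for test instances
mean, std = mc_dropout_predict(model, x)
```

Note that train() also changes the behavior of other mode-dependent layers such as batch normalization, so on models that contain them this simple switch may not be what you want.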

answered Sep 19 '22 11:09

Edoardo Guerriero

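You have to define your nn.Dropout layer in __init__ and assign it to an attribute of your model so that it responds to eval(). A Dropout module created inside forward() is a new object on every call and is never registered as a submodule, so model.eval() cannot reach it and it always runs in training mode.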

So changing your model like this should work for you:

class mylstm(nn.Module):
    def __init__(self,input_dim, output_dim, hidden_dim,linear_dim,p):
        super(mylstm, self).__init__()
        self.hidden_dim=hidden_dim
        self.lstm=nn.LSTMCell(input_dim,self.hidden_dim)
        self.linear1=nn.Linear(hidden_dim,linear_dim)
        self.linear2=nn.Linear(linear_dim,output_dim)

        # define dropout layer in __init__
        self.drop_layer = nn.Dropout(p=p)
    def forward(self, input):
        out,_= self.lstm(input)

        # apply model dropout, responsive to eval()
        out= self.drop_layer(out)
        out= self.linear1(out)

        # apply model dropout, responsive to eval()
        out= self.drop_layer(out)
        out= self.linear2(out)
        return out

If you change it like this, dropout will be inactive as soon as you call eval().
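
A quick way to convince yourself of the difference is to compare two toy models, one with the dropout layer registered in __init__ and one that builds it inside forward(). The class names and sizes below are made up for this check and are not part of the question's model.

```python
import torch
import torch.nn as nn

class WithRegisteredDropout(nn.Module):  # hypothetical, for illustration
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 4)
        self.drop = nn.Dropout(p=0.5)    # registered submodule

    def forward(self, x):
        return self.linear(self.drop(x))

class WithInlineDropout(nn.Module):      # hypothetical, for illustration
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 4)

    def forward(self, x):
        # A fresh module on every call: it always starts with training=True
        return self.linear(nn.Dropout(p=0.5)(x))

torch.manual_seed(0)
x = torch.randn(64, 8)

registered = WithRegisteredDropout().eval()
inline = WithInlineDropout().eval()

# Two forward passes on the same input, both models in eval mode
registered_is_deterministic = torch.equal(registered(x), registered(x))
inline_is_deterministic = torch.equal(inline(x), inline(x))

# eval() propagated the flag to the registered layer
dropout_flag_in_eval = registered.drop.training
```

Here registered_is_deterministic should come out True and dropout_flag_in_eval False, while the inline variant keeps producing different outputs even after eval().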

NOTE: If you want to continue training afterwards, you need to call train() on your model to leave evaluation mode.
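
In practice this usually means switching modes inside the training loop. The sketch below assumes random stand-in data and a small nn.Sequential model rather than the LSTM from the question; only the optimizer and loss settings are taken from the question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(
    nn.Linear(30, 16), nn.ReLU(), nn.Dropout(p=0.3), nn.Linear(16, 10)
)
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001, alpha=0.9)
criterion = nn.L1Loss()

# Random stand-ins for the real training and validation data
x_train, y_train = torch.randn(64, 30), torch.randn(64, 10)
x_val, y_val = torch.randn(32, 30), torch.randn(32, 10)

val_losses = []
for epoch in range(3):
    model.train()                        # dropout active while training
    optimizer.zero_grad()
    loss = criterion(model(x_train), y_train)
    loss.backward()
    optimizer.step()

    model.eval()                         # dropout off while validating
    with torch.no_grad():
        first = criterion(model(x_val), y_val).item()
        second = criterion(model(x_val), y_val).item()
    val_losses.append((first, second))   # the two values should be identical
```

Calling model.train() at the top of every epoch makes sure the validation step of the previous epoch does not leave the model stuck in evaluation mode.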

You can also find a small working example for dropout with eval() for evaluation mode here: nn.Dropout vs. F.dropout pyTorch
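
If you prefer the functional form, a rough sketch of the equivalent looks like this. F.dropout has its own training argument that defaults to True, so the module's self.training flag has to be passed in explicitly. FunctionalDropoutNet and its layer sizes are invented for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FunctionalDropoutNet(nn.Module):   # hypothetical, for illustration
    def __init__(self, p=0.3):
        super().__init__()
        self.p = p
        self.linear1 = nn.Linear(30, 16)
        self.linear2 = nn.Linear(16, 10)

    def forward(self, x):
        out = self.linear1(x)
        # Without training=self.training, dropout would stay on after eval()
        out = F.dropout(out, p=self.p, training=self.training)
        return self.linear2(out)

torch.manual_seed(0)
net = FunctionalDropoutNet()
x = torch.randn(16, 30)

net.eval()
eval_outputs_match = torch.equal(net(x), net(x))    # dropout disabled

net.train()
train_outputs_match = torch.equal(net(x), net(x))   # dropout active again
```

With nn.Dropout stored as an attribute the flag is handled for you, which is why that form is harder to get wrong.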

answered Sep 22 '22 11:09

MBT
