‘DataParallel’ object has no attribute ‘init_hidden’

What I want to do is use DataParallel in my custom RNN class.

It seems like I initialized hidden_0 the wrong way...

import torch as T
import torch.nn as nn
from torch.autograd import Variable

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, n_layers=1):
        super(RNN, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.n_layers = n_layers

        self.encoder = nn.Embedding(input_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, n_layers, batch_first=True)
        self.decoder = nn.Linear(hidden_size, output_size)
        self.init_hidden(batch_size)

    def forward(self, input):
        input = self.encoder(input)
        output, self.hidden = self.gru(input, self.hidden)
        output = self.decoder(output.contiguous().view(-1, self.hidden_size))
        output = output.contiguous().view(batch_size, num_steps, N_CHARACTERS)
        # print(output.size())  # e.g. (10, 50, 67)

        return output

    def init_hidden(self, batch_size):
        self.hidden = Variable(T.zeros(self.n_layers, batch_size, self.hidden_size).cuda())

And I create the network in this way:

decoder = T.nn.DataParallel(RNN(N_CHARACTERS, HIDDEN_SIZE, N_CHARACTERS), dim=1).cuda()

Then start training:

for epoch in range(EPOCH_):
    hidden = decoder.init_hidden()

But I get the following error, and I have no idea how to fix it:

'DataParallel' object has no attribute 'init_hidden'

Thanks for your help!

asked May 21 '18 by Weimin Chan

2 Answers

A workaround I used was:

self.model = model
# If the model is wrapped by the `DataParallel` class, its attributes can
# only be reached through `model.module`, which breaks compatibility with
# the unwrapped case. `model_attr_accessor` is used for attribute access only.
if isinstance(model, DataParallel):
    self.model_attr_accessor = model.module
else:
    self.model_attr_accessor = model

This gives me the advantage of having the model distributed across my GPUs when I call self.model(input) (i.e., when it's wrapped by DataParallel), while still letting me access its attributes via self.model_attr_accessor.<<WHATEVER>>. This design also gives me a more modular way of accessing the attributes from several functions, without needing if-statements in all of them to check whether the model is wrapped by DataParallel.

On the other hand, if you had written model.module.<<WHATEVER>> and the model wasn't wrapped by DataParallel, it would raise an error saying that your model does not have a module attribute.
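
For illustration, here is a minimal sketch of how this pattern can sit inside a training class (the Trainer class and training_step method are hypothetical, not from the original answer):

from torch.nn import DataParallel

class Trainer:
    def __init__(self, model):
        # `model` may or may not already be wrapped in DataParallel.
        self.model = model
        if isinstance(model, DataParallel):
            self.model_attr_accessor = model.module
        else:
            self.model_attr_accessor = model

    def training_step(self, batch, batch_size):
        # Attribute/method access goes through the accessor...
        self.model_attr_accessor.init_hidden(batch_size)
        # ...while the forward pass goes through the (possibly
        # parallelized) model itself.
        return self.model(batch)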


However, a more compact implementation is to create a customized DataParallel like this:

class _CustomDataParallel(nn.Module):
    def __init__(self, model):
        super(_CustomDataParallel, self).__init__()
        # Wrap the model so forward passes are parallelized across GPUs.
        self.model = nn.DataParallel(model).cuda()

    def forward(self, *input):
        return self.model(*input)

    def __getattr__(self, name):
        # Attributes not found on the wrapper fall through to the
        # underlying module, so custom methods remain accessible.
        try:
            return super().__getattr__(name)
        except AttributeError:
            return getattr(self.model.module, name)
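
A quick usage sketch with the RNN class from the question (assuming the same N_CHARACTERS, HIDDEN_SIZE, EPOCH_, and batch_size values):

decoder = _CustomDataParallel(RNN(N_CHARACTERS, HIDDEN_SIZE, N_CHARACTERS))

for epoch in range(EPOCH_):
    # __getattr__ falls through to the wrapped module, so init_hidden
    # resolves on the underlying RNN even though the wrapper is used.
    decoder.init_hidden(batch_size)
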
answered Nov 16 '22 by ndrwnaguib

When using DataParallel, your original module is available in the module attribute of the parallel wrapper:

for epoch in range(EPOCH_):
    hidden = decoder.module.init_hidden(batch_size)
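
Note that init_hidden as defined in the question takes a batch_size argument and sets the hidden state on the underlying RNN; the forward pass should still go through decoder itself so that DataParallel can split the batch across GPUs.
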
answered Nov 16 '22 by djstrong