I am attempting to mask (force to zero) specific weight values in PyTorch. The weights I am trying to mask are defined as follows in def __init__:
class LSTM_MASK(nn.Module):
    def __init__(self, options, inp_dim):
        super(LSTM_MASK, self).__init__()
        ....
        self.wfx = nn.Linear(input_dim, current_output, bias=add_bias)
The mask is also defined in def __init__ as:
    self.mask_use = torch.Tensor(current_output, input_dim)
The mask is a constant, and .requires_grad_() is False for the mask parameter. Now, in the def forward part of the class, I attempt an element-wise multiplication of the weight parameter and the mask before the linear operation is applied:
def forward(self, x):
    ....
    self.wfx.weight = self.wfx.weight * self.mask_use
    wfx_out = self.wfx(x)
I get an error message:
self.wfx.weight = self.wfx.weight * self.mask_use
File "/home/xyz/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 537, in __setattr__
.format(torch.typename(value), name))
TypeError: cannot assign 'torch.cuda.FloatTensor' as parameter 'weight' (torch.nn.Parameter or None expected)
But when I check both parameters with .type(), both of them come up as torch.cuda.FloatTensor. I am not sure why there is an error here.
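For reference, a self-contained snippet along the same lines (with made-up sizes standing in for inp_dim and current_output) reproduces the error; on CPU the type in the message is torch.FloatTensor rather than torch.cuda.FloatTensor:

import torch
import torch.nn as nn

wfx = nn.Linear(10, 10)        # stand-in for self.wfx
mask_use = torch.rand(10, 10)  # constant mask; requires_grad is False by default

# The product is a plain tensor, so assigning it to the module attribute fails:
wfx.weight = wfx.weight * mask_use
# TypeError: cannot assign 'torch.FloatTensor' as parameter 'weight'
# (torch.nn.Parameter or None expected)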
The element-wise multiplication always returns a plain FloatTensor, and it is not possible to assign a plain tensor as the weight of a layer.

There are two possible ways to deal with this. You can assign the result to the data attribute of your weight; plain tensors are allowed there. Alternatively, you can convert the result to an nn.Parameter itself and then assign it to wfx.weight.
Here is an example which shows both ways:
import torch
import torch.nn as nn

wfx = nn.Linear(10, 10)
mask_use = torch.rand(10, 10)

# wfx.weight = wfx.weight * mask_use  # your example - this raises an error

# Option 1: write directly to .data
wfx.weight.data = wfx.weight * mask_use

# Option 2: convert the result to an nn.Parameter and write it to weight
wfx.weight = nn.Parameter(wfx.weight * mask_use)
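As a quick sanity check (using a hypothetical 0/1 mask instead of a random one), the masked positions of the weight are indeed zero after either assignment:

mask_use = (torch.rand(10, 10) > 0.5).float()     # hypothetical binary mask
wfx.weight = nn.Parameter(wfx.weight * mask_use)  # Option 2, for example
print(bool((wfx.weight[mask_use == 0] == 0).all()))  # prints True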
Disclaimer: When using = (assignment) on the weights you are replacing the weight tensor of your parameter. This may have unwanted effects on the graph and on the optimization step; in particular, Option 2 creates a brand-new Parameter object, so an optimizer that was constructed earlier still points at the old tensor and would have to be re-created.
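If the intention is to re-apply the constant mask on every forward pass without ever replacing the parameter, one further option is to mask the weight on the fly inside forward. The sketch below only shows the masking-related pieces, uses simplified constructor arguments in place of options, and assumes torch.nn.functional.linear together with a mask registered as a buffer:

import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTM_MASK(nn.Module):
    def __init__(self, inp_dim, current_output, add_bias=True):
        super(LSTM_MASK, self).__init__()
        self.wfx = nn.Linear(inp_dim, current_output, bias=add_bias)
        # register_buffer keeps the constant mask out of the parameter list
        # while still moving it to the right device with .cuda()/.to()
        self.register_buffer('mask_use', torch.ones(current_output, inp_dim))

    def forward(self, x):
        # Mask the weight on the fly; wfx.weight itself is never replaced
        masked_weight = self.wfx.weight * self.mask_use
        return F.linear(x, masked_weight, self.wfx.bias)

Because the multiplication happens inside the graph, the gradients of the masked entries are multiplied by zero as well, and the optimizer keeps working on the original wfx.weight object.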