
What is the default weight initializer for conv in PyTorch?

The question "How to initialize weights in PyTorch?" shows how to initialize the weights in PyTorch. However, what is the default weight initializer for Conv and Dense layers in PyTorch? What distribution does PyTorch use?

urningod asked Apr 13 '18 12:04


1 Answer

Each PyTorch layer that has parameters implements the method reset_parameters, which is called at the end of the layer's constructor to initialize the weights. You can find the implementation of the layers in the torch/nn/modules directory of the PyTorch source.

For the dense layer, which PyTorch calls Linear, the weights are initialized uniformly:

stdv = 1. / math.sqrt(self.weight.size(1))
self.weight.data.uniform_(-stdv, stdv)

where self.weight.size(1) is the number of inputs to the layer. Shrinking the sampling range as the number of inputs grows keeps the variance of each layer's outputs roughly similar at the beginning of training, so that it does not scale with the layer width. You can read a more detailed explanation here.
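To make the scaling concrete, here is a minimal sketch of the formula above in plain Python. It does not call PyTorch: `uniform_bound` and `init_linear_weights` are hypothetical helpers written for illustration, and the layer sizes (64 outputs, 256 inputs) are arbitrary. It samples weights from U(-bound, bound) and compares the empirical variance with the variance of that uniform distribution, bound**2 / 3 = 1 / (3 * fan_in).

```python
import math
import random


def uniform_bound(fan_in: int) -> float:
    """Bound from the snippet above: 1 / sqrt(number of inputs)."""
    return 1.0 / math.sqrt(fan_in)


def init_linear_weights(out_features: int, in_features: int, seed: int = 0):
    """Return an out_features x in_features list drawn from U(-bound, bound).

    Hypothetical helper: it mirrors the formula, not PyTorch's actual code.
    """
    rng = random.Random(seed)
    bound = uniform_bound(in_features)
    return [[rng.uniform(-bound, bound) for _ in range(in_features)]
            for _ in range(out_features)]


weights = init_linear_weights(out_features=64, in_features=256)
flat = [w for row in weights for w in row]
mean = sum(flat) / len(flat)
variance = sum((w - mean) ** 2 for w in flat) / len(flat)

print("bound:", uniform_bound(256))  # 1 / sqrt(256) = 0.0625
print("empirical variance:", variance)
print("theoretical variance:", 1.0 / (3 * 256))  # bound**2 / 3
```

Because the weight variance shrinks as 1 / fan_in, a sum over fan_in weighted inputs has a variance that stays roughly independent of the layer width.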

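For the convolutional layer the initialization is basically the same. You just compute the number of inputs by multiplying the number of input channels by the number of elements in the kernel (for example, in_channels * kernel_height * kernel_width for a 2D convolution).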
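As a rough illustration of that fan-in computation, the sketch below uses only the standard library. `conv_fan_in` and `conv_uniform_bound` are hypothetical names, not PyTorch functions, and the sketch assumes an ordinary (non-grouped) convolution and the same 1 / sqrt(fan_in) bound as for the linear layer.

```python
import math
from typing import Sequence


def conv_fan_in(in_channels: int, kernel_size: Sequence[int]) -> int:
    """Number of input values feeding each output value of a convolution."""
    return in_channels * math.prod(kernel_size)


def conv_uniform_bound(in_channels: int, kernel_size: Sequence[int]) -> float:
    """Uniform bound 1 / sqrt(fan_in), as for the linear layer above."""
    return 1.0 / math.sqrt(conv_fan_in(in_channels, kernel_size))


# Example: a 2D convolution with 16 input channels and a 3x3 kernel.
fan_in = conv_fan_in(in_channels=16, kernel_size=(3, 3))  # 16 * 3 * 3 = 144
bound = conv_uniform_bound(in_channels=16, kernel_size=(3, 3))  # 1 / 12

print("fan_in:", fan_in)
print("bound:", bound)
```

The number of output channels does not enter the computation: only the values that contribute to a single output element count as inputs.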

McLawrence answered Nov 03 '22 15:11