Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PyTorch: passing numpy array for weight initialization

I'd like to initialize the parameters of RNN with np arrays.

In the following example, I want to pass w to the parameters of rnn. I know pytorch provides many initialization methods like Xavier, uniform, etc., but is there way to initialize the parameters by passing numpy arrays?

import numpy as np
import torch as nn
rng = np.random.RandomState(313)
w = rng.randn(input_size, hidden_size).astype(np.float32)

rnn = nn.RNN(input_size, hidden_size, num_layers)
like image 643
ytrewq Avatar asked Aug 01 '18 08:08

ytrewq


People also ask

How do you initialize weights in PyTorch?

One of the most popular way to initialize weights is to use a class function that we can invoke at the end of the __init__ function in a custom PyTorch model. def __init__(self): # .

Do I need to initialize weights in PyTorch?

Weight Initializations with PyTorchBy default, PyTorch uses Lecun initialization, so nothing new has to be done here compared to using Normal, Xavier or Kaiming initialization.

Is PyTorch CPU faster than Numpy?

Tensors in CPU and GPU It is nearly 15 times faster than Numpy for simple matrix multiplication!

Is PyTorch better than Numpy?

Even if you already know Numpy, there are still a couple of reasons to switch to PyTorch for tensor computation. The main reason is the GPU acceleration. As you'll see, using a GPU with PyTorch is super easy and super fast. If you do large computations, this is beneficial because it speeds things up a lot.


2 Answers

First, let's note that nn.RNN has more than one weight variable, c.f. the documentation:

Variables:

  • weight_ih_l[k] – the learnable input-hidden weights of the k-th layer, of shape (hidden_size * input_size) for k = 0. Otherwise, the shape is (hidden_size * hidden_size)
  • weight_hh_l[k] – the learnable hidden-hidden weights of the k-th layer, of shape (hidden_size * hidden_size)
  • bias_ih_l[k] – the learnable input-hidden bias of the k-th layer, of shape (hidden_size)
  • bias_hh_l[k] – the learnable hidden-hidden bias of the k-th layer, of shape (hidden_size)

Now, each of these variables (Parameter instances) are attributes of your nn.RNN instance. You can access them, and edit them, two ways, as show below:

  • Solution 1: Accessing all the RNN Parameter attributes by name (rnn.weight_hh_lK, rnn.weight_ih_lK, etc.):
import torch
from torch import nn
import numpy as np

input_size, hidden_size, num_layers = 3, 4, 2
use_bias = True
rng = np.random.RandomState(313)

rnn = nn.RNN(input_size, hidden_size, num_layers, bias=use_bias)

def set_nn_parameter_data(layer, parameter_name, new_data):
    param = getattr(layer, parameter_name)
    param.data = new_data

for i in range(num_layers):
    weights_hh_layer_i = rng.randn(hidden_size, hidden_size).astype(np.float32)
    weights_ih_layer_i = rng.randn(hidden_size, hidden_size).astype(np.float32)
    set_nn_parameter_data(rnn, "weight_hh_l{}".format(i), 
                          torch.from_numpy(weights_hh_layer_i))
    set_nn_parameter_data(rnn, "weight_ih_l{}".format(i), 
                          torch.from_numpy(weights_ih_layer_i))

    if use_bias:
        bias_hh_layer_i = rng.randn(hidden_size).astype(np.float32)
        bias_ih_layer_i = rng.randn(hidden_size).astype(np.float32)
        set_nn_parameter_data(rnn, "bias_hh_l{}".format(i), 
                              torch.from_numpy(bias_hh_layer_i))
        set_nn_parameter_data(rnn, "bias_ih_l{}".format(i), 
                              torch.from_numpy(bias_ih_layer_i))
  • Solution 2: Accessing all the RNN Parameter attributes through rnn.all_weights list attribute:
import torch
from torch import nn
import numpy as np

input_size, hidden_size, num_layers = 3, 4, 2
use_bias = True
rng = np.random.RandomState(313)

rnn = nn.RNN(input_size, hidden_size, num_layers, bias=use_bias)

for i in range(num_layers):
    weights_hh_layer_i = rng.randn(hidden_size, hidden_size).astype(np.float32)
    weights_ih_layer_i = rng.randn(hidden_size, hidden_size).astype(np.float32)
    rnn.all_weights[i][0].data = torch.from_numpy(weights_ih_layer_i)
    rnn.all_weights[i][1].data = torch.from_numpy(weights_hh_layer_i)

    if use_bias:
        bias_hh_layer_i = rng.randn(hidden_size).astype(np.float32)
        bias_ih_layer_i = rng.randn(hidden_size).astype(np.float32)
        rnn.all_weights[i][2].data = torch.from_numpy(bias_ih_layer_i)
        rnn.all_weights[i][3].data = torch.from_numpy(bias_hh_layer_i)
like image 200
benjaminplanche Avatar answered Oct 24 '22 22:10

benjaminplanche


As a detailed answer is provided, I just to add one more sentence. The parameters of an nn.Module are Tensors (previously, it used to be autograd variables, which is deperecated in Pytorch 0.4). So, essentially you need to use the torch.from_numpy() method to convert the Numpy array to Tensor and then use them to initialize the nn.Module parameters.

like image 42
Wasi Ahmad Avatar answered Oct 24 '22 22:10

Wasi Ahmad