PyTorch: passing numpy array for weight initialization

Tags: python, initialization, numpy, pytorch, rnn

I'd like to initialize the parameters of an RNN with NumPy arrays.

In the following example, I want to pass w to the parameters of rnn. I know PyTorch provides many initialization methods such as Xavier, uniform, etc., but is there a way to initialize the parameters by passing NumPy arrays?

import numpy as np
from torch import nn

input_size, hidden_size, num_layers = 3, 4, 2  # example sizes
rng = np.random.RandomState(313)
w = rng.randn(input_size, hidden_size).astype(np.float32)

rnn = nn.RNN(input_size, hidden_size, num_layers)


asked Aug 01 '18 08:08

ytrewq

2 Answers

First, let's note that nn.RNN has more than one weight variable, cf. the documentation:

Variables:

weight_ih_l[k] – the learnable input-hidden weights of the k-th layer, of shape (hidden_size, input_size) for k = 0. Otherwise, the shape is (hidden_size, hidden_size)

weight_hh_l[k] – the learnable hidden-hidden weights of the k-th layer, of shape (hidden_size, hidden_size)

bias_ih_l[k] – the learnable input-hidden bias of the k-th layer, of shape (hidden_size)

bias_hh_l[k] – the learnable hidden-hidden bias of the k-th layer, of shape (hidden_size)
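To see which of these parameters a given instance actually exposes, and with which shapes, they can be listed with named_parameters(). This is a minimal sketch, assuming the same illustrative sizes used in the examples further down (input_size=3, hidden_size=4, num_layers=2):

```python
from torch import nn

input_size, hidden_size, num_layers = 3, 4, 2  # illustrative sizes
rnn = nn.RNN(input_size, hidden_size, num_layers)

# Map each parameter name to its shape.
shapes = {name: tuple(p.shape) for name, p in rnn.named_parameters()}
for name, shape in shapes.items():
    print(name, shape)
```

Note that only weight_ih_l0 depends on input_size; the input-hidden weights of deeper layers are square because they consume the previous layer's hidden state.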

Now, each of these variables (Parameter instances) is an attribute of your nn.RNN instance. You can access and edit them in two ways, as shown below:

Solution 1: Accessing all the RNN Parameter attributes by name (rnn.weight_hh_lK, rnn.weight_ih_lK, etc.):

import torch
from torch import nn
import numpy as np

input_size, hidden_size, num_layers = 3, 4, 2
use_bias = True
rng = np.random.RandomState(313)

rnn = nn.RNN(input_size, hidden_size, num_layers, bias=use_bias)

def set_nn_parameter_data(layer, parameter_name, new_data):
    param = getattr(layer, parameter_name)
    param.data = new_data

for i in range(num_layers):
    # Layer 0 reads the input; deeper layers read the previous layer's hidden state.
    layer_input_size = input_size if i == 0 else hidden_size
    weights_hh_layer_i = rng.randn(hidden_size, hidden_size).astype(np.float32)
    weights_ih_layer_i = rng.randn(hidden_size, layer_input_size).astype(np.float32)
    set_nn_parameter_data(rnn, "weight_hh_l{}".format(i),
                          torch.from_numpy(weights_hh_layer_i))
    set_nn_parameter_data(rnn, "weight_ih_l{}".format(i),
                          torch.from_numpy(weights_ih_layer_i))

    if use_bias:
        bias_hh_layer_i = rng.randn(hidden_size).astype(np.float32)
        bias_ih_layer_i = rng.randn(hidden_size).astype(np.float32)
        set_nn_parameter_data(rnn, "bias_hh_l{}".format(i), 
                              torch.from_numpy(bias_hh_layer_i))
        set_nn_parameter_data(rnn, "bias_ih_l{}".format(i), 
                              torch.from_numpy(bias_ih_layer_i))
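As a possible variation on Solution 1 (not part of the original answer), the values can be copied into the existing parameter in place instead of rebinding .data. The helper name copy_numpy_into_parameter is hypothetical; the explicit shape check is there because assigning to .data does not validate shapes, so a wrongly shaped array would otherwise only fail later, in the forward pass. A minimal sketch:

```python
import numpy as np
import torch
from torch import nn

def copy_numpy_into_parameter(module, parameter_name, array):
    """Copy a NumPy array into an existing parameter in place (hypothetical helper)."""
    param = getattr(module, parameter_name)
    if tuple(param.shape) != array.shape:
        raise ValueError(
            "{}: expected shape {}, got {}".format(
                parameter_name, tuple(param.shape), array.shape))
    with torch.no_grad():  # keep the copy out of the autograd graph
        param.copy_(torch.from_numpy(array))

input_size, hidden_size = 3, 4  # illustrative sizes
rnn = nn.RNN(input_size, hidden_size, num_layers=1)
rng = np.random.RandomState(313)

# weight_ih_l0 has shape (hidden_size, input_size).
w_ih = rng.randn(hidden_size, input_size).astype(np.float32)
copy_numpy_into_parameter(rnn, "weight_ih_l0", w_ih)
```

Because copy_ writes into the parameter's own storage, the parameter does not end up sharing memory with the NumPy array afterwards.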

Solution 2: Accessing all the RNN Parameter attributes through the rnn.all_weights list attribute:

import torch
from torch import nn
import numpy as np

input_size, hidden_size, num_layers = 3, 4, 2
use_bias = True
rng = np.random.RandomState(313)

rnn = nn.RNN(input_size, hidden_size, num_layers, bias=use_bias)

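for i in range(num_layers):
    # Layer 0 reads the input; deeper layers read the previous layer's hidden state.
    layer_input_size = input_size if i == 0 else hidden_size
    weights_hh_layer_i = rng.randn(hidden_size, hidden_size).astype(np.float32)
    weights_ih_layer_i = rng.randn(hidden_size, layer_input_size).astype(np.float32)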
    rnn.all_weights[i][0].data = torch.from_numpy(weights_ih_layer_i)
    rnn.all_weights[i][1].data = torch.from_numpy(weights_hh_layer_i)

    if use_bias:
        bias_hh_layer_i = rng.randn(hidden_size).astype(np.float32)
        bias_ih_layer_i = rng.randn(hidden_size).astype(np.float32)
        rnn.all_weights[i][2].data = torch.from_numpy(bias_ih_layer_i)
        rnn.all_weights[i][3].data = torch.from_numpy(bias_hh_layer_i)
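A third option, which is not in the original answer, is to go through the module's state dict. The sketch below assumes the same illustrative sizes as above: it draws one float32 array per state-dict entry, loads them with load_state_dict (which copies the values in and raises an error on a shape mismatch), and then runs a dummy forward pass as a sanity check.

```python
import numpy as np
import torch
from torch import nn

input_size, hidden_size, num_layers = 3, 4, 2  # illustrative sizes
rng = np.random.RandomState(313)
rnn = nn.RNN(input_size, hidden_size, num_layers)

# Draw one float32 array per existing entry, matching its shape.
arrays = {
    name: rng.randn(*tensor.shape).astype(np.float32)
    for name, tensor in rnn.state_dict().items()
}

# load_state_dict copies the values in and rejects mismatched shapes.
rnn.load_state_dict({name: torch.from_numpy(a) for name, a in arrays.items()})

# Sanity check: forward pass on a dummy (seq_len, batch, input_size) input.
x = torch.zeros(5, 1, input_size)
output, h_n = rnn(x)
print(output.shape)  # torch.Size([5, 1, 4])
```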


answered Oct 24 '22 22:10

benjaminplanche

As a detailed answer has already been provided, I just want to add one more point. The parameters of an nn.Module are Tensors (previously they were autograd Variables, which were deprecated in PyTorch 0.4). So, essentially, you need to use the torch.from_numpy() method to convert the NumPy array to a Tensor, and then use that Tensor to initialize the nn.Module parameters.
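One detail worth keeping in mind with torch.from_numpy() is that the resulting tensor shares memory with the source array and keeps its dtype, which is presumably why the arrays above are cast to np.float32 first. A small sketch of that behavior, with .clone() used to get an independent copy:

```python
import numpy as np
import torch

a = np.ones(3, dtype=np.float32)
shared = torch.from_numpy(a)          # shares memory with `a`
copied = torch.from_numpy(a).clone()  # independent copy

a[0] = 5.0  # modify the NumPy array after conversion

print(shared[0].item())  # 5.0: the change is visible through the shared tensor
print(copied[0].item())  # 1.0: the clone is unaffected
print(shared.dtype)      # torch.float32, inherited from the array
```

If the NumPy array is going to be reused or modified later, cloning (or copying into the parameter in place) avoids the weights changing unexpectedly.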

answered Oct 24 '22 22:10

Wasi Ahmad

