How to take the average of the weights of two networks?

Suppose in PyTorch I have model1 and model2 which have the same architecture. They were further trained on the same data, or one model is an earlier version of the other, but that is not technically relevant to the question. Now I want to set the weights of model (a third network with the same architecture) to be the average of the weights of model1 and model2. How would I do that in PyTorch?

Asked Feb 01 '18 by patapouf_ai

People also ask

How do you compare the weights of two neural networks?

One way to compare two neural networks is to compare how similar their predictions are, on typical instances. Ideally, we'd like to compute the expected value of this similarity, taken over the distribution on instances. However, as you say, the input space is high-dimensional, so the integral is hard to compute.
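In practice this expectation is usually estimated by Monte Carlo sampling. Below is a minimal sketch of that idea, assuming two hypothetical networks net_a and net_b that accept the same input shape (the function name, input shape, and sampling distribution are illustrative assumptions, not part of the original answer):

import torch

def prediction_similarity(net_a, net_b, input_shape=(1, 10), n_samples=1000):
    # Estimate the mean absolute difference between the two networks'
    # outputs, averaged over randomly drawn inputs (standard normal here).
    net_a.eval()
    net_b.eval()
    diffs = []
    with torch.no_grad():
        for _ in range(n_samples):
            x = torch.randn(*input_shape)
            diffs.append((net_a(x) - net_b(x)).abs().mean().item())
    return sum(diffs) / len(diffs)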

What is State_dict in PyTorch?

A state_dict is an integral entity if you are interested in saving or loading models from PyTorch. Because state_dict objects are Python dictionaries, they can be easily saved, updated, altered, and restored, adding a great deal of modularity to PyTorch models and optimizers.
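For reference, a minimal sketch of saving and restoring a state_dict, assuming model1 and model2 from the question (the file name is an arbitrary example):

import torch

# state_dict maps parameter and buffer names to tensors,
# so it behaves like an ordinary Python dict.
sd = model1.state_dict()                         # e.g. {'fc.weight': ..., 'fc.bias': ...}
torch.save(sd, "model1.pt")                      # save to disk
model2.load_state_dict(torch.load("model1.pt"))  # restore into a model with the same architecture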


1 Answer

beta = 0.5  # the interpolation parameter

# model1, model2 and model (the target network) all share the same architecture
params1 = model1.named_parameters()
params2 = model2.named_parameters()

dict_params2 = dict(params2)

# overwrite model2's parameters in place with the weighted average
for name1, param1 in params1:
    if name1 in dict_params2:
        dict_params2[name1].data.copy_(beta * param1.data + (1 - beta) * dict_params2[name1].data)

model.load_state_dict(dict_params2)

Taken from the PyTorch forums. You can grab the parameters, transform them, and load them back, but make sure the dimensions match.

Also, I would be really interested in hearing about your findings with this.
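A state_dict-based variation of the same idea (just a sketch, assuming model, model1, and model2 all share the same architecture; note that state_dict also includes buffers such as BatchNorm running statistics, which get averaged here as well):

beta = 0.5
sd1 = model1.state_dict()
sd2 = model2.state_dict()

# interpolate every entry of the two state dicts
averaged = {name: beta * sd1[name] + (1 - beta) * sd2[name] for name in sd1}

model.load_state_dict(averaged)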

Answered Nov 10 '22 by Littleone