
In PyTorch, how do you use add_param_group() with an optimizer?

The documentation is pretty vague and there are no code examples showing how to use it. The documentation for it is:

Add a param group to the Optimizer s param_groups.

This can be useful when fine tuning a pre-trained network as frozen layers can be made trainable and added to the Optimizer as training progresses.

Parameters: param_group (dict) – Specifies what Tensors should be optimized along with group-specific optimization options.

I am assuming I can get a param_group parameter by feeding in the values I get from a model's state_dict()? E.g. all the actual weight values? I am asking because I want to build a progressive network, which means I need to constantly feed Adam the parameters of newly created convolution and activation modules.
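For the progressive-network use case described above, a minimal sketch (module names and sizes are illustrative, not from the question) would be to create the optimizer for the initial layers and later hand any newly created module's parameters to it via add_param_group:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Start with one conv layer and an optimizer over its parameters.
first = nn.Conv2d(3, 8, kernel_size=3)
opt = optim.Adam(first.parameters(), lr=1e-3)

# Later in training, a new block is created...
new_block = nn.Sequential(nn.Conv2d(8, 16, kernel_size=3), nn.ReLU())

# ...and its parameters are registered with the same optimizer.
# add_param_group accepts an iterable of tensors under the 'params' key.
opt.add_param_group({'params': new_block.parameters()})

print(len(opt.param_groups))  # 2
```

Note that the new group inherits Adam's defaults (lr, betas, etc.) unless you override them in the dict.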

asked Aug 08 '18 by Inkplay_
People also ask

How does optimizer.step() work in PyTorch?

step() makes the optimizer iterate over all parameters (tensors) it is supposed to update and use their internally stored grad to update their values. More info on computational graphs and the additional "grad" information stored in PyTorch tensors can be found in this answer.
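A tiny illustration of what step() consumes: backward() fills each parameter's .grad, and step() applies the update rule to it (plain SGD here so the arithmetic is easy to check):

```python
import torch
import torch.optim as optim

w = torch.tensor([1.0, 2.0], requires_grad=True)
opt = optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()  # gradient of loss w.r.t. w is 2*w = [2., 4.]
loss.backward()        # populates w.grad
opt.step()             # w <- w - 0.1 * grad = [0.8, 1.6]
print(w.detach())      # tensor([0.8000, 1.6000])
```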

What is optimizer.param_groups?

optimizer.param_groups is a Python list containing dictionaries, one per parameter group. In each dictionary, params holds all parameters that will be updated by gradients, and lr is that group's current learning rate.
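This structure can be inspected directly; a quick sketch:

```python
import torch
import torch.optim as optim

w = torch.randn(2, 2, requires_grad=True)
opt = optim.Adam([w], lr=0.01)

group = opt.param_groups[0]     # param_groups is a list of dicts
print(type(opt.param_groups))   # <class 'list'>
print(group['lr'])              # 0.01
print(group['params'][0] is w)  # True -- the group holds the tensors themselves
```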


1 Answer

Per the docs, the add_param_group method accepts a param_group parameter that is a dict. Example of use:

import torch
import torch.optim as optim

# Two independent weight tensors; only w1 is given to Adam at first.
w1 = torch.randn(3, 3, requires_grad=True)
w2 = torch.randn(3, 3, requires_grad=True)

o = optim.Adam([w1])
print(o.param_groups)

gives

[{'amsgrad': False,
  'betas': (0.9, 0.999),
  'eps': 1e-08,
  'lr': 0.001,
  'params': [tensor([[ 2.9064, -0.2141, -0.4037],
           [-0.5718,  1.0375, -0.6862],
           [-0.8372,  0.4380, -0.1572]])],
  'weight_decay': 0}]

Now add w2 as a second param group:

o.add_param_group({'params': w2})
print(o.param_groups)

gives:

[{'amsgrad': False,
  'betas': (0.9, 0.999),
  'eps': 1e-08,
  'lr': 0.001,
  'params': [tensor([[ 2.9064, -0.2141, -0.4037],
           [-0.5718,  1.0375, -0.6862],
           [-0.8372,  0.4380, -0.1572]])],
  'weight_decay': 0},
 {'amsgrad': False,
  'betas': (0.9, 0.999),
  'eps': 1e-08,
  'lr': 0.001,
  'params': [tensor([[-0.0560,  0.4585, -0.7589],
           [-0.1994,  0.4557,  0.5648],
           [-0.1280, -0.0333, -1.1886]])],
  'weight_decay': 0}]
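Because each group is its own dict, per-group options can also be set when adding it. A short sketch (the 1e-4 rate is illustrative): giving the new group a different learning rate from the original one:

```python
import torch
import torch.optim as optim

w1 = torch.randn(3, 3, requires_grad=True)
w2 = torch.randn(3, 3, requires_grad=True)

o = optim.Adam([w1], lr=1e-3)

# Options passed in the dict override the optimizer's defaults
# for this group only.
o.add_param_group({'params': [w2], 'lr': 1e-4})

print([g['lr'] for g in o.param_groups])  # [0.001, 0.0001]
```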
answered Sep 27 '22 by iacolippo