 

What happens if the loss function is multiplied by a constant?

What will happen if I multiply the loss function by a constant? I think I will get a larger gradient, right? Is that equivalent to using a larger learning rate?

asked Jul 18 '16 by xyd


2 Answers

Basically, it depends on several things:

  1. If you use classic stochastic / mini-batch / full-batch gradient descent with the update rule

    new_weights = old_weights - learning_rate * gradient

then your claim is true: multiplying the loss by a constant c multiplies the gradient by c, and that factor is simply absorbed into the learning rate (demonstrated in the sketch after this list).

  2. If you are using a learning method with an adaptive learning rate (like Adam or RMSprop), things change a little. Your gradients are still scaled by the constant, but the effective step size may be barely affected at all: these methods divide the gradient by a running estimate of its magnitude, so the constant largely cancels. The exact effect depends on how the rescaled cost interacts with the learning algorithm.

  3. If you use a learning method with an adaptive gradient but a non-adaptive learning rate (e.g. momentum methods), the constant usually affects the update in the same way as in point 1.
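
A minimal runnable sketch of cases 1 and 2 (NumPy; the toy quadratic loss L(w) = w**2, the constant c, and all numeric values are made up purely for illustration):

    import numpy as np

    c, lr, w = 10.0, 0.1, 2.0        # hypothetical constant, learning rate, weight

    def grad(w):                      # gradient of the toy loss L(w) = w**2
        return 2.0 * w

    # Case 1: plain SGD. Multiplying the loss by c gives the same step
    # as multiplying the learning rate by c.
    step_scaled_loss = w - lr * (c * grad(w))     # gradient of c * L
    step_scaled_lr   = w - (c * lr) * grad(w)     # gradient of L, learning rate c * lr
    print(np.isclose(step_scaled_loss, step_scaled_lr))   # True

    # Case 2: an Adam-style update (one step from zero-initialized,
    # bias-corrected moments). The step is m_hat / (sqrt(v_hat) + eps);
    # scaling the gradient by c scales both m_hat and sqrt(v_hat) by c,
    # so the constant cancels up to eps.
    def adam_step(g, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
        m_hat = ((1 - b1) * g) / (1 - b1)         # first moment, bias-corrected
        v_hat = ((1 - b2) * g * g) / (1 - b2)     # second moment, bias-corrected
        return lr * m_hat / (np.sqrt(v_hat) + eps)

    print(adam_step(grad(w)), adam_step(c * grad(w)))    # nearly identical steps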

answered Sep 29 '22 by Marcin Możejko


Yes, you are right. It is equivalent to changing the learning rate.
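
A one-line check, using the plain update rule quoted in the other answer (with learning rate η and constant c):

    \nabla_w\bigl(c\,L(w)\bigr) = c\,\nabla_w L(w)
    \quad\Longrightarrow\quad
    w \leftarrow w - \eta\,c\,\nabla_w L(w) = w - (c\eta)\,\nabla_w L(w)

so multiplying the loss by c is indistinguishable from multiplying the learning rate by c.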

answered Sep 29 '22 by kangshiyin