I am training my model and got the result below. Is this a good learning rate? If not, is it too high or too low? This is my solver configuration:
lr_policy: "step"
gamma: 0.1
stepsize: 10000
power: 0.75
# lr for unnormalized softmax
base_lr: 0.001
# high momentum
momentum: 0.99
# no gradient accumulation
iter_size: 1
max_iter: 100000
weight_decay: 0.0005
snapshot: 4000
snapshot_prefix: "snapshot/train"
type: "Adam"
This is my reference:
With low learning rates the improvements will be linear. With high learning rates they will start to look more exponential. Higher learning rates will decay the loss faster, but they get stuck at worse values of loss.
Adam's learning rate may still need tuning, and Adam is not necessarily the best algorithm. There is also research suggesting that it may be beneficial to use learning rate schedules other than Adam's built-in adaptation. So it is not that easy; Adam on its own isn't necessarily enough.
A traditional default value for the learning rate is 0.1 or 0.01, and this may represent a good starting point on your problem.
learning_rate: a float, a LearningRateSchedule instance, or a callable that takes no arguments and returns the actual value to use. The learning rate. Defaults to 0.001.
Further, learning rate decay can also be used with Adam. The paper uses a decayed rate alpha_t = alpha / sqrt(t), updated each epoch t, for the logistic regression demonstration. The TensorFlow documentation also suggests some tuning of epsilon: the default value of 1e-8 for epsilon might not be a good default in general.
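As a rough illustration of that kind of decay (not the asker's Caffe setup), here is a minimal Keras sketch, assuming TensorFlow 2.x; the class name InverseSqrtDecay is made up for this example, and the epsilon value is just one plausible alternative to the default:

import tensorflow as tf

# A 1/sqrt(t) decay as described in the Adam paper, wrapped as a Keras schedule.
# The class name InverseSqrtDecay is an arbitrary choice for this sketch.
class InverseSqrtDecay(tf.keras.optimizers.schedules.LearningRateSchedule):
    def __init__(self, initial_lr):
        self.initial_lr = initial_lr

    def __call__(self, step):
        # alpha_t = alpha / sqrt(t); start at t = 1 to avoid division by zero
        t = tf.cast(step, tf.float32) + 1.0
        return self.initial_lr / tf.sqrt(t)

optimizer = tf.keras.optimizers.Adam(
    learning_rate=InverseSqrtDecay(0.001),  # same base value as the solver above
    epsilon=1e-7,                           # the docs note 1e-8 may be a poor default
)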
The learning rate looks a bit high. The curve decreases too fast for my taste and flattens out very soon. I would try 0.0005 or 0.0001 as the base learning rate if I wanted additional performance. You can stop after a few epochs anyway if you see that it is not working.
The question you have to ask yourself, though, is how much performance you need and how close you are to reaching it. You are probably training the network for a specific purpose, and often you can get more performance out of it by increasing its capacity rather than by fine-tuning a learning rate that is already pretty good, if not perfect.
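If you want to act on the advice above about trying 0.0005 or 0.0001, a rough sketch of such a quick probe is shown here; build_model, x_train, y_train, x_val and y_val are placeholders for your own model and data, and three epochs per candidate is an arbitrary budget:

import tensorflow as tf

# Try a few candidate base learning rates for a handful of epochs each and keep
# the one with the lowest validation loss. Everything model/data-related here is
# a placeholder for your own setup.
candidate_lrs = [0.001, 0.0005, 0.0001]
results = {}

for lr in candidate_lrs:
    model = build_model()  # assumed: returns a fresh, uncompiled Keras model
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss="categorical_crossentropy")
    history = model.fit(x_train, y_train,
                        validation_data=(x_val, y_val),
                        epochs=3, verbose=0)  # quit early, as suggested above
    results[lr] = min(history.history["val_loss"])

best_lr = min(results, key=results.get)
print("Best base learning rate so far:", best_lr, "val_loss:", results[best_lr])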