How to determine the learning rate and the variance in a gradient descent algorithm？

Tags:

I started to learn the machine learning last week. when I want to make a gradient descent script to estimate the model parameters, I came across a problem: How to choose a appropriate learning rate and variance。I found that，different (learning rate，variance) pairs may lead to different results, some times you even can't convergence. Also, if change to another training data set, a well-chose （learning rate，variance）pair probably will not work. For example(script below)，when I set the learning rate to 0.001 and variance to 0.00001, for 'data1', I can get the suitable theta0_guess and theta1_guess. But for ‘data2’, they can't make the algorithem convergence, even when I tried dozens of （learning rate，variance）pairs still can't reach to convergence.

So if anybody could tell me that are there some criteria or methods to determine the （learning rate，variance）pair.

import sys

data1 = [(0.000000,95.364693) ,
    (1.000000,97.217205) ,
    (2.000000,75.195834),
    (3.000000,60.105519) ,
    (4.000000,49.342380),
    (5.000000,37.400286),
    (6.000000,51.057128),
    (7.000000,25.500619),
    (8.000000,5.259608),
    (9.000000,0.639151),
    (10.000000,-9.409936),
    (11.000000, -4.383926),
    (12.000000,-22.858197),
    (13.000000,-37.758333),
    (14.000000,-45.606221)]

data2 = [(2104.,400.),
     (1600.,330.),
     (2400.,369.),
     (1416.,232.),
     (3000.,540.)]

def create_hypothesis(theta1, theta0):
    return lambda x: theta1*x + theta0

def linear_regression(data, learning_rate=0.001, variance=0.00001):


    theta0_guess = 1.
    theta1_guess = 1.


    theta0_last = 100.
    theta1_last = 100.

    m = len(data)

    while (abs(theta1_guess-theta1_last) > variance or abs(theta0_guess - theta0_last) > variance):

        theta1_last = theta1_guess
        theta0_last = theta0_guess

        hypothesis = create_hypothesis(theta1_guess, theta0_guess)

        theta0_guess = theta0_guess - learning_rate * (1./m) * sum([hypothesis(point[0]) - point[1] for point in data])
        theta1_guess = theta1_guess - learning_rate * (1./m) * sum([ (hypothesis(point[0]) - point[1]) * point[0] for point in data])   

    return ( theta0_guess,theta1_guess )



points = [(float(x),float(y)) for (x,y) in data1]

res = linear_regression(points)
print res

956

asked May 19 '13 23:05

zhoufanking

1 Answers

Plotting is the best way to see how your algorithm is performing. To see if you have achieved convergence you can plot the evolution of the cost function after each iteration, after a certain given of iteration you will see that it does not improve much you can assume convergence, take a look to the following code:

cost_f = []
while (abs(theta1_guess-theta1_last) > variance or abs(theta0_guess - theta0_last) > variance):

    theta1_last = theta1_guess
    theta0_last = theta0_guess

    hypothesis = create_hypothesis(theta1_guess, theta0_guess)
    cost_f.append((1./(2*m))*sum([ pow(hypothesis(point[0]) - point[1], 2) for point in data]))

    theta0_guess = theta0_guess - learning_rate * (1./m) * sum([hypothesis(point[0]) - point[1] for point in data])
    theta1_guess = theta1_guess - learning_rate * (1./m) * sum([ (hypothesis(point[0]) - point[1]) * point[0] for point in data])   

import pylab
pylab.plot(range(len(cost_f)), cost_f)
pylab.show()

Which will plot the following graphic (execution with learning_rate=0.01, variance=0.00001)

As you can see, after a thousand iteration you don't get much improvement. I normally declare convergence if the cost function decreases less than 0.001 in one iteration, but this just based on my own experience.

For choosing learning rate, the best thing you can do is also plot the cost function and see how it is performing, and always remember these two things:

if the learning rate is too small you will get slow convergence
if the learning rate is too large your cost function may not decrease in every iteration and therefore it will not converge

If you run your code choosing learning_rate > 0.029 and variance=0.001 you will be in the second case, gradient descent doesn't converge, while if you choose values learning_rate < 0.0001, variance=0.001 you will see that your algorithm takes a lot iteration to converge.

Not convergence example with learning_rate=0.03

Slow convergence example with learning_rate=0.0001 enter image description here

107

answered Oct 07 '22 13:10

jabaldonedo

Related questions
                            
                                python zip() function for a matrix [duplicate]
                            
                                Django model field default from model method
                            
                                PyInstaller "ImportError: No module named Pyinstaller"
                            
                                How to check if a zip file is encrypted using python's standard library zipfile?
                            
                                Writing a distributed queue in Amazon's DynamoDB
                            
                                How to draw images in tkinter window
                            
                                Defining a global function in a Python script
                            
                                Find Max in Nested Dictionary
                            
                                What are the use cases for Python's __new__?
                            
                                Debugging slow Django Admin views [closed]
                            
                                Python Object as Dictionary Value [closed]
                            
                                Python image library (PIL), how to compress image into desired file size?
                            
                                Django get_or_create, how to say commit=False
                            
                                Convert JSON to CSV using Python (Idle)
                            
                                Pymongo cursor limit(1) returns more than 1 result
                            
                                Including Local Variables in Django Error Emails
                            
                                How to specify in YAML to always create log file in the project's folder using dictConfig?
                            
                                how to access form data using flask?
                            
                                How to implement a watchdog timer in Python?
                            
                                Where I should put my python scripts in Linux?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How to determine the learning rate and the variance in a gradient descent algorithm？

Tags:

python

machine-learning

gradient-descent

zhoufanking

People also ask

1 Answers

jabaldonedo

Recent Activity

Donate For Us