 

Why won't Perceptron Learning Algorithm converge?

I have implemented the Perceptron Learning Algorithm in Python as below. Even with 500,000 iterations, it still won't converge.

I have a training data matrix X with target vector Y, and a weight vector w to be optimized.

My update rule is:

import numpy as np
import random

# repeat until every training example is classified correctly
max_iterations = 500000
for _ in range(max_iterations):
    # dot product to check for mistakes
    output = [np.sign(np.dot(X[i], w)) == Y[i] for i in range(len(X))]
    if all(output):  # no mistakes left -- converged
        break

    # find index of a mistake (chosen randomly to avoid repeating the same index)
    n = random.randint(0, len(X) - 1)
    while output[n]:  # if this example is classified correctly, choose again
        n = random.randint(0, len(X) - 1)

    # once we have found a mistake, update the weights
    w = w + Y[n] * X[n]

Is this wrong? Or why is it not converging even after 500,000 iterations?

asked Oct 03 '13 by manbearpig

People also ask

What is the condition for convergence of a perceptron learning algorithm?

Perceptron Convergence Theorem: For any finite set of linearly separable labeled examples, the Perceptron Learning Algorithm will halt after a finite number of iterations. In other words, after a finite number of iterations the algorithm yields a weight vector w that classifies all the examples perfectly.
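For reference, the quantitative form of this result (the standard Novikoff bound, stated here as background rather than quoted from the text above) is: if every example satisfies $\|x_i\| \le R$ and some unit vector $w^*$ separates the data with margin $\gamma$, i.e. $y_i \, (w^* \cdot x_i) \ge \gamma$ for all $i$, then the number of weight updates $k$ the perceptron makes is bounded by

$$k \le \left(\frac{R}{\gamma}\right)^2$$

so the algorithm must stop making mistakes after finitely many updates.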

Does the Multilayer perceptron always converge?

If the training set is linearly separable, then the perceptron is guaranteed to converge. Furthermore, there is an upper bound on the number of times the perceptron will adjust its weights during the training.

What is the limitation of perceptron learning rule?

Perceptron networks have several limitations. First, the output values of a perceptron can take on only one of two values (0 or 1) because of the hard-limit transfer function. Second, perceptrons can only classify linearly separable sets of vectors.

Does a perceptron converge for non linearly separable data?

No. Convergence in a finite number of updates, regardless of the initial weight vector, is only guaranteed when the data are linearly separable. If the dataset is not linearly separable, the Perceptron algorithm does not converge: it keeps cycling among some set of weight vectors and never reaches a state with zero mistakes.
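As a concrete illustration (a minimal sketch, not from the original question; the bias-augmented XOR data, the variable names, and the iteration cap are assumptions made for the demo), running the update rule above on XOR never reaches zero mistakes:

import numpy as np
import random

# XOR with +/-1 labels is the classic non-separable example.
# A constant 1 is prepended to each example so the first weight acts as a bias.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
Y = np.array([-1, 1, 1, -1])

w = np.zeros(3)
for it in range(100000):
    mistakes = [np.sign(np.dot(X[i], w)) != Y[i] for i in range(len(X))]
    if not any(mistakes):
        print("converged after", it, "iterations")
        break
    # pick a misclassified example at random and apply the perceptron update
    n = random.choice([i for i, m in enumerate(mistakes) if m])
    w = w + Y[n] * X[n]
else:
    print("never converged -- the weights keep cycling:", w)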


1 Answer

Perceptrons by Minsky and Papert (in)famously demonstrated in 1969 that the perceptron learning algorithm is not guaranteed to converge for datasets that are not linearly separable.

If you're sure that your dataset is linearly separable, you might try adding a bias to each of your data vectors, as described in the related question Perceptron learning algorithm not converging to 0: adding a bias lets the model represent decision boundaries that do not pass through the origin.
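One common way to add that bias (a minimal sketch, assuming X is the asker's 2-D NumPy matrix of examples) is to prepend a constant 1 to every data vector, so the matching component of w plays the role of the bias term:

import numpy as np

# prepend a constant feature of 1 to every example; the corresponding
# weight component then acts as the bias / intercept of the boundary
X_bias = np.hstack([np.ones((len(X), 1)), X])

# the weight vector needs one extra component as well
w = np.zeros(X_bias.shape[1])

Training then proceeds exactly as before, just on X_bias instead of X.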

Alternatively, if you'd like to use a variant of the perceptron learning algorithm that is guaranteed to converge to a margin of specified width, even for datasets that are not linearly separable, have a look at the Averaged Perceptron -- PDF. The averaged perceptron is an approximation to the voted perceptron, which was introduced (as far as I know) in a nice paper by Freund and Schapire, "Large Margin Classification Using the Perceptron Algorithm" -- PDF.

Using an averaged perceptron, you make a copy of the parameter vector after each presentation of a training example during training. The final classifier uses the mean of all parameter vectors.
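A minimal sketch of that idea (assuming X is a 2-D array of bias-augmented examples, Y holds +/-1 labels, and epochs is a parameter invented for this demo rather than taken from the papers above):

import numpy as np

def averaged_perceptron(X, Y, epochs=10):
    """Train a perceptron and return the mean of all parameter snapshots."""
    w = np.zeros(X.shape[1])
    w_sum = np.zeros(X.shape[1])
    count = 0
    for _ in range(epochs):
        for x, y in zip(X, Y):
            if y * np.dot(w, x) <= 0:  # mistake: apply the usual perceptron update
                w = w + y * x
            # record the parameter vector after every presentation of an example
            w_sum += w
            count += 1
    return w_sum / count  # the final classifier uses the averaged weights

Predictions then use np.sign(np.dot(x, w_avg)), exactly as with the ordinary perceptron.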

answered Oct 15 '22 by lmjohns3