
Resilient backpropagation neural network - question about gradient

First I want to say that I'm really new to neural networks and I don't understand them very well ;)

I've made my first C# implementation of a backpropagation neural network. I've tested it on XOR and it seems to work.

Now I would like to change my implementation to use resilient backpropagation (Rprop - http://en.wikipedia.org/wiki/Rprop).

The definition says: "Rprop takes into account only the sign of the partial derivative over all patterns (not the magnitude), and acts independently on each 'weight'."

Could somebody tell me what the partial derivative over all patterns is? And how should I compute this partial derivative for a neuron in a hidden layer?

Thanks a lot

UPDATE:

My implementation is based on this Java code: www.dia.fi.upm.es/~jamartin/downloads/bpnn.java

My backPropagate method looks like this:

public double backPropagate(double[] targets)
    {
        double error, change;

        // calculate error terms for output
        double[] output_deltas = new double[outputsNumber];

        for (int k = 0; k < outputsNumber; k++)
        {

            error = targets[k] - activationsOutputs[k];
            output_deltas[k] = Dsigmoid(activationsOutputs[k]) * error;
        }

        // calculate error terms for hidden
        double[] hidden_deltas = new double[hiddenNumber];

        for (int j = 0; j < hiddenNumber; j++)
        {
            error = 0.0;

            for (int k = 0; k < outputsNumber; k++)
            {
                error = error + output_deltas[k] * weightsOutputs[j, k];
            }

            hidden_deltas[j] = Dsigmoid(activationsHidden[j]) * error;
        }

        //update output weights
        for (int j = 0; j < hiddenNumber; j++)
        {
            for (int k = 0; k < outputsNumber; k++)
            {
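                // change = output_deltas[k] · activationsHidden[j] is the negative
                // gradient -∂E/∂w[j,k] of the error for the current pattern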
                change = output_deltas[k] * activationsHidden[j];
                weightsOutputs[j, k] = weightsOutputs[j, k] + learningRate * change + momentumFactor * lastChangeWeightsForMomentumOutpus[j, k];
                lastChangeWeightsForMomentumOutpus[j, k] = change;

            }
        }

        // update input weights
        for (int i = 0; i < inputsNumber; i++)
        {
            for (int j = 0; j < hiddenNumber; j++)
            {
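                // likewise, change here is -∂E/∂w[i,j] for the current pattern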
                change = hidden_deltas[j] * activationsInputs[i];
                weightsInputs[i, j] = weightsInputs[i, j] + learningRate * change + momentumFactor * lastChangeWeightsForMomentumInputs[i, j];
                lastChangeWeightsForMomentumInputs[i, j] = change;
            }
        }

        // calculate error
        error = 0.0;

        for (int k = 0; k < outputsNumber; k++)
        {
            error = error + 0.5 * (targets[k] - activationsOutputs[k]) * (targets[k] - activationsOutputs[k]);
        }

        return error;
    }

So can I use the change = hidden_deltas[j] * activationsInputs[i] value as the gradient (partial derivative) for checking the sign?

Rafal Spacjer asked May 19 '10


People also ask

Does backpropagation learning is based on gradient?

Standard backpropagation is a gradient descent algorithm in which the network weights are moved along the negative of the gradient of the performance function. The combination of weights that minimizes the error function is considered a solution to the learning problem.
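For reference, a minimal sketch of the update this describes (weight, learningRate and gradE are illustrative names, not from the question's code):

    // vanilla gradient descent: move the weight along the negative gradient
    weight = weight - learningRate * gradE;   // gradE = ∂E/∂w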

What is an advantage of resilient backpropagation over backpropagation?

But it has two main advantages over backpropagation: First, training with Rprop is often faster than training with backpropagation. Second, Rprop doesn't require you to specify any free parameter values, as opposed to backpropagation, which needs values for the learning rate (and usually an optional momentum term).

What is the purpose of resilient backpropagation?

The purpose of the resilient backpropagation (Rprop) training algorithm is to eliminate these harmful effects of the magnitudes of the partial derivatives. Only the sign of the derivative can determine the direction of the weight update; the magnitude of the derivative has no effect on the weight update.
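In code, the sign-only idea boils down to something like this sketch (gradient and stepSize are illustrative names):

    // only the sign of ∂E/∂w chooses the direction; its magnitude is ignored
    weight = weight - Math.Sign(gradient) * stepSize;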

What can backpropagation be used to compute gradients for?

Backpropagation is an algorithm used in machine learning that works by calculating the gradient of the loss function, which points us in the direction of the value that minimizes the loss function. It relies on the chain rule of calculus to calculate the gradient backward through the layers of a neural network.
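Applied to an output weight weightsOutputs[j, k] from the question's code, that chain rule reads ∂E/∂w[j,k] = ∂E/∂o_k · ∂o_k/∂net_k · ∂net_k/∂w[j,k] = -(targets[k] - activationsOutputs[k]) · Dsigmoid(activationsOutputs[k]) · activationsHidden[j], which is exactly -output_deltas[k] * activationsHidden[j], i.e. -change.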


1 Answer

I think "over all patterns" simply means "in every iteration"... take a look at the RPROP paper.

For the partial derivative: you've already implemented the normal backpropagation algorithm. This is a method for efficiently calculating the gradient... there you calculate the δ values for the individual neurons, which are in fact the negative ∂E/∂w values, i.e. the partial derivatives of the global error as a function of the weights.
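Concretely, "over all patterns" in the batch variant means summing the per-pattern gradients over the whole training set before looking at the sign. A rough sketch against the question's arrays (gradInputs is an assumed extra accumulator, same shape as weightsInputs):

    // per pattern: compute the deltas as in backPropagate, but do NOT update
    // the weights; accumulate the gradient instead (change is -∂E/∂w)
    for (int i = 0; i < inputsNumber; i++)
        for (int j = 0; j < hiddenNumber; j++)
            gradInputs[i, j] += -hidden_deltas[j] * activationsInputs[i];

    // once per epoch: feed Math.Sign(gradInputs[i, j]) to the Rprop update,
    // then reset gradInputs to zero for the next epoch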

So instead of scaling the update by these gradient values, you adapt a per-weight step size by one of two constants (η+ or η−), depending on whether the sign of the derivative has changed.
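A minimal sketch of that update for a single weight, using the common iRprop− variant (η+ = 1.2 and η− = 0.5 are the usual defaults; grad, prevGrad, stepSize and the step bounds are illustrative names):

    const double etaPlus = 1.2, etaMinus = 0.5;   // standard Rprop constants
    const double stepMax = 50.0, stepMin = 1e-6;  // bounds for the step size

    if (grad * prevGrad > 0)          // sign unchanged: accelerate
        stepSize = Math.Min(stepSize * etaPlus, stepMax);
    else if (grad * prevGrad < 0)     // sign flipped: we overshot, back off
    {
        stepSize = Math.Max(stepSize * etaMinus, stepMin);
        grad = 0.0;                   // iRprop-: forces no update this step
    }

    weight -= Math.Sign(grad) * stepSize;  // direction from the sign only
    prevGrad = grad;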

king_nak answered Oct 10 '22