First, I want to say that I'm really new to neural networks and I don't understand them very well ;)
I've made my first C# implementation of a backpropagation neural network. I've tested it on XOR and it seems to work.
Now I would like change my implementation to use resilient backpropagation (Rprop - http://en.wikipedia.org/wiki/Rprop).
The definition says: "Rprop takes into account only the sign of the partial derivative over all patterns (not the magnitude), and acts independently on each weight."
Could somebody tell me what the partial derivative over all patterns is? And how should I compute this partial derivative for a neuron in a hidden layer?
Thanks a lot
UPDATE:
My implementation is based on this Java code: www_.dia.fi.upm.es/~jamartin/downloads/bpnn.java
My backPropagate method looks like this:
public double backPropagate(double[] targets)
{
    double error, change;

    // calculate error terms for output
    double[] output_deltas = new double[outputsNumber];
    for (int k = 0; k < outputsNumber; k++)
    {
        error = targets[k] - activationsOutputs[k];
        output_deltas[k] = Dsigmoid(activationsOutputs[k]) * error;
    }

    // calculate error terms for hidden
    double[] hidden_deltas = new double[hiddenNumber];
    for (int j = 0; j < hiddenNumber; j++)
    {
        error = 0.0;
        for (int k = 0; k < outputsNumber; k++)
        {
            error = error + output_deltas[k] * weightsOutputs[j, k];
        }
        hidden_deltas[j] = Dsigmoid(activationsHidden[j]) * error;
    }

    // update output weights
    for (int j = 0; j < hiddenNumber; j++)
    {
        for (int k = 0; k < outputsNumber; k++)
        {
            change = output_deltas[k] * activationsHidden[j];
            weightsOutputs[j, k] = weightsOutputs[j, k] + learningRate * change + momentumFactor * lastChangeWeightsForMomentumOutpus[j, k];
            lastChangeWeightsForMomentumOutpus[j, k] = change;
        }
    }

    // update input weights
    for (int i = 0; i < inputsNumber; i++)
    {
        for (int j = 0; j < hiddenNumber; j++)
        {
            change = hidden_deltas[j] * activationsInputs[i];
            weightsInputs[i, j] = weightsInputs[i, j] + learningRate * change + momentumFactor * lastChangeWeightsForMomentumInputs[i, j];
            lastChangeWeightsForMomentumInputs[i, j] = change;
        }
    }

    // calculate error
    error = 0.0;
    for (int k = 0; k < outputsNumber; k++)
    {
        error = error + 0.5 * (targets[k] - activationsOutputs[k]) * (targets[k] - activationsOutputs[k]);
    }
    return error;
}
So can I use the change = hidden_deltas[j] * activationsInputs[i]
value as the gradient (partial derivative) for checking the sign?
The Backpropagation Algorithm
Standard backpropagation is a gradient descent algorithm in which the network weights are moved along the negative of the gradient of the performance function. The combination of weights that minimizes the error function is considered a solution to the learning problem.
Rprop has two main advantages over standard backpropagation: first, training with Rprop is often faster than training with backpropagation. Second, Rprop doesn't require you to specify any free parameter values, as opposed to backpropagation, which needs a value for the learning rate (and usually an optional momentum term).
The purpose of the resilient backpropagation (Rprop) training algorithm is to eliminate the harmful effects of the magnitudes of the partial derivatives. Only the sign of the derivative determines the direction of the weight update; the magnitude of the derivative has no effect on the size of the step.
Backpropagation is an algorithm used in machine learning that works by calculating the gradient of the loss function with respect to the weights; moving the weights against that gradient reduces the loss. It relies on the chain rule of calculus to propagate the gradient backward through the layers of a neural network.
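To make the difference between the two update rules concrete, here is a minimal sketch. It assumes a per-weight gradient value (∂E/∂w) has already been computed by backpropagation; the names weight, gradient, learningRate and stepSize are just hypothetical placeholders, not anything from your code:

// Standard backpropagation / gradient descent: the step length depends on
// the magnitude of the partial derivative.
weight = weight - learningRate * gradient;

// Rprop: only the sign of the partial derivative is used; each weight keeps
// its own step size, which is adapted separately over the epochs.
weight = weight - Math.Sign(gradient) * stepSize;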
I think the "over all patterns" simply means "in every iteration"... take a look at the RPROP paper
For the partial derivative: you've already implemented the normal backpropagation algorithm, which is a method for efficiently calculating the gradient. There you calculate the δ values for the individual neurons; multiplied by the corresponding input activations, these give exactly the negative ∂E/∂w values, i.e. the partial derivatives of the global error with respect to the weights.
So instead of multiplying these derivatives by a learning rate, you keep a separate step size per weight and multiply it by one of two constants (η+ or η-), depending on whether the sign of the derivative has changed since the last iteration.
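A rough sketch of how such an Rprop update could look for your input-to-hidden weights is below. It assumes you accumulate your change values (hidden_deltas[j] * activationsInputs[i], i.e. the negative gradient) over all training patterns into gradientInputs, and keep prevGradientInputs and stepSizeInputs arrays between epochs; those array names and the constants are hypothetical placeholders, and the weight-backtracking step from the full Rprop paper is omitted:

// typical Rprop constants from the paper; tune as needed
const double etaPlus = 1.2, etaMinus = 0.5;
const double stepMax = 50.0, stepMin = 1e-6;

for (int i = 0; i < inputsNumber; i++)
{
    for (int j = 0; j < hiddenNumber; j++)
    {
        // gradientInputs[i, j] = sum of hidden_deltas[j] * activationsInputs[i]
        // over all training patterns (the "over all patterns" part)
        double signChange = prevGradientInputs[i, j] * gradientInputs[i, j];

        if (signChange > 0)
        {
            // same sign as last epoch: accelerate
            stepSizeInputs[i, j] = Math.Min(stepSizeInputs[i, j] * etaPlus, stepMax);
        }
        else if (signChange < 0)
        {
            // sign flipped: we jumped over a minimum, so slow down
            stepSizeInputs[i, j] = Math.Max(stepSizeInputs[i, j] * etaMinus, stepMin);
        }

        // move the weight by its own step size; since 'change' is already the
        // negative gradient, adding in the direction of its sign reduces the error
        weightsInputs[i, j] += Math.Sign(gradientInputs[i, j]) * stepSizeInputs[i, j];

        prevGradientInputs[i, j] = gradientInputs[i, j];
    }
}

The same scheme applies to the hidden-to-output weights; only the accumulated gradient array and the weight matrix change.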