
Neural Network Diverging instead of converging

I have implemented a neural network (using CUDA) with 2 layers (2 neurons per layer). I'm trying to make it learn 2 simple quadratic polynomial functions using backpropagation.

But instead of converging, it is diverging (the output is becoming infinity).

Here are some more details about what I've tried:

  • I had set the initial weights to 0, but since it was diverging I have randomized the initial weights
  • I read that a neural network might diverge if the learning rate is too high, so I reduced the learning rate to 0.000001
  • The two functions I am trying to get it to learn are 3*i + 7*j + 9 and j*j + i*i + 24 (I am giving the network i and j as input; see the sketch after this list)
  • I had implemented it as a single layer previously and that could approximate the polynomial functions better
  • I am thinking of implementing momentum in this network but I'm not sure it would help it learn
  • I am using a linear (as in no) activation function
  • There is oscillation in the beginning, but the output starts diverging the moment any of the weights becomes greater than 1
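
To make the setup concrete, here is a minimal NumPy sketch of the computation described above. My actual implementation is in CUDA; the random seed and the input range used here are just placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

# 2 inputs -> 2 hidden neurons -> 2 outputs, all linear (no activation)
W1 = rng.uniform(-0.5, 0.5, size=(2, 2)); b1 = np.zeros(2)
W2 = rng.uniform(-0.5, 0.5, size=(2, 2)); b2 = np.zeros(2)

lr = 0.000001  # the learning rate mentioned above

def targets(i, j):
    # the two functions the network should learn
    return np.array([3 * i + 7 * j + 9, j * j + i * i + 24])

for step in range(10000):
    i, j = rng.uniform(-10.0, 10.0, size=2)  # placeholder input range
    x = np.array([i, j])
    t = targets(i, j)

    # forward pass (linear throughout)
    h = W1 @ x + b1
    y = W2 @ h + b2

    # backpropagation for squared error E = 0.5 * ||y - t||^2
    dy = y - t                      # dE/dy
    dW2 = np.outer(dy, h); db2 = dy
    dh = W2.T @ dy
    dW1 = np.outer(dh, x); db1 = dh

    # gradient descent step
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1
```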

I have checked and rechecked my code but there doesn't seem to be any kind of issue with it.

So here's my question: what is going wrong here?

Any pointers will be appreciated.

Asked Aug 01 '13 by Shayan RC




2 Answers

  1. If the problem you are trying to solve is of the classification type, try a 3-layer network (3 is enough, according to Kolmogorov). The connections from inputs A and B to a hidden node C (C = A*wa + B*wb) represent a line in AB space. That line divides the correct and incorrect half-spaces. The connections from the hidden layer to the output put the hidden-layer values in relation to each other, giving you the desired output.

  2. Depending on your data, the error function may look like a hair comb, so implementing momentum should help (see the sketch after this list). Keeping the learning rate at 1 proved optimal for me.

  3. Your training sessions will get stuck in local minima every once in a while, so network training will consist of a few consecutive sessions. If a session exceeds the maximum number of iterations, or the amplitude is too high, or the error is obviously high, the session has failed; start another.

  4. At the beginning of each session, reinitialize your weights with random values in the range (-0.5, +0.5).

  5. It really helps to chart your error descent. You will get that "Aha!" factor.
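
A minimal sketch of the momentum update from point 2 and the reinitialization from point 4. The function names and the velocity bookkeeping here are illustrative, not any particular library's API:

```python
import numpy as np

rng = np.random.default_rng()

def init_weights(shape):
    # Point 4: random values in the range (-0.5, +0.5) at the start of each session.
    return rng.uniform(-0.5, 0.5, size=shape)

def momentum_step(w, grad, velocity, lr=1.0, beta=0.9):
    # Point 2: classical momentum. The velocity accumulates recent gradients,
    # which smooths progress across a "hair comb"-shaped error surface.
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity
```

Keep one velocity array per weight matrix and reset it, along with the weights, whenever a session is declared failed and restarted.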

Answered Oct 13 '22 by Lex


The most common reason for neural network code to diverge is that the coder has forgotten to put the negative sign in the weight-update expression.

Another reason could be a problem with the error expression used for calculating the gradients.
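
As a concrete sketch of both points: for a single linear unit y = w · x trained on the squared error E = 0.5 * (y - t)^2, the gradient is (y - t) * x and the update must subtract it (the numbers below are only illustrative):

```python
import numpy as np

# One linear unit y = w @ x with squared error E = 0.5 * (y - t)**2.
w = np.array([0.1, -0.2])
x = np.array([2.0, 3.0])   # illustrative input
t = 5.0                    # illustrative target

y = w @ x
grad = (y - t) * x         # dE/dw for this error expression

lr = 0.01
w_descent = w - lr * grad  # correct: gradient descent (note the minus sign)
w_ascent  = w + lr * grad  # missing minus sign climbs the error surface instead
```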

If neither of these is the issue, then we would need to see the code to answer.

Answered Oct 13 '22 by sidquanto