Weight Initialisation

I plan to use the Nguyen-Widrow Algorithm for an NN with multiple hidden layers. While researching, I found a lot of ambiguities and I wish to clarify them.

The following is pseudocode for the Nguyen-Widrow algorithm:

      Initialize all weights of the hidden layers with random values
      For each hidden layer {
          beta = 0.7 * Math.pow(hiddenNeurons, 1.0 / numberOfInputs);
          For each neuron {
              For each weight {
                  Adjust the weight: divide it by the norm of the neuron's
                  weight vector, then multiply it by beta
              }
          }
      }

I just want to clarify whether the value of hiddenNeurons is the size of the particular hidden layer, or the size of all the hidden layers within the network. The various sources I consulted left me confused on this point.

In other words, if I have a network (3-2-2-2-3) (index 0 is input layer, index 4 is output layer), would the value hiddenNeurons be:

NumberOfNeuronsInLayer(1) + NumberOfNeuronsInLayer(2) + NumberOfNeuronsInLayer(3)

Or just

NumberOfNeuronsInLayer(i), where i is the layer I am currently at

EDIT:

So the hiddenNeurons value would be the size of the current hidden layer, and the number of inputs would be the size of the previous layer?

asked Dec 03 '12 by Goaler444

2 Answers

The Nguyen-Widrow initialization algorithm is the following:

  1. Initialize all weights of the hidden layers with (ranged) random values
  2. For each hidden layer
     2.1 Calculate beta = 0.7 * (#neurons of the current layer) raised to the power 1 / (#inputs to that layer)
     2.2 For each neuron
         2.2.1 For each weight
         2.2.2 Adjust the weight by dividing it by the norm of the neuron's weight vector and multiplying by beta

Encog Java Framework
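
Applied to the (3-2-2-2-3) network from the question, beta is therefore computed per hidden layer from that layer's own size and its fan-in: hidden layer 1 gets beta = 0.7 * 2^(1/3) ≈ 0.88, and hidden layers 2 and 3 each get beta = 0.7 * 2^(1/2) ≈ 0.99.

For concreteness, here is a minimal sketch of the per-neuron scaling in C (the language used in the other answer). It assumes one hidden layer's incoming weights sit in a row-major n_neurons x n_inputs matrix; the names are illustrative, not Encog's API:

    #include <math.h>

    /* Nguyen-Widrow scaling for one hidden layer: w holds already-
       randomized weights, one row of n_inputs weights per neuron. */
    static void nguyen_widrow_layer( float *w, unsigned int n_neurons,
                                     unsigned int n_inputs )
    {
        float beta = 0.7f * powf( (float) n_neurons, 1.0f / (float) n_inputs );
        unsigned int j, i;

        for( j = 0; j < n_neurons; j++ ){
            float *row = w + j * n_inputs;
            float norm = 0.0f;

            for( i = 0; i < n_inputs; i++ )   /* squared length of the row */
                norm += row[i] * row[i];
            norm = sqrtf( norm );

            for( i = 0; i < n_inputs; i++ )   /* rescale the row to length beta */
                row[i] *= beta / norm;
        }
    }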

answered Sep 20 '22 by ThiS

Sounds to me like you want more precise code. Here are some actual code lines from a project I'm participating in. I hope you read C. It's a bit abstracted and simplified: there is a struct nn that holds the neural net data, and you probably have your own abstract data type.

Code lines from my project (somewhat simplified):

    /* Needs <math.h> for powf() and <stdlib.h> for rand(). */
    unsigned int i;
    float *w = nn->the_weight_array;
    float factor = 0.7f * powf( (float) nn->n_hidden, 1.0f / (float) nn->n_input );

    /* Ranged random initialization of all n_input * n_hidden weights */
    for( i = 0; i < nn->n_input * nn->n_hidden; i++ )
        w[i] = random_range( -factor, factor );

    /* Nguyen/Widrow: rescale each block of n_hidden weights */
    w = nn->the_weight_array;
    for( i = nn->n_input; i; i-- ){
        _scale_nguyen_widrow( factor, w, nn->n_hidden );
        w += nn->n_hidden;
    }

Functions called:

    /* Normalize vec to unit length, then scale it by factor. */
    static void _scale_nguyen_widrow( float factor, float *vec, unsigned int size )
    {
        unsigned int i;
        float magnitude = 0.0f;

        for ( i = 0; i < size; i++ )
            magnitude += vec[i] * vec[i];

        magnitude = sqrtf( magnitude );

        for ( i = 0; i < size; i++ )
            vec[i] *= factor / magnitude;
    }

    /* Uniform random float in [min, max]. */
    static inline float random_range( float min, float max )
    {
        float range = fabsf( max - min );
        return ( (float) rand() / (float) RAND_MAX ) * range + min;
    }
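
One practical note: rand() yields the same sequence on every run unless it is seeded once at startup, so the network would otherwise start from identical weights each time. For example:

    #include <stdlib.h>
    #include <time.h>

    int main( void )
    {
        /* Seed once so each run draws different initial weights. */
        srand( (unsigned int) time( NULL ) );
        /* ... build the network and run the initialization above ... */
        return 0;
    }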

Tip:
After you've implemented the Nguyen/Widrow weight initialization, you can add a small line to the forward calculation that dumps each activation to a file. Then you can check how well the activations hit the useful range of the activation function: find their mean and standard deviation, and even plot them with a plotting tool, e.g. gnuplot. (You will need a plotting tool like gnuplot anyway, for plotting error rates etc.) I did that for my implementation; the plots came out nice, and the initial learning became much faster using Nguyen/Widrow in my project. A sketch of such a dumping helper follows below.
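
A minimal sketch of such a helper, assuming one layer's activations are available as a plain float array (the function name and layout are my assumptions, not code from the project above):

    #include <math.h>
    #include <stdio.h>

    /* Append one layer's activations to fp (one value per line, easy to
       plot with gnuplot) and print their mean and standard deviation. */
    static void report_activations( const float *act, unsigned int n, FILE *fp )
    {
        unsigned int i;
        float mean = 0.0f, var = 0.0f;

        for( i = 0; i < n; i++ ){
            fprintf( fp, "%f\n", act[i] );
            mean += act[i];
        }
        mean /= (float) n;

        for( i = 0; i < n; i++ )
            var += (act[i] - mean) * (act[i] - mean);
        var /= (float) n;

        fprintf( stderr, "mean=%f  stddev=%f\n", mean, sqrtf( var ) );
    }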

PS: I'm not sure my implementation is correct according to Nguyen and Widrow's intentions. I'm not even sure I care, as long as it improves the initial learning.

Good luck,
-Øystein

answered Sep 22 '22 by Øystein Schønning-Johansen