Neural Network Backpropagation implementation issues

I've been reading up quite a bit on neural networks and training them with backpropagation, primarily this Coursera course, with additional reading from here and here. I thought I had a pretty solid grasp of the core algorithm, but my attempt to build a backpropagation-trained neural net hasn't quite worked out and I'm not sure why.

The code is in C++ with no vectorisation as of yet.

I wanted to build a simple network with 2 input neurons, 1 hidden neuron and 1 output neuron to model the AND function, just to understand how the concepts worked before moving on to a more complex example. My forward propagation code worked when I hand-coded the values for the weights and biases.

float NeuralNetwork::ForwardPropagte(const float *dataInput)
{
    // Write the input data into the input layer
    int number = 0;
    for (auto & node : m_Network[0])
    {
        node->input = dataInput[number++];
    }

    // For each layer in the network
    int layerIndex = 0;
    for (auto & layer : m_Network)
    {
        // For each neuron in the layer
        for (auto & neuron : layer)
        {
            float activation;
            if (layerIndex != 0)
            {
                // Hidden/output layers: add the bias and squash through the sigmoid
                neuron->input += neuron->bias;
                activation = Sigmoid(neuron->input);
            }
            else
            {
                // Input layer: pass the raw input straight through
                activation = neuron->input;
            }

            // Accumulate the weighted activation into each downstream neuron's input
            for (auto & pair : neuron->outputNeuron)
            {
                pair.first->input += static_cast<float>(pair.second) * activation;
            }
        }
        ++layerIndex;
    }

    return Sigmoid(m_Network[m_Network.size() - 1][0]->input);
}

Some of these variables are fairly poorly named, but basically neuron->outputNeuron is a vector of pairs: the first element is a pointer to the next neuron and the second is the weight value. neuron->input is the "z" value in the neural network equation, the sum of all the weights*activations plus the bias. Sigmoid is given by:

float NeuralNetwork::Sigmoid(float value) const
{
    return 1.0f/(1.0f + exp(-value));
}

These two appear to work as intended. After a pass over the network (and again after backpropagation) all the 'z' or 'neuron->input' values are reset to zero.
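
For reference, the snippets assume a per-neuron structure along these lines. This is only a simplified sketch inferred from the code above; the exact layout in my code may differ:

#include <vector>
#include <utility>

// Assumed shape of the per-neuron data the snippets operate on
struct Neuron
{
    float input = 0.0f;   // the "z" value: sum of incoming weight*activation, plus the bias
    float bias  = 0.0f;
    float error = 0.0f;   // the delta computed during backpropagation

    // Outgoing connections: a pointer to a neuron in the next layer and the weight on that edge
    std::vector<std::pair<Neuron*, float>> outputNeuron;
};

// The network itself would then be something like:
// std::vector<std::vector<Neuron*>> m_Network;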

I then train the network following the pseudo-code below. The training code is run multiple times.

for trainingExample=0 to m // m = number of training examples
   perform forward propagation to calculate hyp(x)
   calculate cost delta of last layer
         delta = y - hyp(x)
   use the delta of the output to calculate delta for all layers
   move over the network adjusting the weights based on this value
   reset network
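
For comparison, the textbook per-example gradient-descent update that this pseudo-code is aiming for can be written out for a single sigmoid neuron as follows. This is just a standalone sketch (with an assumed learning rate, which my network code does not currently use), not the code under test:

#include <cmath>

// One gradient-descent step for a lone sigmoid neuron with squared-error cost
void UpdateSingleNeuron(float input, float target,
                        float &weight, float &bias, float learningRate)
{
    float z = weight * input + bias;
    float a = 1.0f / (1.0f + std::exp(-z));      // forward pass

    float delta = (a - target) * a * (1.0f - a); // dCost/dz for 0.5*(a - target)^2

    weight -= learningRate * delta * input;      // subtract the gradient
    bias   -= learningRate * delta;              // bias acts like a weight on a constant input of 1
}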

The actual code is here:

void NeuralNetwork::TrainNetwork(const std::vector<std::pair<std::pair<float,float>,float>> & trainingData)
{
    for (int i = 0; i < 100; ++i)
    {
        for (auto & trainingSet : trainingData)
        {
            float x[2] = {trainingSet.first.first,trainingSet.first.second};
            float y      = trainingSet.second;
            float estimatedY = ForwardPropagte(x);

            m_Network[m_Network.size()-1][0]->error = estimatedY - y;
            CalculateError();
            RunBackpropagation();
            ResetActivations();
        }
    }
}

With the backpropagation function given by:

void NeuralNetwork::RunBackpropagation()
{
    for (int index = m_Network.size()-1; index >= 0; --index)
    {
        for(auto &node : m_Network[index])
        {
            // Again where the "outputNeuron" is a list of the next layer of neurons and associated weights
            for (auto &weight : node->outputNeuron)
            {
                weight.second += weight.first->error*Sigmoid(node->input);
            }
            node->bias = node->error; // I'm not sure how to adjust the bias, some of the formulas seemed to point to this. Is it correct?
        }
    }
}

and the cost calculated by:

void NeuralNetwork::CalculateError()
{
    for (int index = m_Network.size()-2; index > 0; --index)
    {
        for(auto &node : m_Network[index])
        {
            node->error = 0.0f;

            float sigmoidPrime = Sigmoid(node->input)*(1 - Sigmoid(node->input));

            for (auto &weight : node->outputNeuron)
            {
                node->error += (weight.first->error*weight.second)*sigmoidPrime;
            }
        }
    }   
}
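
For reference, the sigmoidPrime computed inline above is the standard derivative sigma'(z) = sigma(z) * (1 - sigma(z)). Factored out as a helper it would look something like this (SigmoidPrime is not a function in my current code, just shown for clarity):

float NeuralNetwork::SigmoidPrime(float value) const
{
    const float s = Sigmoid(value);
    return s * (1.0f - s);
}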

I randomize the weights and run it on the data set:

    x = {0.0f,0.0f} y =0.0f
    x = {1.0f,0.0f} y =0.0f
    x = {0.0f,1.0f} y =0.0f
    x = {1.0f,1.0f} y =1.0f 
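
Passed in as the nested pairs that TrainNetwork expects, that data set looks like this (the variable names here are just illustrative, and constructing the network object is omitted):

std::vector<std::pair<std::pair<float, float>, float>> andTrainingData = {
    {{0.0f, 0.0f}, 0.0f},
    {{1.0f, 0.0f}, 0.0f},
    {{0.0f, 1.0f}, 0.0f},
    {{1.0f, 1.0f}, 1.0f},
};

network.TrainNetwork(andTrainingData);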

Of course I shouldn't be training and testing with the same data set, but I just wanted to get the basic backpropagation algorithm up and running. When I run this code I see the weights/biases are as follows:

Layer 0
    Bias 0.111129
    NeuronWeight 0.058659
    Bias -0.037814
    NeuronWeight -0.018420
Layer 1
    Bias 0.016230
    NeuronWeight -0.104935
Layer 2
    Bias 0.080982

The training set runs and the mean squared error of delta[outputLayer] looks something like:

Error: 0.156954
Error: 0.152529
Error: 0.213887
Error: 0.305257
Error: 0.359612
Error: 0.373494
Error: 0.374910
Error: 0.374995
Error: 0.375000

... remains at this value for ever...

And the final weights look like this (they always end up at roughly these values):

Layer 0
    Bias 0.000000
    NeuronWeight 15.385233
    Bias 0.000000
    NeuronWeight 16.492933
Layer 1
    Bias 0.000000
    NeuronWeight 293.518585
Layer 2
    Bias 0.000000

I accept that this may seem like quite a roundabout way of learning neural networks, and the implementation is (at the moment) far from optimal. But can anyone spot any point where I make an invalid assumption, or where either the implementation or the formula is wrong?

EDIT

Thanks for the feedback on the bias values: I stopped them being applied to the input layer and stopped passing the input layer through the sigmoid function. Additionally, my sigmoid-prime function was invalid. But the network still isn't working. I've updated the error and output above with what happens now.

asked Nov 17 '15 by Davors72



2 Answers

As lejilot said, you have a lot of biases there. You don't need a bias in the last layer: it's an output layer, and a bias must be connected to a neuron's input, not to its output. Take a look at the following image:

In this image you can see that there is just one bias per layer, except for the last one, where there is no need for a bias.

Here you can read a very intuitive approach to neural networks. It is written in Python, but it can help you understand some neural network concepts better.
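
For the 2-1-1 network in the question, the point about biases can be written out directly like this. It is only an illustrative sketch; the weight and bias names are made up rather than taken from the question's code:

#include <cmath>

// One bias feeds each layer's *inputs*: the first bias contributes to the hidden
// neuron's z, the second to the output neuron's z. The output layer needs no
// bias of its own hanging off its output.
float ForwardAnd(float x1, float x2,
                 float wX1H, float wX2H, float biasToHidden,
                 float wHO, float biasToOutput)
{
    auto sigmoid = [](float z) { return 1.0f / (1.0f + std::exp(-z)); };

    float hiddenA = sigmoid(wX1H * x1 + wX2H * x2 + biasToHidden);
    float outputA = sigmoid(wHO * hiddenA + biasToOutput);
    return outputA;
}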

answered Sep 29 '22 by Guillem Cucurull


I solved my issue (beyond the initial bias/sigmoid-prime issues above): I started subtracting from, instead of adding to, the weights. The sources I looked at had a minus sign inside the delta calculation which I don't have, but I had kept their format of adding the (supposedly negated) value to the weights. Additionally, I was puzzled about what to do with the bias and misread one source which said to assign the error to it. I see now that the intuition is to treat the bias like a normal weight, but one whose input is the constant 1 instead of z. After adding these changes, iterating over the training set ~1000 times could model simple bitwise expressions like OR and AND.
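
A minimal sketch of those two changes applied to the RunBackpropagation function from the question looks roughly like this (the learning rate is an assumed value, and the exact final code may differ):

void NeuralNetwork::RunBackpropagation()
{
    const float learningRate = 0.5f; // assumed value, not taken from the original post

    for (int index = m_Network.size() - 1; index >= 0; --index)
    {
        for (auto & node : m_Network[index])
        {
            // Activation of this node: raw input for the input layer, sigmoid otherwise
            float activation = (index == 0) ? node->input : Sigmoid(node->input);

            // Subtract the gradient contribution instead of adding it
            for (auto & weight : node->outputNeuron)
            {
                weight.second -= learningRate * weight.first->error * activation;
            }

            // The bias behaves like a weight whose input is the constant 1
            if (index != 0)
            {
                node->bias -= learningRate * node->error;
            }
        }
    }
}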

answered Sep 29 '22 by Davors72