
Why is a simple 2-layer Neural Network unable to learn 0,0 sequence?

While going through an example of a tiny 2-layer neural network, I noticed a result that I cannot explain.

Imagine we have the following dataset with the corresponding labels:

[0,1] -> [0]
[0,1] -> [0]
[1,0] -> [1]
[1,0] -> [1]

Let's create a tiny 2-layer NN which will learn to predict the outcome of a two-number sequence, where each number can be 0 or 1. We will train this NN on the dataset above.

    import numpy as np

    # compute sigmoid nonlinearity
    def sigmoid(x):
        output = 1 / (1 + np.exp(-x))
        return output

    # convert output of sigmoid function to its derivative
    def sigmoid_to_deriv(output):
        return output * (1 - output)

    def predict(inp, weights):
        print(inp, sigmoid(np.dot(inp, weights)))

    # input dataset
    X = np.array([ [0,1],
                   [0,1],
                   [1,0],
                   [1,0]])
    # output dataset
    Y = np.array([[0,0,1,1]]).T

    np.random.seed(1)

    # init weights randomly with mean 0
    weights0 = 2 * np.random.random((2,1)) - 1

    for i in range(10000):
        # forward propagation
        layer0 = X
        layer1 = sigmoid(np.dot(layer0, weights0))
        # compute the error
        layer1_error = layer1 - Y

        # gradient descent
        # calculate the slope at current x position
        layer1_delta = layer1_error * sigmoid_to_deriv(layer1)
        weights0_deriv = np.dot(layer0.T, layer1_delta)
        # change x by the negative of the slope (x = x - slope)
        weights0 -= weights0_deriv

    print('INPUT   PREDICTION')
    predict([0,1], weights0)
    predict([1,0], weights0)
    # test prediction of the unknown data
    predict([1,1], weights0)
    predict([0,0], weights0)

After we've trained this NN, we test it.

INPUT   PREDICTION
[0, 1] [ 0.00881315]
[1, 0] [ 0.99990851]
[1, 1] [ 0.5]
[0, 0] [ 0.5]

OK, 0,1 and 1,0 are what we would expect. The predictions for 0,0 and 1,1 are also explainable: our NN just didn't have training data for these cases, so let's add them to our training dataset:

[0,1] -> [0]
[0,1] -> [0]
[1,0] -> [1]
[1,0] -> [1]
[0,0] -> [0]
[1,1] -> [1]

Retrain the network and test it again!

INPUT   PREDICTION
[0, 1] [ 0.00881315]
[1, 0] [ 0.99990851]
[1, 1] [ 0.9898148]
[0, 0] [ 0.5]
Wait, why is [0,0] still 0.5?

This means the NN is still uncertain about 0,0, just as it was uncertain about 1,1 until we trained on it.

asked Jul 07 '16 by minerals



1 Answer

The classification is actually right. The point is that the net was able to separate the data it was trained on.

Now you need to apply a step function to map the sigmoid output to a class of 0 or 1.

In your case, 0.5 seems to be a good threshold.
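For instance, a minimal sketch of that step (the predict_class helper and its threshold argument are illustrative names, not part of the original code; it reuses sigmoid from the question):

    def predict_class(inp, weights, threshold=0.5):
        # threshold the continuous sigmoid output into a hard 0 or 1 label
        p = sigmoid(np.dot(inp, weights))
        return int(p >= threshold)

    # e.g. predict_class([1,0], weights0) -> 1, predict_class([0,1], weights0) -> 0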

EDIT:

You need to add a bias to the code. Without a bias term, the input [0,0] always yields np.dot([0,0], weights0) = 0, and sigmoid(0) = 0.5, so no amount of training can move the prediction for [0,0] away from 0.5.
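You can see this directly with a tiny check (the weight values here are arbitrary, purely for illustration):

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    # whatever the weights are, a no-bias net maps [0,0] to 0.5
    for w in (np.array([[0.3], [-2.0]]), np.array([[100.0], [7.0]])):
        print(sigmoid(np.dot([0, 0], w)))  # prints [0.5] both times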

The fix is to append a constant 1 to every input sample, so the third weight acts as the bias:

    # input dataset with a constant bias input of 1 appended to each sample
    X = np.array([ [0,1,1],
                   [0,1,1],
                   [1,0,1],
                   [1,0,1]])

    # init weights randomly with mean 0 (2 input weights + 1 bias weight)
    weights0 = 2 * np.random.random((3,1)) - 1
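For completeness, a minimal end-to-end sketch of the retrain with the bias column, assuming the same training loop as in the question (exact output values will differ from the ones quoted above, but the prediction for [0,0] should now move toward 0 instead of being stuck at 0.5):

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_to_deriv(output):
        return output * (1 - output)

    # extended dataset from the question, with a bias input of 1 on every sample
    X = np.array([ [0,1,1],
                   [0,1,1],
                   [1,0,1],
                   [1,0,1],
                   [0,0,1],
                   [1,1,1]])
    Y = np.array([[0,0,1,1,0,1]]).T

    np.random.seed(1)
    weights0 = 2 * np.random.random((3,1)) - 1

    for i in range(10000):
        layer1 = sigmoid(np.dot(X, weights0))
        layer1_delta = (layer1 - Y) * sigmoid_to_deriv(layer1)
        weights0 -= np.dot(X.T, layer1_delta)

    # append the bias input of 1 when predicting, too
    for inp in ([0,1,1], [1,0,1], [1,1,1], [0,0,1]):
        print(inp, sigmoid(np.dot(inp, weights0)))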
answered Oct 07 '22 by Alvaro Joao