While working through an example of a tiny 2-layer neural network, I noticed a result that I cannot explain.
Imagine we have the following dataset with the corresponding labels:
[0,1] -> [0]
[0,1] -> [0]
[1,0] -> [1]
[1,0] -> [1]
Let's create a tiny 2-layer NN that learns to predict the outcome of a two-number sequence, where each number can be 0 or 1. We train this NN on the dataset above.
import numpy as np

# compute sigmoid nonlinearity
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# convert output of sigmoid function to its derivative
def sigmoid_to_deriv(output):
    return output * (1 - output)

# print the network's prediction for a single input
def predict(inp, weights):
    print(inp, sigmoid(np.dot(inp, weights)))

# input dataset
X = np.array([[0, 1],
              [0, 1],
              [1, 0],
              [1, 0]])

# output dataset
Y = np.array([[0, 0, 1, 1]]).T

np.random.seed(1)

# init weights randomly with mean 0
weights0 = 2 * np.random.random((2, 1)) - 1

for i in range(10000):
    # forward propagation
    layer0 = X
    layer1 = sigmoid(np.dot(layer0, weights0))

    # compute the error
    layer1_error = layer1 - Y

    # gradient descent:
    # calculate the slope at the current position
    layer1_delta = layer1_error * sigmoid_to_deriv(layer1)
    weights0_deriv = np.dot(layer0.T, layer1_delta)
    # move the weights by the negative of the slope (w = w - slope)
    weights0 -= weights0_deriv

print('INPUT PREDICTION')
predict([0, 1], weights0)
predict([1, 0], weights0)
# test prediction on unseen data
predict([1, 1], weights0)
predict([0, 0], weights0)
After we've trained this NN, we test it:
INPUT PREDICTION
[0, 1] [ 0.00881315]
[1, 0] [ 0.99990851]
[1, 1] [ 0.5]
[0, 0] [ 0.5]
OK, the results for [0,1] and [1,0] are what we would expect. The predictions for [0,0] and [1,1] are also explainable: our NN just didn't have training data for these cases, so let's add them to the training dataset (updated arrays shown in code after the list):
[0,1] -> [0]
[0,1] -> [0]
[1,0] -> [1]
[1,0] -> [1]
[0,0] -> [0]
[1,1] -> [1]
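Only the training arrays change; a sketch of the updated dataset in code (everything else stays the same):

# extended input dataset
X = np.array([[0, 1],
              [0, 1],
              [1, 0],
              [1, 0],
              [0, 0],
              [1, 1]])

# extended output dataset
Y = np.array([[0, 0, 1, 1, 0, 1]]).T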
Retrain the network and test it again!
INPUT PREDICTION
[0, 1] [ 0.00881315]
[1, 0] [ 0.99990851]
[1, 1] [ 0.9898148]
[0, 0] [ 0.5]
This means the NN is still uncertain about [0,0], just as it was uncertain about [1,1] before we trained on it. Why doesn't the added [0,0] -> [0] example fix this?
Neural networks can fail to learn a function; most often the cause is a network topology that is too simple to model the function. Try a random shuffle of the training set (without breaking the association between inputs and outputs) and see whether the training loss goes down. Finally, the best way to rule out training-set issues is to try another training set.
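For example, one way to shuffle X and Y together with NumPy (a sketch, assuming the arrays defined above):

perm = np.random.permutation(len(X))   # random row order
X_shuffled = X[perm]
Y_shuffled = Y[perm]                   # same order, so input/label pairs stay aligned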
The classification is actually right: the net was able to separate the test set. Now you need to apply a step function to classify the output as 0 or 1. In your case, 0.5 seems to be a good threshold.
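For instance, a minimal sketch of such a step on top of the existing predict logic (the helper name classify is mine, not part of the original code):

def classify(inp, weights, threshold=0.5):
    # threshold the sigmoid output: 1 if at or above the threshold, else 0
    prob = sigmoid(np.dot(inp, weights)).item()
    return 1 if prob >= threshold else 0

# e.g. classify([1, 0], weights0) gives 1 for the run shown above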
EDIT:
You need to add a bias to the network. Without one, the input [0,0] always yields np.dot([0, 0], weights0) = 0 and sigmoid(0) = 0.5, no matter what the weights are, so training can never move that prediction.
# input dataset, with a constant bias input of 1 prepended to each example
X = np.array([[1, 0, 1],
              [1, 0, 1],
              [1, 1, 0],
              [1, 1, 0]])

# init weights randomly with mean 0 (three weights now: bias + two inputs)
weights0 = 2 * np.random.random((3, 1)) - 1
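A minimal sketch of retraining with the bias input (reusing sigmoid, sigmoid_to_deriv and predict from above, and including the two extra training examples; exact numbers will vary, but the [0,0] case is no longer pinned at 0.5):

# training data: bias input of 1 prepended, plus the [0,0] and [1,1] examples
X = np.array([[1, 0, 1],
              [1, 0, 1],
              [1, 1, 0],
              [1, 1, 0],
              [1, 0, 0],
              [1, 1, 1]])
Y = np.array([[0, 0, 1, 1, 0, 1]]).T

np.random.seed(1)
weights0 = 2 * np.random.random((3, 1)) - 1

for i in range(10000):
    layer1 = sigmoid(np.dot(X, weights0))
    layer1_delta = (layer1 - Y) * sigmoid_to_deriv(layer1)
    weights0 -= np.dot(X.T, layer1_delta)

# the bias input of 1 has to be prepended at prediction time as well
predict([1, 0, 0], weights0)  # the original input [0,0]
predict([1, 1, 1], weights0)  # the original input [1,1]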