How to create a simple 3-layer neural network and teach it using supervised learning?

Based on PyBrain's tutorials I managed to knock together the following code:

#!/usr/bin/env python2
# coding: utf-8

from pybrain.structure import FeedForwardNetwork, LinearLayer, SigmoidLayer, FullConnection
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer

n = FeedForwardNetwork()

inLayer = LinearLayer(2)
hiddenLayer = SigmoidLayer(3)
outLayer = LinearLayer(1)

n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addOutputModule(outLayer)

in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)

n.addConnection(in_to_hidden)
n.addConnection(hidden_to_out)

n.sortModules()

ds = SupervisedDataSet(2, 1)
ds.addSample((0, 0), (0,))
ds.addSample((0, 1), (1,))
ds.addSample((1, 0), (1,))
ds.addSample((1, 1), (0,))

trainer = BackpropTrainer(n, ds)
# trainer.train()
trainer.trainUntilConvergence()

print n.activate([0, 0])[0]
print n.activate([0, 1])[0]
print n.activate([1, 0])[0]
print n.activate([1, 1])[0]

It's supposed to learn the XOR function, but the results seem quite random:

0.208884929522
0.168926515771
0.459452834043
0.424209192223

or

0.84956138664
0.888512762786
0.564964077401
0.611111147862

Luke asked Sep 18 '15 15:09



1 Answer

There are four problems with your approach, all easy to identify after reading the Neural Network FAQ:

  • Why use a bias/threshold?: you should add a bias node. Without a bias, learning is severely limited: the separating hyperplane represented by the network can only pass through the origin. With a bias node, the hyperplane can shift freely and fit the data better:

    from pybrain.structure import BiasUnit

    # Add the bias before the call to n.sortModules()
    bias = BiasUnit()
    n.addModule(bias)
    
    bias_to_hidden = FullConnection(bias, hiddenLayer)
    n.addConnection(bias_to_hidden)
    
  • Why not code binary inputs as 0 and 1?: all your samples lie in a single quadrant of the input space. Recode them so that they are scattered around the origin:

    ds = SupervisedDataSet(2, 1)
    ds.addSample((-1, -1), (0,))
    ds.addSample((-1, 1), (1,))
    ds.addSample((1, -1), (1,))
    ds.addSample((1, 1), (0,))
    

    (Update the n.activate calls at the end of your script to use the same -1/1 inputs.)

  • The trainUntilConvergence method holds out part of the dataset for validation (a quarter of it by default) and does something that resembles the early stopping method. This doesn't make sense for a dataset of only four samples. Use trainEpochs instead; 1000 epochs is more than enough for the network to learn this problem:

    trainer.trainEpochs(1000)
    
  • What learning rate should be used for backprop?: tune the learning rate parameter. This is something you do every time you employ a neural network. In this case, a value of 0.1 or even 0.2 (instead of BackpropTrainer's default of 0.01) dramatically increases the learning speed:

    trainer = BackpropTrainer(n, dataset=ds, learningrate=0.1, verbose=True)
    

    (Note the verbose=True parameter. Observing how the error behaves is essential when tuning parameters.)
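
To make the bias point concrete, here is a minimal sketch in plain Python (not PyBrain; the helper names are made up for illustration) of a 2-3-1 network with no bias. For the input (0, 0), every sigmoid hidden unit receives 0 and outputs 0.5 whatever the input-to-hidden weights are, so the output is stuck at 0.5 times the sum of the output weights:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def activate_without_bias(x, w_hidden, w_out):
    """Forward pass of a 2-3-1 net: sigmoid hidden layer, linear output, no bias."""
    hidden = [sigmoid(row[0] * x[0] + row[1] * x[1]) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))

rng = random.Random(42)
w_out = [rng.uniform(-1, 1) for _ in range(3)]  # fixed hidden-to-output weights

outputs_at_origin = []
for trial in range(5):
    # Draw fresh input-to-hidden weights on every trial.
    w_hidden = [[rng.uniform(-5, 5) for _ in range(2)] for _ in range(3)]
    outputs_at_origin.append(activate_without_bias((0, 0), w_hidden, w_out))

# Every trial yields 0.5 * sum(w_out): the hidden weights cannot move this value.
print("outputs at (0, 0): %s" % outputs_at_origin)
print("0.5 * sum(w_out):  %s" % (0.5 * sum(w_out)))
```

With the original 0/1 coding, the sample (0, 0) sits exactly at the origin, so the hidden layer contributes nothing to fitting it unless a bias is present.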

With these fixes I get consistent and correct results for the given network and dataset, with an error below 1e-23.
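
For readers without PyBrain at hand, the same setup can be roughly sketched in plain Python: a 2-3-1 network with a bias feeding the hidden layer, inputs coded as -1/1, a fixed 1000 epochs and a learning rate of 0.1. This is an illustrative approximation, not PyBrain's implementation; the function names, the weight initialisation range and the seed are assumptions, and the error it reaches may differ from the figure quoted above.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    # Each hidden row holds [bias weight, weight for x1, weight for x2].
    hidden = [sigmoid(row[0] + row[1] * x[0] + row[2] * x[1]) for row in w_hidden]
    output = sum(w * h for w, h in zip(w_out, hidden))  # linear output unit
    return hidden, output

def mean_squared_error(samples, w_hidden, w_out):
    total = sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in samples)
    return total / len(samples)

def train(samples, epochs=1000, learning_rate=0.1, seed=0):
    rng = random.Random(seed)
    # Assumed initialisation: small uniform random weights.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(3)]
    initial = mean_squared_error(samples, w_hidden, w_out)
    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)  # online updates, one sample at a time
        for x, t in order:
            hidden, output = forward(x, w_hidden, w_out)
            err = output - t
            for j in range(3):
                # Backpropagate through the sigmoid before touching w_out[j].
                grad_hidden = err * w_out[j] * hidden[j] * (1.0 - hidden[j])
                w_out[j] -= learning_rate * err * hidden[j]
                w_hidden[j][0] -= learning_rate * grad_hidden
                w_hidden[j][1] -= learning_rate * grad_hidden * x[0]
                w_hidden[j][2] -= learning_rate * grad_hidden * x[1]
    final = mean_squared_error(samples, w_hidden, w_out)
    return w_hidden, w_out, initial, final

xor_samples = [((-1, -1), 0), ((-1, 1), 1), ((1, -1), 1), ((1, 1), 0)]
w_hidden, w_out, initial_error, final_error = train(xor_samples)

print("error before: %g, after: %g" % (initial_error, final_error))
for x, t in xor_samples:
    print("%s -> %.4f (target %d)" % (x, forward(x, w_hidden, w_out)[1], t))
```

Printing the error before and after training plays the same role as verbose=True in BackpropTrainer: it shows whether a given learning rate and epoch count are actually making progress.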

BartoszKP answered Sep 20 '22 13:09