
How to correctly train my Neural Network

I'm trying to teach a neural network to decide where to go based on its life level. The network always receives three inputs [x, y, life]. If life >= 0.2, it should output the angle from [x, y] to (1, 1). If life < 0.2, it should output the angle from [x, y] to (0, 0).

As the inputs and outputs of neurons should be between 0 and 1, I divide the angle by 2 * Math.PI.

Here is the code:

var network = new synaptic.Architect.Perceptron(3,4,1);

for(var i = 0; i < 50000; i++){
  var x = Math.random();
  var y = Math.random();
  var angle1 = angleToPoint(x, y, 0, 0) / (2 * Math.PI);
  var angle2 = angleToPoint(x, y, 1, 1) / (2 * Math.PI);
  for(var j = 0; j < 100; j++){
    network.activate([x,y,j/100]);
    if(j < 20){
      network.propagate(0.3, [angle1]);
    } else {
      network.propagate(0.3, [angle2]);
    }
  }
}

Try it out here: jsfiddle

So when I enter the input [0, 1, 0.19], I expect the neural network to output something close to [0.75] (1.5π / 2π). But my results are completely inconsistent and show no correlation with the input at all.
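The expected value quoted above can be checked by hand. The following is a minimal sketch, assuming the same atan2-based convention as the angleToPoint helper from the fiddle (reproduced here so the snippet runs on its own). The targetFor function is a hypothetical name for the rule described in the question and is not part of the original code.

```javascript
// Same convention as the angleToPoint helper used in the fiddle:
// direction from (x1, y1) towards (x2, y2), mapped into [0, 2*PI).
function angleToPoint(x1, y1, x2, y2) {
  var angle = Math.atan2(y2 - y1, x2 - x1);
  if (angle < 0) {
    angle += 2 * Math.PI;
  }
  return angle;
}

// Hypothetical helper (not in the original code): the target the network
// should learn for an input [x, y, life], normalised to the range [0, 1).
function targetFor(x, y, life) {
  var angle = life >= 0.2
    ? angleToPoint(x, y, 1, 1)   // enough life: head towards (1, 1)
    : angleToPoint(x, y, 0, 0);  // low life: head towards (0, 0)
  return angle / (2 * Math.PI);
}

console.log(targetFor(0, 1, 0.19)); // approximately 0.75
console.log(targetFor(0, 1, 0.20)); // 0
```

Note how a tiny change in life (0.19 to 0.20) moves the target from about 0.75 to 0, so the function the network has to learn has a sharp jump at the life threshold.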

What mistake am I making in teaching my Neural network?

I have managed to teach a neural network to output 1 for input [a, b, c] with c >= 0.2 and 0 for input [a, b, c] with c < 0.2. I have also managed to teach one to output the angle to a certain location based on an [x, y] input. However, I can't seem to combine the two.


As requested, I have written some code that uses two neural networks to get the desired output. The first network converts the life level to a 0 or a 1, and the second network outputs an angle depending on the 0 or 1 it receives from the first network. This is the code:

// This network outputs 1 when life >= 0.2, otherwise 0
var network1 = new synaptic.Architect.Perceptron(3,3,1);
// This network outputs the angle to a certain point based on life
var network2 = new synaptic.Architect.Perceptron(3,3,1);

for (var i = 0; i < 50000; i++){
  var x = Math.random();
  var y = Math.random();
  var angle1 = angleToPoint(x, y, 0, 0) / (2 * Math.PI);
  var angle2 = angleToPoint(x, y, 1, 1) / (2 * Math.PI);

  for(var j = 0; j < 100; j++){
    network1.activate([x,y,j/100]);
    if(j < 20){
      network1.propagate(0.1, [0]);
    } else {
      network1.propagate(0.1, [1]);
    }
    network2.activate([x,y,0]);
    network2.propagate(0.1, [angle1]);
    network2.activate([x,y,1]);
    network2.propagate(0.1, [angle2]);
  }
}

Try it out here: jsfiddle

As you can see in this example, it gets quite close to the desired output, and adding more iterations brings it even closer.

Thomas Wagenaar asked Feb 02 '17 15:02



1 Answer

Observations

  1. Skewed Distribution sampled as Training set

    Your training set chooses the life parameter inside for(var j = 0; j < 100; j++), which is heavily biased towards j >= 20 and consequently life >= 0.2. That subset gets 4 times as much training data (80 of every 100 samples), so training prioritizes it.

  2. Non-shuffled training data

    You are training sequentially over the life parameter, which can be harmful. Your network will end up giving more attention to the larger values of j, since they drive the most recent propagations. You should shuffle your training set to avoid this bias.

    This compounds the previous point, because you're again giving more attention to some subset of life values.

  3. You should measure your training performance as well

    Despite the previous observations, your network was not really that bad: the training error was not nearly as large as the test error. This discrepancy usually means that you're training and testing on different sample distributions.

    You could say that you have two classes of data points: those with life >= 0.2 and those without. But because you introduced a discontinuity in the angleToPoint function, I'd recommend separating them into three classes: keep one class for life < 0.2 (because the function behaves continuously there) and split life >= 0.2 into "above (1,1)" and "below (1,1)".

  4. Network complexity

    You could successfully train a network for each task separately. Now you want to stack them. This is exactly the purpose of deep learning: each layer builds on the concepts learned by the previous layer, increasing the complexity of the concepts the network can learn.

    So instead of using 20 nodes in a single layer, I'd recommend 2 layers of 10 nodes. This matches the class hierarchy I mentioned in the previous point.
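
Observations 1 and 2 can be illustrated without any library. The sketch below first counts how the inner loop from the question splits its samples between the two life ranges, then shows one common way to shuffle a training set in place (a Fisher-Yates shuffle). The helper names countByClass and shuffle are hypothetical and are not part of synaptic.

```javascript
// Count how many samples the question's inner loop produces per life class.
function countByClass() {
  var counts = { low: 0, high: 0 };
  for (var j = 0; j < 100; j++) {
    if (j < 20) {
      counts.low++;   // life < 0.2, target is the angle to (0, 0)
    } else {
      counts.high++;  // life >= 0.2, target is the angle to (1, 1)
    }
  }
  return counts;
}

// Fisher-Yates shuffle: reorders the array in place so that samples are
// no longer presented in increasing order of life.
function shuffle(items) {
  for (var i = items.length - 1; i > 0; i--) {
    var k = Math.floor(Math.random() * (i + 1));
    var tmp = items[i];
    items[i] = items[k];
    items[k] = tmp;
  }
  return items;
}

console.log(countByClass()); // { low: 20, high: 80 }

var lifeValues = [];
for (var j = 0; j < 100; j++) {
  lifeValues.push(j / 100);
}
shuffle(lifeValues);
console.log(lifeValues.slice(0, 5)); // five life values in random order
```

Shuffling fixes only the ordering problem. The 20/80 imbalance has to be addressed separately, for example by drawing the same number of samples from each class.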

The Code

Running this code I had a training/testing error of 0.0004/0.0002.

https://jsfiddle.net/hekqj5jq/11/

var network = new synaptic.Architect.Perceptron(3,10,10,1);
var trainer = new synaptic.Trainer(network);
var trainingSet = [];

for(var i = 0; i < 50000; i++){
  // 1st category: above vector (1,1), measure against (1,1)
  var x = getRandom(0.0, 1.0);
  var y = getRandom(x, 1.0);
  var z = getRandom(0.2, 1);
  var angle = angleToPoint(x, y, 1, 1) / (2 * Math.PI);
  trainingSet.push({input: [x,y,z], output: [angle]});
  // 2nd category: below vector (1,1), measure against (1,1)
  var x = getRandom(0.0, 1.0);
  var y = getRandom(0.0, x);
  var z = getRandom(0.2, 1);
  var angle = angleToPoint(x, y, 1, 1) / (2 * Math.PI);
  trainingSet.push({input: [x,y,z], output: [angle]});
  // 3rd category: above/below vector (1,1), measure against (0,0)
  var x = getRandom(0.0, 1.0);
  var y = getRandom(0.0, 1.0);
  var z = getRandom(0.0, 0.2);
  var angle = angleToPoint(x, y, 0, 0) / (2 * Math.PI);
  trainingSet.push({input: [x,y,z], output: [angle]});
}

trainer.train(trainingSet, {
    rate: 0.1,
    error: 0.0001,
    iterations: 50,
    shuffle: true,
    log: 1,
    cost: synaptic.Trainer.cost.MSE
});

var testSet = [
    {input: [0,1,0.25], output: [angleToPoint(0, 1, 1, 1) / (2 * Math.PI)]},
    {input: [1,0,0.35], output: [angleToPoint(1, 0, 1, 1) / (2 * Math.PI)]},
    {input: [0,1,0.10], output: [angleToPoint(0, 1, 0, 0) / (2 * Math.PI)]},
    {input: [1,0,0.15], output: [angleToPoint(1, 0, 0, 0) / (2 * Math.PI)]}
];

$('html').append('<p>Train:</p> ' + JSON.stringify(trainer.test(trainingSet)));
$('html').append('<p>Tests:</p> ' + JSON.stringify(trainer.test(testSet)));

$('html').append('<p>1st:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(0, 1, 1, 1) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([0, 1, 0.25]));

$('html').append('<p>2nd:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(1, 0, 1, 1) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([1, 0, 0.25]));

$('html').append('<p>3rd:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(0, 1, 0, 0) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([0, 1, 0.15]));

$('html').append('<p>4th:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(1, 0, 0, 0) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([1, 0, 0.15]));

function angleToPoint(x1, y1, x2, y2){
  var angle = Math.atan2(y2 - y1, x2 - x1);
  if(angle < 0){
    angle += 2 * Math.PI;
  }
  return angle;
}

function getRandom (min, max) {
    return Math.random() * (max - min) + min;
}

Further Remarks

As I mentioned in the comments and in the chat, there's no such thing as an "angle between (x,y) and (0,0)", because the angle between two vectors is usually defined as the difference between their directions, and (0,0) has no direction.

Your function angleToPoint(x1, y1, x2, y2) instead returns the direction of (p2 - p1), the vector pointing from p1 to p2. For p2 = (0,0) that is the direction of -p1, which is the angle between p1 and the x axis offset by 180 degrees. But for p1=(1,1) and p2=(1,0) it will not return 45 degrees. And for p1=p2 the direction is undefined rather than zero.
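
To make the distinction concrete, here is a small sketch comparing what angleToPoint computes with the angle between two vectors. angleBetween and toDegrees are hypothetical helpers (not in the original code); angleBetween uses the atan2 of the cross and dot products, which is one common way to compute the unsigned angle between 2D vectors.

```javascript
// The helper from the code above, reproduced so the snippet is self-contained.
function angleToPoint(x1, y1, x2, y2) {
  var angle = Math.atan2(y2 - y1, x2 - x1);
  if (angle < 0) {
    angle += 2 * Math.PI;
  }
  return angle;
}

// Hypothetical helper: unsigned angle between the vectors (ax, ay) and
// (bx, by), in radians, in the range [0, PI].
function angleBetween(ax, ay, bx, by) {
  var cross = ax * by - ay * bx;
  var dot = ax * bx + ay * by;
  return Math.abs(Math.atan2(cross, dot));
}

// Hypothetical helper: radians to degrees.
function toDegrees(radians) {
  return radians * 180 / Math.PI;
}

console.log(toDegrees(angleBetween(1, 1, 1, 0))); // approximately 45
console.log(toDegrees(angleToPoint(1, 1, 1, 0))); // approximately 270
console.log(angleToPoint(1, 1, 1, 1));            // 0
```

The last line shows that JavaScript's Math.atan2(0, 0) returns 0, so the code yields a number even where the direction is not mathematically defined.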

villasv answered Oct 17 '22 22:10