Is "The more training data the better" true for Neural Networks?

I'm programming a function-approximation neural network that is trying to approximate a very complicated function.

For the training data I generated 1000 random numbers between two limits, passed these numbers through a function f(x), and recorded the outputs.

My neural network aims to approximate the inverse of this function. So, I will use the output of the function as the input training data, and the 1000 random numbers as the output training data.
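The setup described above can be sketched as follows. The actual f(x) is not given in the question, so the steep exponential below is a hypothetical stand-in chosen only to reproduce the "outputs cluster near 0" behaviour:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the question's complicated f(x):
# a steep function whose outputs pile up near 0.
def f(x):
    return np.exp(-10.0 * x)

# 1000 random inputs between two limits (here assumed to be 0 and 1).
x = rng.uniform(0.0, 1.0, size=1000)
y = f(x)

# To learn the inverse, swap the roles: the function's outputs
# become the network's inputs, and x becomes the target.
train_inputs, train_targets = y, x
```

With this stand-in, most of `train_inputs` falls in a narrow band near zero, which is exactly the skew the question is about.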

The problem is that when a random number is put into the function f(x), the output is very likely to be between 0 and 0.01, and only very rarely falls outside this range. Below is a number line with the 1000 outputs of the function plotted on top of it. As you can see, the examples do not uniformly cover the full range of possible numbers.

[Figure: distribution of the 1000 training examples on a number line]

To combat this I used a lot of training examples in the hope that more of them would land in the 0.1 to 0.9 range, but this means using a ridiculous number of examples.
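Rather than brute-forcing more samples, one common alternative (not mentioned in the question, so an assumption here) is to rebalance the set you already have by capping how many examples each region of output space may contribute. A minimal sketch, again using a hypothetical exponential in place of the real f(x):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for f(x), as before.
def f(x):
    return np.exp(-10.0 * x)

# Oversample, then thin out the over-represented region.
x = rng.uniform(0.0, 1.0, size=100_000)
y = f(x)

# Bin the outputs and keep at most `cap` examples per bin, so the
# dense 0-0.01 cluster no longer drowns out the rare regions.
bins = np.linspace(y.min(), y.max(), 51)
idx = np.digitize(y, bins)
cap = 100
keep = np.concatenate([
    np.flatnonzero(idx == b)[:cap] for b in np.unique(idx)
])
x_bal, y_bal = x[keep], y[keep]
```

The balanced set is far smaller than the raw sample while covering the output range much more evenly; the cost is that the network sees a training distribution different from the one it will encounter at test time.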

So for functions like this, is it simply better to use more examples, or are there problems that arise from using a huge number of them?

asked Dec 18 '25 by Blue7


1 Answer

Is it possible to try fitting the logarithm, or some logarithm-based transform, of f(x)? It may distribute your outputs more uniformly.

answered Dec 21 '25 by lennon310