
Neural Network: Handling unavailable inputs (missing or incomplete data) [closed]

Hopefully the last NN question you'll get from me this weekend, but here goes :)

Is there a way to handle an input that you "don't always know"... so it doesn't affect the weightings somehow?

Soo... if I ask someone if they are male or female and they would not like to answer, is there a way to disregard this input? Perhaps by placing it squarely in the centre? (assuming 1,0 inputs at 0.5?)

Thanks

asked Apr 08 '10 by Micheal


2 Answers

Neural networks are fairly resistant to noise - that's one of their big advantages. You may want to try encoding inputs on (-1.0, 1.0) instead, with 0 as the "no answer" value. That way the signal flowing from that input through its weights is 0.0, meaning that no learning will occur on those weights.
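To make that concrete, here's a minimal sketch (names and weights are made up for illustration): with a 0 input, both the forward contribution `w * x` and the weight gradient `delta * x` vanish, so a "no answer" example leaves that input's weights alone.

```python
import numpy as np

def encode_gender(answer):
    """Encode a two-valued categorical input on (-1, +1), with 0 for 'no answer'.

    A 0 input contributes nothing to the weighted sum, and since the
    gradient w.r.t. a weight is (error term * input), it is also 0, so
    training examples with this value missing don't move those weights.
    """
    return {"male": -1.0, "female": 1.0}.get(answer, 0.0)

w = np.array([0.7, -0.3])                       # weights for [gender, age]
x_known   = np.array([encode_gender("male"), 0.5])
x_missing = np.array([encode_gender(None), 0.5])

print(w @ x_known)    # gender contributes -0.7, age -0.15 -> -0.85
print(w @ x_missing)  # only age contributes -> -0.15
```

Compare this with the 0.5 midpoint idea from the question: on a (0, 1) scale, 0.5 still feeds a nonzero signal into every downstream weight, so the network keeps learning from a value that was never really observed.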

Probably the best book I've ever had the misfortune of not finishing (yet!) is Neural Networks and Learning Machines by Simon S. Haykin. In it, he talks about all kinds of issues, including the way you should distribute your inputs/training set for the best training, etc. It's a really great book!

answered Dec 04 '22 by Daniel G


You probably know or suspect this already, but there's no statistical basis for guessing the missing values, e.g., by filling in the average over the range of possible values.

For NNs in particular, there are quite a few techniques available. The technique I use--and have coded--is one of the simpler ones, but it has a solid statistical basis and it's still used today. The academic paper that describes it is here.

The theory that underlies this technique is weighted integration over the incomplete data. In practice, no integrals are evaluated; instead they are approximated by closed-form solutions of Gaussian basis function networks. As you'll see in the paper (which is a step-by-step explanation), it's simple to implement in your backprop algorithm.
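A rough sketch of why the closed form exists, assuming a toy normalized Gaussian-basis-function network with axis-aligned (diagonal-covariance) Gaussians: integrating such a Gaussian over a missing input dimension just drops that dimension from the product, so the expectation of the output given only the observed inputs is computable without any numerical integration. All names and numbers below are made up for illustration; see the paper for the actual derivation and the backprop details.

```python
import numpy as np

centers = np.array([[0.0, 0.0], [1.0, 1.0]])   # basis centers (2 units, 2 inputs)
sigma   = 0.5                                   # shared width
w       = np.array([-1.0, 1.0])                 # output weights

def gaussian(x, c, dims):
    """Product of 1-D Gaussians over the listed (observed) dimensions only.

    Because the covariance is diagonal, marginalizing out a missing
    dimension is equivalent to omitting its factor from this product.
    """
    d = (x[dims] - c[dims]) / sigma
    return np.exp(-0.5 * np.dot(d, d))

def predict(x, observed_dims):
    """E[y | observed inputs] for y(x) = sum_i w_i g_i(x) / sum_i g_i(x)."""
    g = np.array([gaussian(x, c, observed_dims) for c in centers])
    return float(w @ g / g.sum())

x = np.array([0.9, np.nan])        # second input unavailable
print(predict(x, [0]))             # expectation with dim 1 integrated out
```

The prediction leans toward `w[1]` here because the one observed input sits near the second basis center; no value was ever invented for the missing dimension.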

answered Dec 04 '22 by doug