I have a multilayer perceptron with a sigmoid loss (tf.nn.sigmoid_cross_entropy_with_logits) and an Adam optimizer (tf.train.AdamOptimizer). My input data has several features, and some of the feature values are nan. When I replace the nan values with 0 I get a result, but when I do not replace them I get loss=nan.
What is the best way to handle nan values in TensorFlow, and how can I use my input data with nan values without replacing them with 0?
Most of the machine learning models you will want to use raise an error (or, as here, produce a nan loss) if you pass NaN values into them. The easiest fix is to fill the NaNs with 0, but this can reduce your model's accuracy significantly.
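A minimal sketch of that zero-filling step, assuming the features arrive as a NumPy array (the array X here is made up for illustration):

```python
import numpy as np

# Hypothetical feature matrix containing nan entries
X = np.array([[0.5, np.nan, 1.2],
              [np.nan, 0.3, 0.7]], dtype=np.float32)

# Replace every nan with 0.0 before feeding the data to the network
X_filled = np.nan_to_num(X)
```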
Pandas treats None and NaN as essentially interchangeable for indicating missing or null values. To support this convention, a Pandas DataFrame provides several useful functions for detecting, removing, and replacing null values, such as isnull() and notnull().
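For example, on a small made-up DataFrame:

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame mixing None and np.nan to mark missing values
df = pd.DataFrame({"a": [1.0, None, 3.0],
                   "b": [np.nan, 5.0, 6.0]})

print(df.isnull())   # True where a value is missing (None or NaN)
print(df.notnull())  # True where a real value is present
```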
There are two primary ways of handling missing values: deleting the missing values, or imputing them (see the sketch below).
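A sketch of both options on the same hypothetical DataFrame; using each column's mean as the imputation value is just one possible choice:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, None, 3.0],
                   "b": [np.nan, 5.0, 6.0]})

# Option 1: delete rows that contain any missing value
dropped = df.dropna()

# Option 2: impute, here with each column's mean
imputed = df.fillna(df.mean())
```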
How can I tell my network to ignore some input data, for example when the input data is nan?
This is very similar to adding a mask to your input data. You want your input data to pass through with the nans turned into zeros, but you also want to signal to the neural network where the nans were, so that it can ignore those entries and pay attention to everything else.
In this question about adding a mask I review how a mask can successfully be added to an image, and I also give a code demonstration for a non-image problem.
The code in that masking question shows that the neural net learns well when the mask is added and poorly when it is not.
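A minimal sketch of that idea, assuming the features come in as a NumPy array X (the array and the concatenation layout are my own illustration, not the code from the linked question): zero-fill the nans and append a binary mask so the network can see which values were real.

```python
import numpy as np

# Hypothetical raw features with nan entries
X = np.array([[0.5, np.nan, 1.2],
              [np.nan, 0.3, 0.7]], dtype=np.float32)

# Binary mask: 1.0 where a value was present, 0.0 where it was nan
mask = (~np.isnan(X)).astype(np.float32)

# Zero-fill so the network never sees a nan (which would make the loss nan)
X_filled = np.nan_to_num(X)

# Feed features and mask together; the input layer now has twice as many
# columns, and the mask tells the network which entries to trust
X_with_mask = np.concatenate([X_filled, mask], axis=1)
```

With this layout, the network's first layer simply takes twice as many input columns; the rest of the training setup (sigmoid cross-entropy loss and Adam optimizer) stays the same.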