Tensorflow classification with extremely unbalanced dataset

I am using TensorFlow's LinearClassifier, and also a DNN, to classify a two-class dataset.

However, the dataset is 96% Positive and 4% Negative, and my program always returns Positive as its prediction. Of course, in this case I achieve 96% accuracy, but that number is meaningless.

What is a good way to deal with this kind of situation?

mamatv asked Dec 28 '15 21:12



1 Answer

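You could try changing the cost function so that a false positive is penalized more heavily than a false negative. A model that always predicts Positive only ever makes false positives (the 4% of Negative examples it mislabels), so raising the cost of those errors makes the "always Positive" shortcut expensive during training.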
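As a rough illustration of that idea, here is a minimal sketch of a class-weighted log loss in plain Python. The function name `weighted_log_loss`, the toy data, and the inverse-frequency weights are all assumptions made up for this example; they are one common choice, not the only one, and this is not TensorFlow's own API.

```python
import math

def weighted_log_loss(y_true, p_pred, w_pos=1.0, w_neg=1.0, eps=1e-12):
    """Mean binary cross-entropy with a separate weight per class.

    y_true: list of 0/1 labels (1 = Positive, 0 = Negative)
    p_pred: list of predicted probabilities of the Positive class
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            total += -w_pos * math.log(p)        # cost of missing a Positive
        else:
            total += -w_neg * math.log(1.0 - p)  # cost of a false positive
    return total / len(y_true)

# Toy data mirroring the question: 96 Positive, 4 Negative examples.
labels = [1] * 96 + [0] * 4

# A model that confidently says "Positive" for everything.
always_positive = [0.99] * 100

# Inverse-frequency weights (an assumption): each class contributes
# equally to the loss regardless of how many examples it has.
w_pos = len(labels) / (2 * labels.count(1))
w_neg = len(labels) / (2 * labels.count(0))

plain = weighted_log_loss(labels, always_positive)
weighted = weighted_log_loss(labels, always_positive, w_pos, w_neg)

print("unweighted loss:", plain)
print("weighted loss:  ", weighted)
```

With the weights applied, the four misclassified Negative examples dominate the loss, so the optimizer can no longer get a low cost by predicting Positive everywhere. In TensorFlow itself, `tf.nn.weighted_cross_entropy_with_logits` offers a similar knob through its `pos_weight` argument; since that argument scales the class labeled 1, you would likely label the rare class as 1 to use it this way.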

kkawabat answered Nov 15 '22 09:11
