
How to incorporate uncertainty of features into machine learning algorithms?

I am using decision trees from scikit-learn to do regression on a data set. I am getting very good results, but one thing concerns me: the relative uncertainty on many of the features is very high.
I have tried just dropping the cases with high uncertainty, but that reduces the performance of the model significantly.

The features themselves are experimentally determined, so they have associated experimental uncertainty. The data itself is not noisy.

So my question is: is there a good way to incorporate the uncertainty associated with the features into machine learning algorithms?

Thanks for all the help!

Nuke_scientist asked Oct 29 '22 04:10


1 Answer

If including the uncertain features improves the model, that suggests that, taken together, they are useful. Some of them individually may not be, however. My suggestion would be to get rid of the features that don't improve the model. You could use a greedy feature elimination algorithm such as recursive feature elimination (RFE):

http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.RFE.html

RFE begins by training a model on all the features and then discards the feature deemed least useful. It then trains the model again with one fewer feature, and repeats until the requested number of features remains.
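As a rough sketch of what that could look like with a decision tree regressor: the data below is synthetic (generated with `make_regression`), and keeping 4 features is an arbitrary choice for illustration, not something taken from the question. Note that RFE ranks features by the tree's `feature_importances_`, not by their measurement uncertainty, so this only tells you which features the model relies on.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the real data (assumption): 10 features, 4 informative.
X, y = make_regression(
    n_samples=200, n_features=10, n_informative=4, noise=0.1, random_state=0
)

# RFE refits the estimator repeatedly, dropping `step` features per round
# (the least important ones according to feature_importances_).
estimator = DecisionTreeRegressor(random_state=0)
selector = RFE(estimator, n_features_to_select=4, step=1)
selector.fit(X, y)

# support_ is a boolean mask of the kept features; ranking_ is 1 for kept
# features and increases for features that were eliminated earlier.
kept = np.flatnonzero(selector.support_)
print("kept feature indices:", kept)
print("ranking (1 = kept):", selector.ranking_)

# Reduce the data to the selected columns for further modelling.
X_reduced = selector.transform(X)
```

If you don't know how many features to keep, `RFECV` in the same module picks that number by cross-validation.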

Hope that helps

Daniel Wyatt answered Nov 15 '22 07:11