
Sklearn: KNeighborsRegressor vs KNeighborsClassifier

What is the difference between the KNeighborsRegressor and the KNeighborsClassifier of the sklearn library?

I'm trying to use the kNN algorithm to make predictions on a dataset that has the names of certain emotions (like happy, sad, angry) as possible classes. The attributes are numerical pixel values. I've learned that the classes are of the categorical type. I'm using sklearn for the first time and can't decide between the KNeighborsRegressor and the KNeighborsClassifier. Is there that much of a difference in my case? In which situations would one use each of these?

AlexT asked Jan 27 '23 14:01


1 Answer

KNeighborsRegressor and KNeighborsClassifier are closely related. Both retrieve some k neighbors of query objects, and make predictions based on these neighbors. Assume the five nearest neighbors of a query x contain the labels [2, 0, 0, 0, 1]. Let's encode the emotions as happy=0, angry=1, sad=2.

The KNeighborsClassifier essentially performs a majority vote. The prediction for the query x is 0, which means 'happy'. So this is the way to go here.
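The majority vote can be sketched with made-up 1-D features (your real data would use the pixel values), reproducing the [2, 0, 0, 0, 1] neighborhood above:

```python
# Minimal sketch: five training points close to the query, so all five
# become its nearest neighbors. Labels: happy=0, angry=1, sad=2.
from sklearn.neighbors import KNeighborsClassifier

X = [[0.0], [0.1], [0.2], [0.3], [0.4]]  # toy 1-D features (assumption)
y = [2, 0, 0, 0, 1]                      # neighbor labels from the example

clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X, y)
print(clf.predict([[0.2]]))  # -> [0], the majority label ('happy')
```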

The KNeighborsRegressor instead computes the mean of the nearest-neighbor labels. The prediction for the query x would then be (2 + 0 + 0 + 0 + 1) / 5 = 0.6, which does not map to any emotion we defined. The reason is that the emotion variable is indeed categorical, as stated in the question. If the emotions were encoded as a continuous variable, you could use the regressor: say the values lie in the interval [0.0, 2.0], where 0.0 means really happy and 2.0 means really sad; then 0.6 holds a meaning (happy-ish).

By the way, since you mention logistic regression in the tags, don't be confused by the name: it is actually a classification method, as described in the scikit-learn user guide.

rvf answered Feb 02 '23 16:02