Predict with sklearn-KNN using median (instead of mean)

1 Answer

There is no built-in parameter to adjust the weighting to use the median rather than the mean (you can see in the source that the mean is hard-coded). But because scikit-learn estimators are just Python classes, you can subclass KNeighborsRegressor and override the predict method to do whatever you want.

Here's a quick example, where I've copied and pasted the original predict() method and modified the relevant piece:

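import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.utils import check_array

# _get_weights is a private helper, so its location depends on the scikit-learn version
try:
    from sklearn.neighbors._base import _get_weights  # newer releases
except ImportError:
    from sklearn.neighbors.base import _get_weights  # older releases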

class MedianKNNRegressor(KNeighborsRegressor):
    def predict(self, X):
        X = check_array(X, accept_sparse='csr')

        neigh_dist, neigh_ind = self.kneighbors(X)

        weights = _get_weights(neigh_dist, self.weights)

        _y = self._y
        if _y.ndim == 1:
            _y = _y.reshape((-1, 1))

        ######## Begin modification
        if weights is None:
            y_pred = np.median(_y[neigh_ind], axis=1)
        else:
            # y_pred = weighted_median(_y[neigh_ind], weights, axis=1)
            raise NotImplementedError("weighted median")
        ######### End modification

        if self._y.ndim == 1:
            y_pred = y_pred.ravel()

        return y_pred

X = np.random.rand(100, 1)
y = 20 * X.ravel() + np.random.rand(100)
clf = MedianKNNRegressor().fit(X, y)
print(clf.predict(X[:5]))
# Example output (varies from run to run, since the data is random):
# [  2.38172861  13.3871126    9.6737255    2.77561858  17.07392584]

I've left out the weighted version, because I don't know of a simple way to compute a weighted median with numpy/scipy, but it would be straightforward to add in once that function is available.
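
For what it's worth, one possible way to fill that gap is a "lower" weighted median: sort each row of neighbor targets, accumulate the correspondingly sorted weights, and take the first value whose cumulative weight reaches half of the row total. The sketch below is only an illustration under that definition, not something provided by numpy, scipy, or scikit-learn. The name weighted_median mirrors the commented-out call above, but the signature (no axis argument, reduction always over axis 1) is my own assumption, as is the requirement that weights be non-negative with a positive sum per row. Note that with equal weights and an even number of neighbors it returns the lower of the two middle values rather than their average, so it will not match np.median exactly in that case.

```python
import numpy as np


def weighted_median(values, weights):
    """Lower weighted median along axis 1 (illustrative sketch).

    values:  shape (n_queries, k) or (n_queries, k, n_outputs)
    weights: shape (n_queries, k), non-negative, positive sum in every row
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if values.ndim == 3:
        # Reuse the same neighbor weights for every output column
        weights = np.broadcast_to(weights[:, :, None], values.shape)

    # Sort the values along the neighbor axis and reorder the weights to match
    order = np.argsort(values, axis=1)
    sorted_values = np.take_along_axis(values, order, axis=1)
    sorted_weights = np.take_along_axis(weights, order, axis=1)

    # First position where the cumulative weight reaches half of the total
    cum_weights = np.cumsum(sorted_weights, axis=1)
    half = 0.5 * cum_weights[:, -1:]
    idx = np.argmax(cum_weights >= half, axis=1)

    picked = np.take_along_axis(sorted_values, np.expand_dims(idx, 1), axis=1)
    return picked.squeeze(1)


# The neighbor with target 5 carries most of the weight, so it wins
print(weighted_median([[10, 1, 5]], [[0.1, 0.1, 0.8]]))  # [5.]
```

With a helper like this, the else branch in the predict method above could call weighted_median(_y[neigh_ind], weights) instead of raising NotImplementedError, since _y[neigh_ind] has shape (n_queries, k, n_outputs) at that point.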


answered Sep 30 '22 19:09

jakevdp


Tags:

python

scikit-learn

knn

Eugene Yan
