Custom sklearn Regressor: Cannot clone object... as the constructor does not seem to set parameter

Question

I'm trying to implement my own kernel regression that is compatible with the sklearn library. My implementation is the following:

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, TransformerMixin, RegressorMixin
from sklearn.utils.validation import check_X_y, check_array, check_is_fitted
from sklearn.utils.multiclass import unique_labels
from sklearn.metrics import euclidean_distances
import models.kernel as ker
        
        
class MyKerReg(BaseEstimator, RegressorMixin):
    
    def __init__ (self, kernel = "gaussian", bandwidth = 1.0):
        self.kernel = ker.kernel(kernel)
        self.bandwidth = bandwidth
  
        
    def fit(self, X, y):
        
        X, y = check_X_y(X, y, accept_sparse=True, ensure_2d=False)
        self.is_fitted_ = True
        self.X_ = X
        self.y_ = y
        
        return self
        
    def predict(self, X):
        
        X = check_array(X, accept_sparse=True, ensure_2d=False)
        check_is_fitted(self, 'is_fitted_')
        
        pred = []
        for x in X:
            tmp = [x - v for v in self.X_]
            ker_values = [(1/self.bandwidth)*self.kernel(v/self.bandwidth) for v in tmp]
            
            ker_values = np.array(ker_values)
            values = np.array(self.y_)
            
            num = np.dot(ker_values.T, values)
            denom = np.sum(ker_values)
            
            pred.append(num/denom)
        return pred

Calling predict on its own works fine. But when I use this object in cross_val_score like this ...


    y, x = misc.data_generating_process(1000)
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.2, random_state = 44)
    
    kr = ker_reg.MyKerReg(kernel = "gaussian", bandwidth = 0.5)
    
    print(cross_val_score(kr, x_train, y_train, scoring="neg_mean_squared_error", cv=5))

... I get the following error:

Exception has occurred: RuntimeError
Cannot clone object MyKerReg(bandwidth=0.5, kernel=<models.kernel.kernel object at 0x7fab359bc940>), as the constructor either does not set or modifies parameter kernel

During handling of the above exception, another exception occurred:

  File "/home/dragos/Projects/ML_Homework/kernel_regression/main.py", line 24, in main
    print(cross_val_score(kr, x_train, y_train, scoring="neg_mean_squared_error", cv=5))
  File "/home/dragos/Projects/ML_Homework/kernel_regression/main.py", line 85, in <module>
    main()

Does anyone have any idea how to fix this? I know there is a similar thread on this topic, but I still can't figure it out. Thank you all.

I've already read the documentation and articles on the topic, and it seems like I'm doing everything right.

Ben Reiniger · Accepted Answer

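The __init__ method should set its parameters as attributes, with no name changes or validation. In your example, self.kernel = ker.kernel(kernel) is to blame: clone rebuilds the estimator from get_params() and then checks that each constructor argument comes back unchanged, which fails once kernel has been replaced by a ker.kernel object. You can probably move that into the beginning of fit instead: leave just self.kernel = kernel in __init__, set self.kernel_ = ker.kernel(self.kernel) in fit, and have predict call self.kernel_.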

From the developer guide:

every keyword argument accepted by __init__ should correspond to an attribute on the instance. Scikit-learn relies on this to find the relevant attributes to set on an estimator when doing model selection.

[...]

There should be no logic, not even input validation, and the parameters should not be changed. The corresponding logic should be put where the parameters are used, typically in fit.
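As a rough illustration of that advice, here is a minimal sketch of the estimator with the kernel lookup moved into fit. The models.kernel module from the question isn't available here, so gaussian_kernel and the KERNELS dict are hypothetical stand-ins for ker.kernel, and the sketch assumes X is 2-D (n_samples, n_features) rather than the 1-D input the question allows.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.utils.validation import check_X_y, check_array, check_is_fitted


def gaussian_kernel(u):
    # Hypothetical stand-in for ker.kernel("gaussian"); u has shape (..., n_features).
    return np.exp(-0.5 * np.sum(u ** 2, axis=-1)) / np.sqrt(2 * np.pi)


# Hypothetical name-to-function lookup used in place of models.kernel.
KERNELS = {"gaussian": gaussian_kernel}


class MyKerReg(RegressorMixin, BaseEstimator):

    def __init__(self, kernel="gaussian", bandwidth=1.0):
        # Store the arguments untouched so get_params() and clone() round-trip.
        self.kernel = kernel
        self.bandwidth = bandwidth

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        # Resolve the kernel here; the trailing underscore marks a fitted attribute.
        self.kernel_ = KERNELS[self.kernel]
        self.X_ = X
        self.y_ = y
        return self

    def predict(self, X):
        check_is_fitted(self, "kernel_")
        X = check_array(X)
        # Scaled pairwise differences, shape (n_queries, n_train, n_features).
        diffs = (X[:, None, :] - self.X_[None, :, :]) / self.bandwidth
        weights = self.kernel_(diffs) / self.bandwidth
        # Kernel-weighted average of the training targets for each query point.
        return weights @ self.y_ / weights.sum(axis=1)


# Cloning now succeeds because __init__ leaves both parameters as passed.
kr_copy = clone(MyKerReg(kernel="gaussian", bandwidth=0.5))
```

With the parameters stored as given, cross_val_score can clone the estimator for each fold. The vectorized predict is just one way to write the loop from the question; it returns a NumPy array instead of a list.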

Tags:

python

scikit-learn

Dragos Tanasa
