I am trying to create a custom estimator based on scikit-learn. I have written the dummy code below to explain my problem. In the score method, I am trying to access mean_, which is calculated in fit, but I am unable to. What am I doing wrong? I have tried many things, referring to three or four articles, but couldn't find the issue.
I have read the documentation and made a few changes, but nothing worked. I have also tried inheriting BaseEstimator and ClassifierMixin, but that also didn't work.
This is a dummy program; don't go by what it is trying to do.
import numpy as np
from sklearn.model_selection import cross_val_score

class FilterElems:
    def __init__(self, thres):
        self.thres = thres

    def fit(self, X, y=None, **kwargs):
        self.mean_ = np.mean(X)
        self.std_ = np.std(X)
        return self

    def predict(self, X):
        # return sign(self.predict(inputs))
        X = (X - self.mean_) / self.std_
        return X[X > self.thres]

    def get_params(self, deep=False):
        return {'thres': self.thres}

    def score(self, *x):
        print(self.mean_)  # errors out, mean_ and std_ are wiped out
        if len(x[1]) > 50:
            return 1.0
        else:
            return 0.5

model = FilterElems(thres=0.5)
print(cross_val_score(model,
                      np.random.randint(1, 1000, (100, 100)),
                      None,
                      scoring=model.score,
                      cv=5))
Error:
AttributeError: 'FilterElems' object has no attribute 'mean_'
The fit method trains the algorithm on the training data after the model is initialized; that's really all it does. So sklearn's fit method uses the training data as input to train the machine learning model.
fit takes two parameters: X, your data samples, where each row is one datapoint (a sample, an N-dimensional feature vector), and y, the datapoint labels, one per datapoint.
fit adjusts the model's weights according to the data values so that better accuracy can be achieved. After training, the model can be used for predictions, using .predict().
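For example, the same fit/predict pattern with a stock estimator (a minimal sketch using scikit-learn's LinearRegression; the data here is made up just to show the pattern):

import numpy as np
from sklearn.linear_model import LinearRegression

# three samples (rows), one feature each, with one label per sample
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])

model = LinearRegression()
model.fit(X, y)                # learn the coefficients from the training data
print(model.predict([[4.0]]))  # predict with the fitted model -> ~[8.]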
You are almost there.
The signature for a scorer is scorer(estimator, X, y). cross_val_score calls the scorer by passing the fitted estimator object as the first parameter. Since your scorer is a variable-argument function, the first item in x holds the estimator, not X.
Change your score to:
def score(self, *x):
    print(x[0].mean_)
    if len(x[1]) > 50:
        return 1.0
    else:
        return 0.5
Working code
import numpy as np
from sklearn.model_selection import cross_val_score

class FilterElems:
    def __init__(self, thres):
        self.thres = thres

    def fit(self, X, y=None, **kwargs):
        # learned attributes are stored on the fitted clone
        self.mean_ = np.mean(X)
        self.std_ = np.std(X)
        return self

    def predict(self, X):
        X = (X - self.mean_) / self.std_
        return X[X > self.thres]

    def get_params(self, deep=False):
        return {'thres': self.thres}

    def score(self, estimator, *x):
        # cross_val_score passes the fitted estimator first, then X (and y)
        print(estimator.mean_, estimator.std_)
        if len(x[0]) > 50:
            return 1.0
        else:
            return 0.5

model = FilterElems(thres=0.5)
print(cross_val_score(model,
                      np.random.randint(1, 1000, (100, 100)),
                      None,
                      scoring=model.score,
                      cv=5))
Output
504.750125 288.84916035447355
501.7295 289.47825925231416
503.743375 288.8964170227962
503.0325 287.8292687406025
500.041 289.3488678377712
[0.5 0.5 0.5 0.5 0.5]
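As a side note, you can avoid the unusual scorer signature altogether by keeping the conventional score(self, X, y=None) method and omitting the scoring argument; I believe cross_val_score then falls back to calling the fitted estimator's own score method on each fold, but double-check this against your scikit-learn version. A sketch of the same dummy example:

import numpy as np
from sklearn.model_selection import cross_val_score

class FilterElems:
    def __init__(self, thres):
        self.thres = thres

    def fit(self, X, y=None, **kwargs):
        self.mean_ = np.mean(X)
        self.std_ = np.std(X)
        return self

    def get_params(self, deep=False):
        return {'thres': self.thres}

    def score(self, X, y=None):
        # called on the fitted clone, so mean_ and std_ are available here
        print(self.mean_, self.std_)
        return 1.0 if len(X) > 50 else 0.5

print(cross_val_score(FilterElems(thres=0.5),
                      np.random.randint(1, 1000, (100, 100)),
                      cv=5))  # no scoring= -> uses FilterElems.score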