How to implement different scoring functions in LogisticRegressionCV in scikit-learn?

I'm trying to use the LogisticRegressionCV class from scikit-learn 0.16 and am having difficulty getting it to work with different scoring functions. The docs say to pass in one of the scoring functions from sklearn.metrics, so I've tried the following code:

from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import log_loss

...

model_regression = LogisticRegressionCV(scoring=log_loss)
model_regression.fit(data_combined, winners_losers)

However, I get the following error when calling fit:

  File "C:\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py", line 1381, in fit
    for label in iter_labels
  File "C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 659, in __call__
    self.dispatch(function, args, kwargs)
  File "C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 406, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 140, in __init__
    self.results = func(*args, **kwargs)
  File "C:\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py", line 844, in _log_reg_scoring_path
    scores.append(scoring(log_reg, X_test, y_test))
  File "C:\Anaconda3\lib\site-packages\sklearn\metrics\classification.py", line 1403, in log_loss
    T = lb.fit_transform(y_true)
  File "C:\Anaconda3\lib\site-packages\sklearn\base.py", line 433, in fit_transform
    return self.fit(X, **fit_params).transform(X)
  File "C:\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py", line 315, in fit
    self.y_type_ = type_of_target(y)
  File "C:\Anaconda3\lib\site-packages\sklearn\utils\multiclass.py", line 287, in type_of_target
    'got %r' % y)
ValueError: Expected array-like (array or non-string sequence), got LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr',
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0)

What am I doing wrong here? Without the scoring=log_loss parameter the call works fine, so it must have something to do with how I'm passing the function.

BitOfABeginner asked Mar 21 '26 08:03


2 Answers

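It should be scoring="neg_log_loss", a string, not a function (in scikit-learn releases before 0.18 the string was "log_loss"). If you want to pass a callable instead, it needs a different interface from the metric functions in sklearn.metrics. See the docs. A scoring callable takes three arguments: the fitted estimator, the data to score (X) and the known true targets (y). Passing log_loss directly fails because it expects (y_true, y_pred), so the estimator ends up where y_true should be, which is exactly what the ValueError reports.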
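As a rough illustration of that three-argument interface, here is a minimal sketch of a hand-written scorer. The name neg_log_loss_scorer and the synthetic data from make_classification are assumptions made for this example; they stand in for the asker's data_combined and winners_losers, which are not shown in the question.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import log_loss


def neg_log_loss_scorer(estimator, X, y):
    """Hypothetical scorer with the (estimator, X, y) signature."""
    proba = estimator.predict_proba(X)
    # scikit-learn treats higher scores as better, so the loss is negated.
    return -log_loss(y, proba, labels=estimator.classes_)


# Synthetic stand-in for the data in the question.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

model = LogisticRegressionCV(Cs=5, cv=3, scoring=neg_log_loss_scorer)
model.fit(X, y)
```

The sign flip matters: a scorer that returned the raw log loss would lead the cross-validation to prefer the worst regularization strength.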

Andreas Mueller answered Mar 25 '26 00:03


To pass a metric function, wrap it with make_scorer:

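from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import accuracy_score, make_scorer

scorefunc = accuracy_score  # replace with your own metric(y_true, y_pred)
myscorer = make_scorer(
    scorefunc,
    greater_is_better=True,  # set to False for a loss, which is then negated
    # needs_threshold=False was the default, so it is omitted here
)

# Other constructor arguments are omitted.
model_regression = LogisticRegressionCV(scoring=myscorer)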
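Since the question is about a loss rather than a score, here is a small sketch of the same wrapper applied to a loss function. It uses zero_one_loss because that metric only needs predicted labels; the variable names and the synthetic data are assumptions for illustration, not part of the original answer.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import make_scorer, zero_one_loss

# zero_one_loss is lower-is-better, so make_scorer is asked to negate it.
loss_scorer = make_scorer(zero_one_loss, greater_is_better=False)

# Synthetic stand-in data, not the data from the question.
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

cv_model = LogisticRegressionCV(Cs=5, cv=3, scoring=loss_scorer)
cv_model.fit(X_demo, y_demo)

# The resulting scorer has the (estimator, X, y) interface and returns
# the negated loss, so values lie between -1 and 0.
demo_score = loss_scorer(cv_model, X_demo, y_demo)
```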

As a side note, it would be great if sklearn's LogisticRegression were primarily a regression model, with a new LogisticClassification class wrapping it. As far as I know, it is currently not possible to supply a regression error or a real-valued target.

user48956 answered Mar 25 '26 01:03