I am trying to implement my own loss function for binary classification. To get started, I want to reproduce the exact behavior of the binary objective. In particular, I want that:
None of this is the case for the code below:
import sklearn.datasets
import lightgbm as lgb
import numpy as np
X, y = sklearn.datasets.load_iris(return_X_y=True)
X, y = X[y <= 1], y[y <= 1]
def loglikelihood(labels, preds):
preds = 1. / (1. + np.exp(-preds))
grad = preds - labels
hess = preds * (1. - preds)
return grad, hess
model = lgb.LGBMClassifier(objective=loglikelihood) # or "binary"
model.fit(X, y, eval_set=[(X, y)], eval_metric="binary_logloss")
lgb.plot_metric(model.evals_result_)
With objective="binary":
With objective=loglikelihood the slope is not even smooth:
Moreover, sigmoid has to be applied to model.predict_proba(X) to get probabilities for loglikelihood (as I have figured out from https://github.com/Microsoft/LightGBM/issues/2136).
Is it possible to get the same behavior with a custom loss function? Does anybody understand where all these differences come from?
Looking at the output of model.predict_proba(X)
in each case, we can see that the built-in binary_logloss model returns probabilities, while the custom model returns logits.
The built-in evaluation function takes probabilities as input. To fit the custom objective, we need a custom evaluation function which will take logits as input.
Here is how you could write this. I've changed the sigmoid calculation so that it doesn't overflow if logit is a large negative number.
def loglikelihood(labels, logits):
#numerically stable sigmoid:
preds = np.where(logits >= 0,
1. / (1. + np.exp(-logits)),
np.exp(logits) / (1. + np.exp(logits)))
grad = preds - labels
hess = preds * (1. - preds)
return grad, hess
def my_eval(labels, logits):
#numerically stable logsigmoid:
logsigmoid = np.where(logits >= 0,
-np.log(1 + np.exp(-logits)),
logits - np.log(1 + np.exp(logits)))
loss = (-logsigmoid + logits * (1 - labels)).mean()
return "error", loss, False
model1 = lgb.LGBMClassifier(objective='binary')
model1.fit(X, y, eval_set=[(X, y)], eval_metric="binary_logloss")
model2 = lgb.LGBMClassifier(objective=loglikelihood)
model2.fit(X, y, eval_set=[(X, y)], eval_metric=my_eval)
Now the results are the same.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With