I am confused now about the loss functions used in XGBoost. Here is how I feel confused:

1. objective, which is the loss function to be minimized, vs. eval_metric, the metric used to represent the learning result. These two are totally unrelated (apart from constraints such as, for classification, only logloss and mlogloss being usable as eval_metric). Is this correct? If it is, then for a classification problem, how can you use rmse as a performance metric?

2. Take objective as an example: there are reg:logistic and binary:logistic. For 0/1 classification, binary logistic loss, i.e. cross entropy, should be the loss function, right? So which of the two options corresponds to this loss function, and what does the other one compute? Say, if binary:logistic represents the cross-entropy loss function, then what does reg:logistic do?

3. What about multi:softmax and multi:softprob? Do they use the same loss function and just differ in the output format? If so, the same should hold for reg:logistic and binary:logistic as well, right?

Supplement for the 2nd problem: the loss function for a 0/1 classification problem should be L = -sum(y_i*log(P_i) + (1-y_i)*log(1-P_i)). So should I choose binary:logistic or reg:logistic to make the xgboost classifier use the loss function L? And if it is binary:logistic, then what loss function does reg:logistic use?
XGBoost minimizes a regularized (L1 and L2) objective function that combines a convex loss function (based on the difference between the predicted and target outputs) and a penalty term for model complexity (in other words, the regression tree functions).
The eval_metric parameter determines the metrics that will be used to evaluate the model at each iteration. They are only reported and are not used to guide the optimization (e.g., during CV), AFAIK.
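To see this in practice, here is a minimal sketch (assuming the Python xgboost package; the synthetic data and parameter choices are mine, for illustration only). It trains with a logistic objective while reporting rmse, the very combination the question asks about: the rmse values are printed each round but never influence the gradients.

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

dtrain = xgb.DMatrix(X, label=y)
params = {
    "objective": "binary:logistic",  # the loss actually minimized
    "eval_metric": "rmse",           # only reported, never optimized
}
# Each boosting round prints train-rmse, but the trees are still fit
# against the gradient of the logistic loss.
bst = xgb.train(params, dtrain, num_boost_round=10,
                evals=[(dtrain, "train")])
```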
An evaluation metric is a metric "we want" to minimize or maximize through the modeling process, while the loss function is a metric "the model will" minimize during training. Taking simple logistic regression as an example: the loss function is the log loss, the quantity the model minimizes over training, regardless of which metric we later use to judge the result.
XGBoost loss for regression: the XGBoost objective used when predicting numerical values is "reg:squarederror", the squared-error loss function for regression predictive modeling problems.
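A short sketch of that (again assuming the Python xgboost package; the data is synthetic):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)

# "reg:squarederror" makes each tree fit the gradient of the
# squared-error loss (y - y_pred)^2.
model = xgb.XGBRegressor(objective="reg:squarederror", n_estimators=50)
model.fit(X, y)
preds = model.predict(X)
```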
'binary:logistic' uses the log loss -(y*log(y_pred) + (1-y)*(log(1-y_pred)))
'reg:logistic' uses the same log loss, not the squared error; the two objectives differ in how they are meant to be used ('reg:logistic' accepts continuous targets anywhere in [0, 1], while 'binary:logistic' expects labels that are exactly 0 or 1) and in their default evaluation metric (rmse vs. logloss).
To get a total estimate of the error we sum all per-sample errors and divide by the number of samples.
You can find this in the basics, when comparing linear regression vs. logistic regression:
Linear regression uses (y - y_pred)^2
as the cost function
Logistic regression uses -(y*log(y_pred) + (1-y)*(log(1-y_pred)))
as the cost function
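In plain NumPy those two cost functions look like this (a sketch; the eps clipping is my own addition to avoid log(0)):

```python
import numpy as np

def squared_error(y, y_pred):
    # Linear-regression cost: mean of (y - y_pred)^2
    return np.mean((y - y_pred) ** 2)

def log_loss(y, y_pred, eps=1e-15):
    # Logistic-regression cost: mean of -(y*log(p) + (1-y)*log(1-p)).
    # The eps clipping (my addition) guards against log(0).
    p = np.clip(y_pred, eps, 1 - eps)
    return np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p)))
```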
Evaluation metrics are a completely different thing. They are designed to evaluate your model. They can be confusing because it is natural to pick an evaluation metric that matches the loss function, like MSE in regression problems. However, in binary problems it is not always wise to look at the logloss. My experience has taught me (in classification problems) to generally look at AUC ROC.
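For example, reporting AUC during training is just a matter of the eval_metric parameter (sketch, synthetic data; the split sizes are arbitrary):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))
y = (X[:, 0] - X[:, 1] > 0).astype(int)

dtrain = xgb.DMatrix(X[:200], label=y[:200])
dvalid = xgb.DMatrix(X[200:], label=y[200:])

# The model still minimizes the logistic loss; AUC is just reported.
params = {"objective": "binary:logistic", "eval_metric": "auc"}
bst = xgb.train(params, dtrain, num_boost_round=20,
                evals=[(dvalid, "valid")])  # prints valid-auc each round
```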
According to the xgboost documentation:
reg:linear: linear regression
reg:logistic: logistic regression
binary:logistic: logistic regression for binary classification, output probability
So I'm guessing:
reg:linear is, as we said, the squared error (y - y_pred)^2
reg:logistic is the log loss -(y*log(y_pred) + (1-y)*(log(1-y_pred))), fit on targets that may be anywhere in [0, 1], with rmse as its default evaluation metric
binary:logistic is the same log loss -(y*log(y_pred) + (1-y)*(log(1-y_pred))) for strictly 0/1 labels
(returns the probability; neither objective rounds predictions, a 0.5 threshold only appears in the error evaluation metric)
You can test it out and see if it does as I've described. If so, I will update the answer; otherwise, I'll just delete it :<
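If you want to run that test, a sketch like this should do (synthetic data; my expectation is that the two objectives give near-identical probabilities on 0/1 labels, since they share the same loss):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

dtrain = xgb.DMatrix(X, label=y)
common = {"max_depth": 3, "eta": 0.3, "seed": 0}

p_reg = xgb.train({**common, "objective": "reg:logistic"},
                  dtrain, num_boost_round=20).predict(dtrain)
p_bin = xgb.train({**common, "objective": "binary:logistic"},
                  dtrain, num_boost_round=20).predict(dtrain)

# If the two objectives share the same loss, this should be ~0.
print(np.max(np.abs(p_reg - p_bin)))
```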