I am confused now about the loss functions used in XGBoost. Here is how I feel confused:

1. objective, which is the loss function to be minimized, vs. eval_metric, the metric used to represent the learning result. These two are totally unrelated (apart from constraints such as, for classification, only logloss and mlogloss being usable as eval_metric). Is this correct? If it is, then for a classification problem, how can you use rmse as a performance metric?

2. Take objective as an example: there are reg:logistic and binary:logistic. For 0/1 classification, binary logistic loss, i.e. cross entropy, should be the loss function, right? So which of the two options corresponds to this loss function, and what does the other one compute? Say, if binary:logistic represents the cross-entropy loss function, then what does reg:logistic do?

3. What about multi:softmax and multi:softprob? Do they use the same loss function and just differ in the output format? If so, the same should hold for reg:logistic and binary:logistic as well, right?

Supplement for the 2nd problem: the loss function for a 0/1 classification problem should be L = -sum(y_i*log(P_i) + (1-y_i)*log(1-P_i)). So should I choose binary:logistic or reg:logistic to make the xgboost classifier use the loss function L? And if it is binary:logistic, then what loss function does reg:logistic use?
XGBoost minimizes a regularized (L1 and L2) objective function that combines a convex loss function (based on the difference between the predicted and target outputs) and a penalty term for model complexity (in other words, the regression tree functions).
The eval_metric parameter determines the metrics that will be used to evaluate the model at each iteration. They are only reported and are not used to guide the optimization (e.g., during CV), AFAIK.
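To see this in practice, here is a minimal sketch (assuming the Python xgboost package; the synthetic data and parameter choices are mine, for illustration only). It trains with a logistic objective while reporting rmse, the very combination the question asks about: the rmse values are printed each round but never influence the gradients.

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

dtrain = xgb.DMatrix(X, label=y)
params = {
    "objective": "binary:logistic",  # the loss actually minimized
    "eval_metric": "rmse",           # only reported, never optimized
}
# Each boosting round prints train-rmse, but the trees are still fit
# against the gradient of the logistic loss.
bst = xgb.train(params, dtrain, num_boost_round=10,
                evals=[(dtrain, "train")])
```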
An evaluation metric is a metric "we want" to minimize or maximize through the modeling process, while the loss function is a metric "the model will" minimize during training. Taking simple logistic regression as an example: the loss function is the log loss, the quantity the model minimizes over training, regardless of which metric we later use to judge the result.
XGBoost loss for regression: the XGBoost objective used when predicting numerical values is "reg:squarederror", the squared-error loss function for regression predictive modeling problems.
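A short sketch of that (again assuming the Python xgboost package; the data is synthetic):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)

# "reg:squarederror" makes each tree fit the gradient of the
# squared-error loss (y - y_pred)^2.
model = xgb.XGBRegressor(objective="reg:squarederror", n_estimators=50)
model.fit(X, y)
preds = model.predict(X)
```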
'binary:logistic' uses the log loss -(y*log(y_pred) + (1-y)*(log(1-y_pred)))
'reg:logistic' uses the same log loss, not the squared error; the two objectives differ in how they are meant to be used ('reg:logistic' accepts continuous targets anywhere in [0, 1], while 'binary:logistic' expects labels that are exactly 0 or 1) and in their default evaluation metric (rmse vs. logloss).
To get a total estimate of the error we sum all per-sample errors and divide by the number of samples.
You can find this in the basics, when comparing linear regression vs. logistic regression:
Linear regression uses (y - y_pred)^2
as the cost function
Logistic regression uses -(y*log(y_pred) + (1-y)*(log(1-y_pred)))
as the cost function
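In plain NumPy those two cost functions look like this (a sketch; the eps clipping is my own addition to avoid log(0)):

```python
import numpy as np

def squared_error(y, y_pred):
    # Linear-regression cost: mean of (y - y_pred)^2
    return np.mean((y - y_pred) ** 2)

def log_loss(y, y_pred, eps=1e-15):
    # Logistic-regression cost: mean of -(y*log(p) + (1-y)*log(1-p)).
    # The eps clipping (my addition) guards against log(0).
    p = np.clip(y_pred, eps, 1 - eps)
    return np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p)))
```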
Evaluation metrics are a completely different thing. They are designed to evaluate your model. They can be confusing because it is natural to pick an evaluation metric that matches the loss function, like MSE in regression problems. However, in binary problems it is not always wise to look at the logloss. My experience has taught me (in classification problems) to generally look at AUC ROC.
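For example, reporting AUC during training is just a matter of the eval_metric parameter (sketch, synthetic data; the split sizes are arbitrary):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))
y = (X[:, 0] - X[:, 1] > 0).astype(int)

dtrain = xgb.DMatrix(X[:200], label=y[:200])
dvalid = xgb.DMatrix(X[200:], label=y[200:])

# The model still minimizes the logistic loss; AUC is just reported.
params = {"objective": "binary:logistic", "eval_metric": "auc"}
bst = xgb.train(params, dtrain, num_boost_round=20,
                evals=[(dvalid, "valid")])  # prints valid-auc each round
```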
According to the xgboost documentation:
reg:linear: linear regression
reg:logistic: logistic regression
binary:logistic: logistic regression for binary classification, output probability
So I'm guessing:
reg:linear is, as we said, the squared error (y - y_pred)^2
reg:logistic is the log loss -(y*log(y_pred) + (1-y)*(log(1-y_pred))), fit on targets that may be anywhere in [0, 1], with rmse as its default evaluation metric
binary:logistic is the same log loss -(y*log(y_pred) + (1-y)*(log(1-y_pred))) for strictly 0/1 labels
(returns the probability; neither objective rounds predictions, a 0.5 threshold only appears in the error evaluation metric)
You can test it out and see if it does as I've described. If so, I will update the answer; otherwise, I'll just delete it :<
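If you want to run that test, a sketch like this should do (synthetic data; my expectation is that the two objectives give near-identical probabilities on 0/1 labels, since they share the same loss):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

dtrain = xgb.DMatrix(X, label=y)
common = {"max_depth": 3, "eta": 0.3, "seed": 0}

p_reg = xgb.train({**common, "objective": "reg:logistic"},
                  dtrain, num_boost_round=20).predict(dtrain)
p_bin = xgb.train({**common, "objective": "binary:logistic"},
                  dtrain, num_boost_round=20).predict(dtrain)

# If the two objectives share the same loss, this should be ~0.
print(np.max(np.abs(p_reg - p_bin)))
```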