
MAPE eval metric in XGBoost

Tags:

python

xgboost

I'm trying to use MAPE as an eval metric in XGBoost, but I get strange results:

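import numpy as np
import xgboost as xgb
from sklearn.cross_validation import KFold  # old scikit-learn API, matches KFold(n=..., n_folds=...) used in the cv call

def xgb_mape(preds, dtrain):
   # mean absolute percentage error, with labels shifted by 1 in the denominator
   labels = dtrain.get_label()
   return('mape', np.mean(np.abs((labels - preds) / (labels+1))))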

xgp = {"colsample_bytree": 0.9, 
   "min_child_weight": 24, 
   "subsample": 0.9, 
   "eta": 0.05, 
   "objective": "reg:linear", 
   "seed": 70}

cv = xgb.cv(params = xgp, 
        dtrain = xgb.DMatrix(train_set[cols_to_use], label=train_set.y),
        folds = KFold(n = len(train_set), n_folds=4, random_state = 707, shuffle=True),
        feval = xgb_mape,
        early_stopping_rounds=10,
        num_boost_round=1000,
        verbose_eval=10,
        maximize=False
        )

It returns:

[0]     train-mape:0.780683+0.00241932  test-mape:0.779896+0.0024619
[10]    train-mape:0.84939+0.0196102    test-mape:0.858054+0.0184669
[20]    train-mape:1.0778+0.0313676     test-mape:1.10751+0.0293785
[30]    train-mape:1.26066+0.0343771    test-mape:1.30707+0.0323237
[40]    train-mape:1.37713+0.0347438    test-mape:1.43339+0.030565
[50]    train-mape:1.45653+0.042433     test-mape:1.52176+0.0383677
[60]    train-mape:1.52268+0.0386395    test-mape:1.5909+0.0353497
[70]    train-mape:1.5636+0.0383622     test-mape:1.63482+0.0301809
[80]    train-mape:1.59408+0.0378158    test-mape:1.66748+0.0315529
[90]    train-mape:1.61712+0.0403532    test-mape:1.69134+0.0325177
[100]   train-mape:1.63028+0.0389446    test-mape:1.70578+0.0316045
[110]   train-mape:1.63556+0.0375842    test-mape:1.71153+0.031564
[120]   train-mape:1.63509+0.0393198    test-mape:1.7117+0.0320471

Both the train and test metrics increase even with maximize=False, and early stopping doesn't work properly. Where is the error?

UPD: I added -1* to xgb_mape and that solved the problem. It looks like the maximize parameter doesn't work properly for custom eval functions.

Slavka asked Feb 20 '18 10:02



1 Answer

According to this xgboost example that implements an Average Precision metric, the xgb optimizer only minimizes. If you implement a metric that should be maximized, you have to put a negative sign (-) in front of it, like so:

def pr_auc_metric(y_predicted, y_true):
    return 'pr_auc', -skmetrics.average_precision_score(y_true.get_label(), y_predicted)

So yours would be:

def xgb_mape(preds, dtrain):
   labels = dtrain.get_label()
   return('mape', -np.mean(np.abs((labels - preds) / (labels + 1))))
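
To see what the two versions return without running a full cross-validation, here is a minimal sketch on toy numbers. FakeDMatrix is a hypothetical stand-in that only mimics get_label() (it is not part of xgboost), and the labels and predictions are made up for illustration. It only shows the returned values; it says nothing about how xgb.cv interprets them.

```python
import numpy as np


class FakeDMatrix:
    """Hypothetical stand-in for xgb.DMatrix; only get_label() is mimicked."""

    def __init__(self, label):
        self._label = np.asarray(label, dtype=float)

    def get_label(self):
        return self._label


def xgb_mape(preds, dtrain):
    # Version from the question: lower values mean a better fit.
    labels = dtrain.get_label()
    return 'mape', float(np.mean(np.abs((labels - preds) / (labels + 1))))


def xgb_mape_negated(preds, dtrain):
    # Version from the answer: same quantity with the sign flipped.
    name, value = xgb_mape(preds, dtrain)
    return name, -value


# Made-up toy data, for illustration only.
data = FakeDMatrix([1.0, 3.0, 9.0])
preds = np.array([2.0, 3.0, 4.0])

print(xgb_mape(preds, data))          # positive value
print(xgb_mape_negated(preds, data))  # same magnitude, negative
```

The absolute errors here are 1, 0 and 5 and the denominators are 2, 4 and 10, so the first function returns about 0.333 and the second about -0.333.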
data_steve answered Oct 27 '22 06:10