
How to write a custom F1 score metric for multiclass classification in LightGBM (Python)

Can someone help me write a custom F1 score metric for multiclass classification in Python?

Edit: I'm editing the question to give a better picture of what I want to do.

This is my custom eval function for an F1 score metric, for a multiclass problem with 5 classes.

from sklearn.metrics import f1_score

def evalerror(preds, dtrain):
    labels = dtrain.get_label()
    preds = preds.reshape(-1, 5)
    preds = preds.argmax(axis = 1)
    f_score = f1_score(preds, labels, average = 'weighted')
    return 'f1_score', f_score, True

Note: The reason I'm reshaping is that the validation true values have length 252705, whereas preds is an array of length 1263525, which is 5 times as long. This is because LightGBM outputs the probability of each class for every prediction.
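For reference, here is a minimal sketch (plain numpy, with random numbers standing in for real predictions; 252705 and 5 are taken from the question) of the shapes involved. Note that a reshape only recovers the right per-sample probabilities if the flattening order matches, which turns out to matter here.

import numpy as np

# Shapes from the question: 252705 validation samples, 5 classes.
n_samples, n_classes = 252705, 5
flat_preds = np.random.rand(n_samples * n_classes)  # stand-in for the preds feval receives

print(len(flat_preds))                        # 1263525 == 252705 * 5
per_class = flat_preds.reshape(-1, n_classes)
print(per_class.shape)                        # (252705, 5)
print(per_class.argmax(axis=1).shape)         # (252705,) -> one predicted class per sample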

Below, I'm converting the training and validation data into the format that LightGBM accepts.

import lightgbm as lgb

dtrain = lgb.Dataset(train_X, label= train_Y, free_raw_data = False)
dvalid = lgb.Dataset(valid_X, label= valid_Y, free_raw_data = False, 
                     reference= dtrain)

Below is the LightGBM model I'm fitting to the training data. As you can see, I have passed the custom evalerror function via feval, along with the validation data dvalid for which I want to see the F1 score during training. I'm training the model for 10 iterations.
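(The params dict isn't shown in the question; based on the 5 classes and the multi_logloss values in the training log, it presumably looks something like this sketch, where the learning rate is an assumed placeholder:)

# Hypothetical params -- not shown in the question.
# 'multiclass' and num_class=5 are implied by the 5-class problem and the
# multi_logloss values in the training log; learning_rate is a guess.
params = {
    'objective': 'multiclass',
    'num_class': 5,
    'metric': 'multi_logloss',
    'learning_rate': 0.1,
}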

evals_result = {}
num_round = 10
lgb_model = lgb.train(params, 
                      dtrain, 
                      num_round, 
                      valid_sets = dvalid, 
                      feval = evalerror,
                      evals_result = evals_result)

As the model trains for 10 rounds, the F1 score on the validation set is displayed for each iteration below, which doesn't look right, as I'm only getting around 0.18.

[1]     valid_0's multi_logloss: 1.46839        valid_0's f1_score: 0.183719
[2]     valid_0's multi_logloss: 1.35684        valid_0's f1_score: 0.183842
[3]     valid_0's multi_logloss: 1.26527        valid_0's f1_score: 0.183853
[4]     valid_0's multi_logloss: 1.18799        valid_0's f1_score: 0.183909
[5]     valid_0's multi_logloss: 1.12187        valid_0's f1_score: 0.187206
[6]     valid_0's multi_logloss: 1.06452        valid_0's f1_score: 0.187503
[7]     valid_0's multi_logloss: 1.01437        valid_0's f1_score: 0.187327
[8]     valid_0's multi_logloss: 0.97037        valid_0's f1_score: 0.187511
[9]     valid_0's multi_logloss: 0.931498       valid_0's f1_score: 0.186957
[10]    valid_0's multi_logloss: 0.896877       valid_0's f1_score: 0.18751

But once the model has been trained for 10 iterations, I run the code below to predict on the same validation set.

lgb_prediction = lgb_model.predict(valid_X)
lgb_prediction = lgb_prediction.argmax(axis = 1)
lgb_F1 = f1_score(lgb_prediction, valid_Y, average = 'weighted')
print("The Light GBM F1 is", lgb_F1)

The Light GBM F1 is 0.743250263548

Note: I have not reshaped here like I did in the custom function, because lgb_model.predict() outputs a numpy array of shape (252705, 5). Also note that I'm passing valid_X and not dvalid, because for prediction we have to pass the original data, not the lgb.Dataset object we pass to lgb.train().

When I predict on the same validation dataset, I get an F1 score of 0.743250263548, which is good enough. So I expect the validation F1 score at the 10th iteration during training to be the same as the one I computed after training the model.

Can someone help me see what I'm doing wrong? Thanks.

asked Jul 02 '18 by Thanish


2 Answers

I had the same issue.

LightGBM predictions are output as a flattened array.

By inspecting it, I figured out that the array is grouped by class rather than by sample: first the probabilities of all samples for class 0, then all samples for class 1, and so on, so the probability of sample a for class i sits at position num_data * i + a.

So for your code, it should look like this:

from sklearn.metrics import f1_score

def evalerror(preds, dtrain):
    labels = dtrain.get_label()
    # preds arrives flattened class by class, so reshape to (num_class, num_data)
    # and transpose to get one row of class probabilities per sample
    preds = preds.reshape(5, -1).T
    preds = preds.argmax(axis=1)
    f_score = f1_score(labels, preds, average='weighted')
    return 'f1_score', f_score, True
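To see why this reshape matters, here is a small sketch (plain numpy, toy numbers) showing that when the flat array is grouped by class, reshape(num_class, -1).T recovers the (n_samples, n_classes) matrix, while reshape(-1, num_class) scrambles samples and classes:

import numpy as np

# Toy example: 3 samples, 2 classes.
# True probability matrix (rows = samples, columns = classes):
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.3, 0.7]])

# Flattened class by class, as described above:
flat = probs.flatten(order='F')   # [0.9, 0.2, 0.3, 0.1, 0.8, 0.7]

print(flat.reshape(2, -1).T)      # recovers probs correctly
print(flat.reshape(-1, 2))        # wrong: mixes samples and classes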
answered Sep 21 '22 by Igor

sklearn.metrics.f1_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)

So according to this signature, y_true comes first; you should correct the argument order:

# f1_score(y_true, y_pred): the true labels come first, then the predictions
def evalerror(preds, dtrain):
    labels = dtrain.get_label()
    preds = preds.reshape(-1, 5)
    preds = preds.argmax(axis=1)
    f_score = f1_score(labels, preds, average='weighted')
    return 'f1_score', f_score, True
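A quick illustration (a toy sketch, not from the question) of why the argument order matters: with average='weighted' the per-class scores are weighted by the support of y_true, so swapping the arguments generally changes the result.

from sklearn.metrics import f1_score

y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]

# Weighted F1 averages the per-class F1 scores using the class counts in y_true,
# so the two calls below generally give different numbers.
print(f1_score(y_true, y_pred, average='weighted'))  # correct argument order
print(f1_score(y_pred, y_true, average='weighted'))  # swapped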
answered Sep 22 '22 by H.Bukhari