Can someone help me write a custom F1 score eval metric for multiclass classification in Python?
Edit: I'm editing the question to give a better picture of what I want to do.
This is my function for a custom F1 score eval metric for a multiclass problem with 5 classes.
from sklearn.metrics import f1_score

def evalerror(preds, dtrain):
    labels = dtrain.get_label()
    preds = preds.reshape(-1, 5)
    preds = preds.argmax(axis=1)
    f_score = f1_score(preds, labels, average='weighted')
    return 'f1_score', f_score, True
Note: The reason I'm reshaping is that the validation true values have length 252705, whereas preds is an array of length 1263525, i.e. 5 times as long. This is because LightGBM outputs the probability of each class for every prediction.
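Just to spell the sizes out (a throwaway check I added, not part of the original code):

num_labels = 252705               # length of the validation labels
num_classes = 5
print(num_labels * num_classes)   # 1263525 -- the length of the flat preds array feval receives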
Below I'm converting the train and validation data to the format that LGB will accept.
import lightgbm as lgb

dtrain = lgb.Dataset(train_X, label=train_Y, free_raw_data=False)
dvalid = lgb.Dataset(valid_X, label=valid_Y, free_raw_data=False,
                     reference=dtrain)
Below is the LGB model I'm fitting to the training data. As you can see, I have passed the evalerror custom function to the model via feval, and also the validation data dvalid for which I want to see the F1 score while training. I'm training the model for 10 iterations.
evals_result = {}
num_round = 10
lgb_model = lgb.train(params,
                      dtrain,
                      num_round,
                      valid_sets=dvalid,
                      feval=evalerror,
                      evals_result=evals_result)
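As a side note (my own sketch, assuming evals_result gets filled under the same valid_0 / f1_score names that show up in the training log below), the per-iteration scores can also be read back after training:

# evals_result maps validation set name -> metric name -> list of per-iteration values
for i, score in enumerate(evals_result['valid_0']['f1_score'], start=1):
    print("iteration", i, "f1_score", score)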
As the model trains for 10 rounds, the F1 score for each iteration on the validation set is displayed below, and it does not look right: I'm only getting around 0.18.
[1] valid_0's multi_logloss: 1.46839 valid_0's f1_score: 0.183719
[2] valid_0's multi_logloss: 1.35684 valid_0's f1_score: 0.183842
[3] valid_0's multi_logloss: 1.26527 valid_0's f1_score: 0.183853
[4] valid_0's multi_logloss: 1.18799 valid_0's f1_score: 0.183909
[5] valid_0's multi_logloss: 1.12187 valid_0's f1_score: 0.187206
[6] valid_0's multi_logloss: 1.06452 valid_0's f1_score: 0.187503
[7] valid_0's multi_logloss: 1.01437 valid_0's f1_score: 0.187327
[8] valid_0's multi_logloss: 0.97037 valid_0's f1_score: 0.187511
[9] valid_0's multi_logloss: 0.931498 valid_0's f1_score: 0.186957
[10] valid_0's multi_logloss: 0.896877 valid_0's f1_score: 0.18751
But once the model is trained for 10 iterations, I run the code below to predict on the same validation set.
lgb_prediction = lgb_model.predict(valid_X)
lgb_prediction = lgb_prediction.argmax(axis=1)
lgb_F1 = f1_score(lgb_prediction, valid_Y, average='weighted')
print("The Light GBM F1 is", lgb_F1)
The Light GBM F1 is 0.743250263548
Note: I have not reshaped here like I did in the custom function, because lgb_model.predict() outputs a numpy array of shape (252705, 5). Also note that I'm passing valid_X and not dvalid, because when predicting we have to pass the original feature matrix, not the lgb.Dataset object we pass to lgb.train().
When I predict on the same validation dataset, I get an F1 score of 0.743250263548, which is reasonable. So I expect the validation F1 score at the 10th iteration during training to match the one I get by predicting after training.
Can someone help me with what I'm doing wrong? Thanks.
I had the same issue.
LightGBM passes the predictions to feval as a flattened array.
By inspecting it, I figured out that it is flattened class by class: all the probabilities for class 0 come first, then all the probabilities for class 1, and so on. So the probability of sample a belonging to class i sits at position num_samples * i + a, not next to the other probabilities for sample a.
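To illustrate that ordering (a standalone sketch I added, not from the original answer), here is a toy case with 3 samples and 2 classes:

import numpy as np

# Flattened class-major predictions: all class-0 probabilities first, then all class-1.
flat = np.array([0.9, 0.2, 0.4,   # P(class 0) for samples 0, 1, 2
                 0.1, 0.8, 0.6])  # P(class 1) for samples 0, 1, 2

probs = flat.reshape(2, -1).T     # shape (3, 2): one row per sample
print(probs.argmax(axis=1))       # [0 1 1] -> predicted class per sample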
As for your code, it should be like this:

from sklearn.metrics import f1_score

def evalerror(preds, dtrain):
    labels = dtrain.get_label()
    preds = preds.reshape(5, -1).T   # unflatten the class-major output into (num_samples, 5)
    preds = preds.argmax(axis=1)
    f_score = f1_score(labels, preds, average='weighted')
    return 'f1_score', f_score, True
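If you don't want to hard-code the number of classes, a variant of the same fix (my sketch, assuming the same class-major layout) can infer it from the label length:

from sklearn.metrics import f1_score

def evalerror(preds, dtrain):
    labels = dtrain.get_label()
    num_class = preds.size // len(labels)    # infer the class count instead of hard-coding 5
    preds = preds.reshape(num_class, -1).T   # same class-major unflattening as above
    preds = preds.argmax(axis=1)
    return 'f1_score', f1_score(labels, preds, average='weighted'), True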
sklearn.metrics.f1_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)
So according to this, the true labels come first; you should correct the call to f1_score(labels, preds):
def evalerror(preds, dtrain):
    labels = dtrain.get_label()
    preds = preds.reshape(-1, 5)
    preds = preds.argmax(axis=1)
    f_score = f1_score(labels, preds, average='weighted')
    return 'f1_score', f_score, True
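Just to show why the argument order matters with average='weighted' (a toy check I added, not part of the original answer): the per-class F1 values are symmetric in y_true and y_pred, but the weighting uses the class support of whatever is passed as y_true, so swapping the arguments generally changes the score:

from sklearn.metrics import f1_score

y_true = [0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0, 0]

# Weighted by the support of y_true (2 zeros, 4 ones)
print(f1_score(y_true, y_pred, average='weighted'))
# Weighted by the support of y_pred (3 zeros, 3 ones) -- a different number
print(f1_score(y_pred, y_true, average='weighted'))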