I have trained a LightGBM model and I would like to plot the learning curves. How can I do that? In Keras, for example, fit returns a History object containing the metrics, so that I can plot them once training is over. How is this task handled here?
My code is the following:
import lightgbm as lgb
from sklearn.metrics import roc_auc_score, precision_recall_fscore_support

def f_lgboost(data, params):
    model = lgb.LGBMClassifier(**params)
    X_train = data['X_train']
    y_train = data['y_train']
    X_dev = data['X_dev']
    y_dev = data['y_dev']
    X_test = data['X_test']
    # Mark the categorical columns so LightGBM handles them natively
    categorical_feature = ['Ticker_code', 'Category_code']
    X_train[categorical_feature] = X_train[categorical_feature].astype('category')
    X_dev[categorical_feature] = X_dev[categorical_feature].astype('category')
    X_test[categorical_feature] = X_test[categorical_feature].astype('category')
    feature_name = X_train.columns.to_list()
    model.fit(X_train, y_train, eval_set=[(X_dev, y_dev)], eval_metric='auc',
              early_stopping_rounds=20, categorical_feature=categorical_feature,
              feature_name=feature_name)
    y_pred_train = model.predict_proba(X_train)[:, 1].ravel()
    y_pred_dev = model.predict_proba(X_dev)[:, 1].ravel()
    auc_train = roc_auc_score(y_train, y_pred_train)
    auc_dev = roc_auc_score(y_dev, y_pred_dev)
    precision, recall, fscore, support = precision_recall_fscore_support(
        y_dev, (y_pred_dev > 0.5).astype(int), beta=0.5)
    y_pred_test = model.predict_proba(X_test)[:, 1].ravel()
    print(f'auc_train: {auc_train}, auc_dev: {auc_dev}, '
          f'precision: {precision}, recall: {recall}, fscore: {fscore}')
    Results = {
        'params': params,
        'data': data,
        'lg_boost_model': model,  # was `bst`, which is undefined in this scope
        'y_pred_train': y_pred_train,
        'y_pred_dev': y_pred_dev,
        'y_pred_test': y_pred_test,
        'auc_train': auc_train,
        'auc_dev': auc_dev,
        'precision_dev': precision,
        'recall_dev': recall,
        'fscore_dev': fscore,
        'support_dev': support
    }
    return Results
Coding an LGBM in Python
LightGBM can be installed with pip using the command "pip install lightgbm". It also provides a scikit-learn-compatible API through which both classification and regression models can be implemented, and both operate in a similar fashion.
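As a quick illustration of that shared interface, here is a minimal sketch (the synthetic datasets and hyperparameters are purely for demonstration) showing that the classifier and the regressor are driven the same way:

import lightgbm as lgb
from sklearn.datasets import make_classification, make_regression

# Both estimators follow the same scikit-learn fit/predict pattern
Xc, yc = make_classification(n_samples=500, random_state=0)
clf = lgb.LGBMClassifier(n_estimators=50).fit(Xc, yc)
print(clf.predict_proba(Xc[:3]))  # class probabilities

Xr, yr = make_regression(n_samples=500, random_state=0)
reg = lgb.LGBMRegressor(n_estimators=50).fit(Xr, yr)
print(reg.predict(Xr[:3]))        # continuous predictions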
Learning curves show how your chosen evaluation metric (e.g. RMSE, AUC, accuracy) evolves on your training and validation sets as training progresses, which for gradient boosting means as a function of the number of boosting iterations. They can be an extremely useful diagnostic tool, as they can tell you whether your model is suffering from high bias or high variance.
In the scikit-learn API, the learning curves are available via the attribute lightgbm.LGBMModel.evals_result_. It contains the metrics computed on the datasets specified in the eval_set argument of the fit method (so you would normally want to pass both the training and the validation sets there). There is also a built-in plotting function, lightgbm.plot_metric, which accepts either model.evals_result_ or the model directly.
Here is a complete minimal example:
import lightgbm as lgb
import matplotlib.pyplot as plt
import sklearn.datasets, sklearn.model_selection
# load_boston was removed in scikit-learn 1.2; the California housing data is a drop-in replacement
X, y = sklearn.datasets.fetch_california_housing(return_X_y=True)
X_train, X_val, y_train, y_val = sklearn.model_selection.train_test_split(X, y, random_state=7054)
model = lgb.LGBMRegressor(objective='mse', random_state=8798, n_jobs=1)
# `verbose` was removed from fit() in LightGBM 4.0; log progress via a callback instead
model.fit(X_train, y_train, eval_set=[(X_val, y_val), (X_train, y_train)],
          eval_names=['valid', 'train'], callbacks=[lgb.log_evaluation(period=10)])
lgb.plot_metric(model)
plt.show()
The resulting plot shows the l2 metric for both the training and validation sets as a function of the boosting iteration.
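If you prefer the Keras-style workflow from the question, where you grab the history after training and plot it yourself, evals_result_ is a plain dict keyed first by eval-set name and then by metric name, so you can feed it straight to matplotlib. A minimal sketch, assuming the model fitted in the example above (with eval_names=['valid', 'train']):

import matplotlib.pyplot as plt

history = model.evals_result_  # e.g. {'valid': {'l2': [...]}, 'train': {'l2': [...]}}
for eval_set_name, metrics in history.items():
    for metric_name, values in metrics.items():
        # one curve per (eval set, metric) pair, indexed by boosting iteration
        plt.plot(values, label=f'{eval_set_name} {metric_name}')
plt.xlabel('Boosting iteration')
plt.ylabel('Metric value')
plt.legend()
plt.show()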