I tried out two ways of implementing LightGBM, expecting them to return the same value, but they didn't. I thought lgb.LGBMRegressor() and lgb.train(parameters, train_data) would return the same accuracy, but they didn't. So I wonder why?
import lightgbm as lgb

def dataready(train, test, predictvar):
    included_features = train.columns
    y_test = test[predictvar].values
    y_train = train[predictvar].values
    train = train.drop([predictvar], axis=1)
    test = test.drop([predictvar], axis=1)
    x_train = train.values
    x_test = test.values
    return x_train, y_train, x_test, y_test, train

x_train, y_train, x_test, y_test, train2 = dataready(train, test, 'runtime.min')
train_data = lgb.Dataset(x_train, label=y_train)
test_data = lgb.Dataset(x_test, label=y_test)

# sklearn API
lgb1 = lgb.LGBMRegressor()
lgb1.fit(x_train, y_train)

# core API; result renamed so it does not shadow the lightgbm module
# (parameters is my params dict, defined elsewhere)
model = lgb.train(parameters, train_data, valid_sets=test_data,
                  num_boost_round=5000, early_stopping_rounds=100)
I expected the results to be roughly the same, but they are not. As far as I understand, one is a booster and the other is a regressor?
LightGBM is an open-source gradient boosting framework that is based on tree learning algorithms and is designed to process data faster and provide better accuracy. It can handle large datasets with lower memory usage and supports distributed learning.
LGBMRegressor is the sklearn interface. The .fit(X, y) call is standard sklearn syntax for model training. It is a class object for you to use as part of sklearn's ecosystem (for running pipelines, parameter tuning, etc.).
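For example, because LGBMRegressor is an sklearn estimator, it drops straight into sklearn tooling such as GridSearchCV. A minimal sketch, assuming a synthetic dataset from make_regression (the data and parameter values here are placeholders, not your actual setup):

import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic placeholder data; substitute your own x_train / y_train.
X, y = make_regression(n_samples=500, n_features=10, random_state=42)
x_tr, x_te, y_tr, y_te = train_test_split(X, y, random_state=42)

# The estimator plugs directly into sklearn utilities like GridSearchCV.
search = GridSearchCV(
    lgb.LGBMRegressor(),
    param_grid={"n_estimators": [100, 500], "learning_rate": [0.05, 0.1]},
    cv=3,
)
search.fit(x_tr, y_tr)
print(search.best_params_, search.score(x_te, y_te))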
lightgbm.train is the core training API for lightgbm itself.
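This split is also where your mismatch comes from: the two runs in your code are not trained with the same settings. LGBMRegressor() uses its defaults (e.g. n_estimators=100), while your lgb.train call runs up to 5000 boosting rounds with early stopping on the validation set, so the two boosters end up different. When the parameters are aligned, both APIs train the same model. A minimal sketch with matching settings, again on synthetic placeholder data:

import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, random_state=42)
x_tr, x_te, y_tr, y_te = train_test_split(X, y, random_state=42)

# Identical settings for both APIs: 100 trees, default learning rate and leaves.
params = {"objective": "regression", "learning_rate": 0.1, "num_leaves": 31}
booster = lgb.train(params, lgb.Dataset(x_tr, label=y_tr), num_boost_round=100)

sk_model = lgb.LGBMRegressor(n_estimators=100, learning_rate=0.1, num_leaves=31)
sk_model.fit(x_tr, y_tr)

# With matching parameters the two predictions agree (up to floating point).
print(booster.predict(x_te)[:3])
print(sk_model.predict(x_te)[:3])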
XGBoost and many other popular ML training libraries make a similar distinction (the core API uses xgb.train(...), for example, while the sklearn API uses XGBClassifier or XGBRegressor).
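A minimal sketch of the same split in XGBoost, assuming xgboost is installed (data and parameter values are placeholders):

import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

# Core API: DMatrix + xgb.train, analogous to lgb.Dataset + lgb.train.
booster = xgb.train({"objective": "reg:squarederror"},
                    xgb.DMatrix(X, label=y), num_boost_round=50)

# sklearn API: XGBRegressor with the familiar fit/predict interface.
model = xgb.XGBRegressor(n_estimators=50)
model.fit(X, y)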