I tried out two ways of implementing LightGBM, expecting them to return the same value, but they didn't. I thought lgb.LGBMRegressor() and lgb.train(parameters, train_data) would return the same accuracy, but they didn't. So I wonder why?
import lightgbm as lgb

def dataready(train, test, predictvar):
    included_features = train.columns
    y_test = test[predictvar].values
    y_train = train[predictvar].values
    train = train.drop([predictvar], axis=1)
    test = test.drop([predictvar], axis=1)
    x_train = train.values
    x_test = test.values
    return x_train, y_train, x_test, y_test, train

x_train, y_train, x_test, y_test, train2 = dataready(train, test, 'runtime.min')
train_data = lgb.Dataset(x_train, label=y_train)
test_data = lgb.Dataset(x_test, label=y_test)

# sklearn API
lgb1 = lgb.LGBMRegressor()
lgb1.fit(x_train, y_train)

# core API; result renamed so it does not shadow the lightgbm module
# (parameters is my params dict, defined elsewhere)
model = lgb.train(parameters, train_data, valid_sets=test_data,
                  num_boost_round=5000, early_stopping_rounds=100)
I expected the results to be roughly the same, but they are not. As far as I understand, one is a booster and the other is a regressor?
LightGBM is an open-source gradient boosting framework that is based on tree learning algorithms and is designed to process data faster and provide better accuracy. It can handle large datasets with lower memory usage and supports distributed learning.
LGBMRegressor is the sklearn interface. The .fit(X, y) call is standard sklearn syntax for model training. It is a class object for you to use as part of sklearn's ecosystem (for running pipelines, parameter tuning, etc.).
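For example, because LGBMRegressor is an sklearn estimator, it drops straight into sklearn tooling such as GridSearchCV. A minimal sketch, assuming a synthetic dataset from make_regression (the data and parameter values here are placeholders, not your actual setup):

import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic placeholder data; substitute your own x_train / y_train.
X, y = make_regression(n_samples=500, n_features=10, random_state=42)
x_tr, x_te, y_tr, y_te = train_test_split(X, y, random_state=42)

# The estimator plugs directly into sklearn utilities like GridSearchCV.
search = GridSearchCV(
    lgb.LGBMRegressor(),
    param_grid={"n_estimators": [100, 500], "learning_rate": [0.05, 0.1]},
    cv=3,
)
search.fit(x_tr, y_tr)
print(search.best_params_, search.score(x_te, y_te))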
lightgbm.train is the core training API for lightgbm itself.
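This split is also where your mismatch comes from: the two runs in your code are not trained with the same settings. LGBMRegressor() uses its defaults (e.g. n_estimators=100), while your lgb.train call runs up to 5000 boosting rounds with early stopping on the validation set, so the two boosters end up different. When the parameters are aligned, both APIs train the same model. A minimal sketch with matching settings, again on synthetic placeholder data:

import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, random_state=42)
x_tr, x_te, y_tr, y_te = train_test_split(X, y, random_state=42)

# Identical settings for both APIs: 100 trees, default learning rate and leaves.
params = {"objective": "regression", "learning_rate": 0.1, "num_leaves": 31}
booster = lgb.train(params, lgb.Dataset(x_tr, label=y_tr), num_boost_round=100)

sk_model = lgb.LGBMRegressor(n_estimators=100, learning_rate=0.1, num_leaves=31)
sk_model.fit(x_tr, y_tr)

# With matching parameters the two predictions agree (up to floating point).
print(booster.predict(x_te)[:3])
print(sk_model.predict(x_te)[:3])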
XGBoost and many other popular ML training libraries make a similar distinction (the core API uses xgb.train(...), for example, while the sklearn API uses XGBClassifier or XGBRegressor).
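A minimal sketch of the same split in XGBoost, assuming xgboost is installed (data and parameter values are placeholders):

import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

# Core API: DMatrix + xgb.train, analogous to lgb.Dataset + lgb.train.
booster = xgb.train({"objective": "reg:squarederror"},
                    xgb.DMatrix(X, label=y), num_boost_round=50)

# sklearn API: XGBRegressor with the familiar fit/predict interface.
model = xgb.XGBRegressor(n_estimators=50)
model.fit(X, y)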