
LightGBM: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I was running lightgbm with categorical features:

import lightgbm as lgb
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(train_X, train_y, test_size=0.3)

train_data = lgb.Dataset(X_train, label=y_train, feature_name=X_train.columns,
                         categorical_feature=cat_features)

test_data = lgb.Dataset(X_test, label=y_test, reference=train_data)

param = {'num_trees': 4000, 'objective':'binary', 'metric': 'auc'}
bst = lgb.train(param, train_data, valid_sets=[test_data], early_stopping_rounds=100)

This raises the following error:

if self.handle is not None and feature_name is not None and feature_name != 'auto':

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I looked at similar errors on Stack Overflow, which are mostly related to numpy. I then checked the documentation and tried replacing categorical_feature with column indices such as [0, 2, 5, ...] (originally I passed the column names of the categorical features), but I still get the same error.

I also tried replacing label with the column index, and the error is still there.

Could anyone help? Thanks in advance.

asked Jun 22 '18 at 06:06 by MJeremy




1 Answer

I think there is an issue with the way you pass feature_name. The constructor expects a list, and you pass it a pandas.core.indexes.base.Index. The problem is that on such an object the feature_name != 'auto' condition from the if statement mentioned in the error acts element-wise and returns a numpy.ndarray of booleans instead of a single bool. The if statement then has to reduce that array to one truth value, which numpy refuses to do when the array has more than one element.
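The element-wise comparison can be reproduced without LightGBM. The following is a minimal sketch, assuming pandas is installed; the column names are made up for illustration and simply stand in for X_train.columns:

```python
import pandas as pd

# Hypothetical column names standing in for X_train.columns.
columns = pd.Index(["age", "city", "income"])

# Comparing an Index with a string is done element by element,
# so the result is an array of booleans, not a single bool.
elementwise = columns != "auto"
print(type(elementwise).__name__, elementwise.shape)

# Using that array as an `if` condition triggers the ValueError
# from the question.
try:
    if elementwise:
        pass
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# A plain list compares as a whole, so the result is a single bool.
as_list = columns.tolist()
print(as_list != "auto")  # True
```

With a plain list, the same condition evaluates to one bool, which is why converting the column index resolves the error.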

A simple solution would be either to convert it to a list (feature_name=X_train.columns.tolist()) or to use feature_name='auto', which makes LightGBM extract the names from the pd.DataFrame internally.
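As a rough sketch of the first option, the conversion can be wrapped in a small helper. Both as_feature_name_arg and make_dataset are hypothetical names introduced here for illustration, not part of LightGBM; lightgbm is imported lazily inside make_dataset so the helper itself only needs pandas:

```python
import pandas as pd


def as_feature_name_arg(columns):
    """Hypothetical helper: return a value that is safe to pass as feature_name.

    Strings such as 'auto' are passed through unchanged; anything else
    (for example a pandas Index) is converted to a plain list of str.
    """
    if isinstance(columns, str):
        return columns
    return [str(c) for c in columns]


def make_dataset(X, y, cat_features):
    """Hypothetical wrapper that builds a Dataset with list-typed feature names."""
    import lightgbm as lgb  # imported here so the helper above works without it

    return lgb.Dataset(
        X,
        label=y,
        feature_name=as_feature_name_arg(X.columns),
        categorical_feature=cat_features,
    )


# The helper turns an Index into a list and leaves 'auto' alone.
names = as_feature_name_arg(pd.Index(["age", "city", "income"]))
print(names)
```

With the variables from the question, train_data would then be created as make_dataset(X_train, y_train, cat_features).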

answered Oct 05 '22 at 02:10 by Mischa Lisovyi