 

On Colab - class_weight is causing a ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I'm running a CNN with the Keras Sequential API on Google Colab.

I'm getting the following error: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

When I remove the class_weight argument from the model.fit call, the error goes away and the network trains successfully. However, I really want to account for my unbalanced data.

I checked the shape of my class_weights vector and it looks fine (an nd.array, just like you would get from sklearn's compute_class_weight function).

I'm not sure which details are relevant, but I will gladly provide more information about versions and all that mess.

P.S.

A fact that might be important: my data is the FER2013 dataset and I'm using the FERplus labels. That means my samples are not associated with one unique class; rather, each sample has its own probability distribution over the classes. Bottom line, my labels are vectors of length len(class_names) whose elements add up to one.

Just to be super clear, an example: img1 label = [0,0,0,0,0.2,0,0.3,0,0,0.5]

Anyhow, I computed class_weights as an nd.array of size 10 with elements between 0 and 1, intended to balance down the more represented classes.

I wasn't sure whether that is relevant to the error, but I'm bringing it up just in case.

My code:

def create_model_plus():
  return tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(filters=32, kernel_size=5, strides=1,
                           input_shape=(48, 48, 1), padding='same',
                           use_bias=True, kernel_initializer='normal',
                           bias_initializer=tf.keras.initializers.Constant(0.1),
                           activation=tf.nn.relu),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D((2, 2), strides=2),
    tf.keras.layers.Conv2D(filters=64, kernel_size=5, strides=1,
                           padding='same', use_bias=True,
                           kernel_initializer='normal',
                           bias_initializer=tf.keras.initializers.Constant(0.1),
                           activation=tf.nn.relu),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D((2, 2), strides=1),
    tf.keras.layers.Conv2D(filters=128, kernel_size=5, strides=1,
                           padding='same', use_bias=True,
                           kernel_initializer='normal',
                           bias_initializer=tf.keras.initializers.Constant(0.1),
                           activation=tf.nn.relu),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D((2, 2), strides=1),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1008, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
  ])


history_df=[]
history_object=tf.keras.callbacks.History()
#save_best_object=tf.keras.callbacks.ModelCheckpoint('/Users/nimrodros', monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)

early_stop_object=tf.keras.callbacks.EarlyStopping(monitor='val_loss',min_delta=0.001, patience=4)
gony_adam=tf.keras.optimizers.Adam(
    lr=0.001
)
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.3,patience=3, min_lr=0.0001, verbose=1)

#log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
#tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

datagen = tf.keras.preprocessing.image.ImageDataGenerator(rotation_range=8, width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    validation_split=0.3
    )
datagen.fit(images.reshape(28709,48,48,1))
model = create_model_plus()
model.compile(optimizer=gony_adam,
        loss='categorical_crossentropy',
        metrics=['accuracy'])
history = model.fit(
    x=datagen.flow(images.reshape(28709, 48, 48, 1), FER_train_labels,
                   batch_size=32, subset='training'),
    validation_data=datagen.flow(images.reshape(28709, 48, 48, 1),
                                 FER_train_labels, batch_size=32,
                                 subset='validation'),
    steps_per_epoch=600,
    validation_steps=250,
    epochs=60,
    callbacks=[history_object, early_stop_object, reduce_lr],
    class_weight=cl_weigh)
history_df=pd.DataFrame(history.history)

Hope someone knows what to do! Thanks!

ether212 asked Apr 16 '20

People also ask

How do you fix "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"?

This error occurs when you try to take the truth value of a NumPy array with more than one element. To resolve it, use a.any() if you want the result to be True when at least one element is True, or a.all() if you want it to be True only when every element is True.

What is the truth value of an array?

An expression like x < 5 is not a single boolean value but an array of booleans, one per element, indicating which values are under 5. To reduce it to a single truth value, write (x < 5).any() or (x < 5).all(), depending on the intended meaning.
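The explanation above can be reproduced in a few lines. This is a minimal sketch (a synthetic array, not the asker's data) showing why comparing an array to a scalar cannot be used directly in an if statement, and how .any() and .all() resolve the ambiguity:

```python
import numpy as np

x = np.array([1, 3, 7])

# Comparing an array to a scalar yields an array of booleans,
# not a single True/False value:
mask = x < 5  # array([ True,  True, False])

# Using such an array where Python expects a single bool raises
# "ValueError: The truth value of an array ... is ambiguous":
try:
    if mask:
        pass
except ValueError as e:
    print(e)

# .any() and .all() reduce the array to one boolean explicitly:
print(mask.any())  # True  (at least one element is < 5)
print(mask.all())  # False (not every element is < 5)
```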


2 Answers

The problem is that the sklearn API returns a numpy array, but Keras requires a dictionary as the input for class_weight (see here). You can resolve the error like this:

import numpy as np
from sklearn.utils import class_weight

weight = class_weight.compute_class_weight('balanced', np.unique(y_train), y_train)
weight = {i: weight[i] for i in range(len(weight))}
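Putting the answer together as a runnable sketch (with a hypothetical, imbalanced y_train standing in for real labels; the keyword-argument form of compute_class_weight is used, since newer sklearn versions require it):

```python
import numpy as np
from sklearn.utils import class_weight

# Synthetic, imbalanced integer labels (a stand-in for y_train).
y_train = np.array([0] * 70 + [1] * 20 + [2] * 10)

classes = np.unique(y_train)
weights = class_weight.compute_class_weight(
    class_weight='balanced', classes=classes, y=y_train)

# Keras wants a dict mapping class index -> weight, not an ndarray.
cl_weight = {int(c): w for c, w in zip(classes, weights)}
print(cl_weight)  # rare classes get larger weights, e.g. {0: 0.476..., 1: 1.666..., 2: 3.333...}
```

The 'balanced' heuristic assigns each class the weight n_samples / (n_classes * count), so under-represented classes are weighted up rather than down.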
Paras Gulati answered Sep 23 '22


import numpy as np
from sklearn.utils import class_weight

class_weights = class_weight.compute_class_weight('balanced', np.unique(labels[i]), labels[i])
class_weights = {l: c for l, c in zip(np.unique(labels[i]), class_weights)}
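One wrinkle for the asker's FERplus-style labels: compute_class_weight expects a 1-D array of integer class labels, while the question's labels are probability vectors. A common workaround (an assumption here, not part of either answer) is to collapse each distribution to its most probable class with argmax before computing the weights:

```python
import numpy as np
from sklearn.utils import class_weight

# Hypothetical soft labels like the asker's FERplus vectors: each row
# is a probability distribution over 10 classes.
soft_labels = np.array([
    [0, 0, 0, 0, 0.2, 0, 0.3, 0, 0, 0.5],
    [0.9, 0.1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1.0],
])

# compute_class_weight expects 1-D integer labels, so take the
# most probable class for each sample first.
hard_labels = soft_labels.argmax(axis=1)  # array([9, 0, 9])

classes = np.unique(hard_labels)
weights = class_weight.compute_class_weight(
    class_weight='balanced', classes=classes, y=hard_labels)
cl_weight = {int(c): w for c, w in zip(classes, weights)}
```

Note that np.unique only returns classes that actually occur, so classes absent from the data get no entry in the dict; with Keras you may need to supply a weight for every output index.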
Sahar Millis answered Sep 25 '22