General Explanation: My codes work fine, but the results are wired. I don't know the problem is with
I am struggling with this error several weeks and so far I have changed the loss function, optimizer, data generator, etc., but I could not solve it. I appreciate any help. If the following information is not enough, let me know, please.
Field of study: I am using tensorflow, keras for multiclass classification. The dataset has 36 binary human attributes. I have used resnet50, then for each part of the body (head, upper body, lower body, shoes, accessories), I have added a separated branch to the network. The network has 1 input image with 36 labels and 36 output nodes (36 denes layers with sigmoid activation).
Problem: The problem is that the accuracy that keras is reporting is high, but f1-score is very low or zero for most of the outputs (even when I use f1-score as a metric when compiling the network, the f1-socre for validation is very bad).
aAfter train, when I use the network in prediction mode, it returns always one/zero for some classes. It means that the network is not able to learn (even when I use weighted loss function or focal loss function.)
Why it is weird? Because, state-of-the-art methods report heigh f1 score even after the first epoch (e.g. https://github.com/chufengt/iccv19_attribute, that I have run it in my PC and got good results after one epoch).
Parts of the Codes:
print("setup model ...")
input_image = KL.Input(args.img_input_shape, name= "input_1")
C1, C2, C3, C4, C5 = resnet_graph(input_image, architecture="resnet50", stage5=False, train_bn=True)
output_layers = merged_model (input_features=C4)
model = Model(inputs=input_image, outputs=output_layers, name='SoftBiometrics_Model')
...
print("model compiling ...")
OPTIM = optimizers.Adadelta(lr=args.learning_rate, rho=0.95)
model.compile(optimizer=OPTIM, loss=binary_focal_loss(alpha=.25, gamma=2), metrics=['acc',get_f1])
plot_model(model, to_file='model.png')
...
img_datagen = ImageDataGenerator(rotation_range=6, width_shift_range=0.03, height_shift_range=0.03, brightness_range=[0.85,1.15], shear_range=0.06, zoom_range=0.09, horizontal_flip=True, preprocessing_function=preprocess_input_resnet, rescale=1/255.)
img_datagen_test = ImageDataGenerator(preprocessing_function=preprocess_input_resnet, rescale=1/255.)
def multiple_outputs(generator, dataframe, batch_size, x_col):
Gen = generator.flow_from_dataframe(dataframe=dataframe,
directory=None,
x_col = x_col,
y_col = args.Categories,
target_size = (args.img_input_shape[0],args.img_input_shape[1]),
class_mode = "multi_output",
classes=None,
batch_size = batch_size,
shuffle = True)
while True:
gnext = Gen.next()
# return image batch and 36 sets of lables
labels = gnext[1]
output_dict = {"{}_output".format(Category): np.array(labels[index]) for index, Category in enumerate(args.Categories)}
yield {'input_1':gnext[0]}, output_dict
trainGen = multiple_outputs (generator = img_datagen, dataframe=Train_df_img, batch_size=args.BATCH_SIZE, x_col="Train_Filenames")
testGen = multiple_outputs (generator = img_datagen_test, dataframe=Test_df_img, batch_size=args.BATCH_SIZE, x_col="Test_Filenames")
STEP_SIZE_TRAIN = len(Train_df_img["Train_Filenames"]) // args.BATCH_SIZE
STEP_SIZE_VALID = len(Test_df_img["Test_Filenames"]) // args.BATCH_SIZE
...
print("Fitting the model to the data ...")
history = model.fit_generator(generator=trainGen,
epochs=args.Number_of_epochs,
steps_per_epoch=STEP_SIZE_TRAIN,
validation_data=testGen,
validation_steps=STEP_SIZE_VALID,
callbacks= [chekpont],
verbose=1)
In the Python sci-kit learn library, we can use the F-1 score function to calculate the per class scores of a multi-class classification problem.
Generally, values over 0.7 are considered good scores. BTW, the above formula was for the binary classifiers. For multiclass, Sklearn gives an even more monstrous formula: Image by Sklearn.
Precision for Multi-Class Classification In an imbalanced classification problem with more than two classes, precision is calculated as the sum of true positives across all classes divided by the sum of true positives and false positives across all classes.
There is a possibility that you are passing binary f1-score to compile
function. This should fix the problem -
pip install tensorflow-addons
...
import tensorflow_addons as tfa
f1 = tfa.metrics.F1Score(36,'micro' or 'macro')
model.compile(...,metrics=[f1])
You can read more about how f1-micro and f1-macro is calculated and which can be useful here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With