I'm working on a project with a resnet50 based dual output model. One output is for the regression task and the second output is for a classification task.
My main question is about the model evaluation. During the training, I achieve pretty good results on both ouputs on the validation set:
- Combined loss = 0.32507268732786176
- Val accuracy = 0.97375
- Val MSE: 4.1454763
The model.evaluate gives me the following results on the same set:
- Combined loss = 0.33064378452301024
- Val accuracy = 0.976
- Val MSE = 1.2375486
The model.predict gives me totally differents result (I use scikit-learn to compute the metrics):
- Val accuracy = 0.45875
- Val MSE: 43.555958365743805
These last values changes at each predict execution.
I work on TF2.0. Here is my code:
valid_generator=datagen.flow_from_dataframe(dataframe=df,
directory=PATH,
x_col="X",
y_col=["yReg","yCls"],
class_mode="multi_output",
target_size=(IMG_SIZE, IMG_SIZE),
batch_size=batch_size,
subset="validation",
shuffle=False,
workers = 0)
def generate_data_generator(generator, train=True):
while True:
Xi, yi = train_generator.next()
y2 = []
for e in yi[1]:
y2.append(to_categorical(e, 7))
y2 = np.array(y2)
if train: # Augmentation for training only
Xi = Xi.astype('uint8')
Xi_aug = seq(images=Xi) # imgaug lib needs uint8
Xi_aug = Xi_aug.astype('float32')
Xi_aug = preprocess_input(Xi_aug) # resnet50 preprocessing
yield Xi_aug, [yi[0], y2]
else: # Validation
yield preprocess_input(Xi), [yi[0], y2]
model.fit_generator(generator=generate_data_generator(train_generator, True),
steps_per_epoch=STEP_SIZE_TRAIN,
validation_data=generate_data_generator(valid_generator, False),
validation_steps=STEP_SIZE_VALID,
verbose=1,
epochs=50,
callbacks=[checkpoint, tfBoard],
)
evalu = model.evaluate_generator(generate_data_generator(valid_generator, False), steps=STEP_SIZE_VALID)
print(model.metrics_names)
print(evalu)
preds = model.predict_generator(generate_data_generator(valid_generator, False), steps=STEP_SIZE_VALID, workers = 0)
labels = valid_generator.labels
print("MSE error:", me.mean_squared_error(labels[0], preds[0]))
print("Accuracy:", me.accuracy_score(labels[1], preds[1].argmax(axis=1)))
What am I doing wrong ?
Thanks for the help !
The keras. evaluate() function will give you the loss value for every batch. The keras. predict() function will give you the actual predictions for all samples in a batch, for all batches.
predict() is for the actual prediction. It generates output predictions for the input samples. fit() is for training a model. It produces metrics for the training set, where as evaluate() is for a testing the trained model on the test set.
Model. predict passes the input vector through the model and returns the output tensor for each datapoint. Since the last layer in your model is a single Dense neuron, the output for any datapoint is a single value. And since you didn't specify an activation for the last layer, it will default to linear activation.
The model. evaluate() return scalar test loss if the model has a single output and no metrics or list of scalars if the model has multiple outputs and multiple metrics. The attribute model. metrics_names will give you the display labels for the scalar outputs and metrics names.
You are calculating accuracy using just one data point labels[1], preds[1]
instead of all data points. You need to compute accuracy considering all data points to compare the result with model.evaluate_generator
. Also you have computed MSE
on labels[0], preds[0]
data points but accuracy on labels[1], preds[1]
data points. Consider all the data points in both the cases.
Below is an example of binary classification where I am not doing any data augmentation for validation data. You can build the validation generator without Augmentation and set shuffle=False
to generate same batch of data every time, thus you will get same result for model.evaluate_generator
and model.predict_generator
.
Validation Generator -
validation_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our validation data
val_data_gen = validation_image_generator.flow_from_directory(batch_size=batch_size,
directory=validation_dir,
shuffle=False,
seed=10,
target_size=(IMG_HEIGHT, IMG_WIDTH),
class_mode='binary')
Below are the results for accuracy which all match -
model.fit_generator
history = model.fit_generator(
train_data_gen,
steps_per_epoch=total_train // batch_size,
epochs=5,
validation_data=val_data_gen,
validation_steps=total_val // batch_size)
Output -
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
Epoch 1/5
20/20 [==============================] - 27s 1s/step - loss: 0.8691 - accuracy: 0.4995 - val_loss: 0.6850 - val_accuracy: 0.5000
Epoch 2/5
20/20 [==============================] - 26s 1s/step - loss: 0.6909 - accuracy: 0.5145 - val_loss: 0.6880 - val_accuracy: 0.5000
Epoch 3/5
20/20 [==============================] - 26s 1s/step - loss: 0.6682 - accuracy: 0.5345 - val_loss: 0.6446 - val_accuracy: 0.6320
Epoch 4/5
20/20 [==============================] - 26s 1s/step - loss: 0.6245 - accuracy: 0.6180 - val_loss: 0.6214 - val_accuracy: 0.5920
Epoch 5/5
20/20 [==============================] - 27s 1s/step - loss: 0.5696 - accuracy: 0.6795 - val_loss: 0.6468 - val_accuracy: 0.6270
model.evaluate_generator
evalu = model.evaluate_generator(val_data_gen)
print(model.metrics_names)
print(evalu)
Output -
['loss', 'accuracy']
[0.646793782711029, 0.6269999742507935]
model.predict_generator
from sklearn.metrics import mean_squared_error, accuracy_score
preds = model.predict_generator(val_data_gen)
y_pred = tf.where(preds<=0.5,0,1)
labels = val_data_gen.labels
y_true = labels
# confusion_matrix(y_true, y_pred)
print("Accuracy:", accuracy_score(y_true, y_pred))
Output -
Accuracy: 0.627
Complete Code for your reference -
%tensorflow_version 2.x
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Flatten, Dropout, MaxPooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
import os
import numpy as np
import matplotlib.pyplot as plt
_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')
train_dir = os.path.join(PATH, 'train')
validation_dir = os.path.join(PATH, 'validation')
train_cats_dir = os.path.join(train_dir, 'cats') # directory with our training cat pictures
train_dogs_dir = os.path.join(train_dir, 'dogs') # directory with our training dog pictures
validation_cats_dir = os.path.join(validation_dir, 'cats') # directory with our validation cat pictures
validation_dogs_dir = os.path.join(validation_dir, 'dogs') # directory with our validation dog pictures
num_cats_tr = len(os.listdir(train_cats_dir))
num_dogs_tr = len(os.listdir(train_dogs_dir))
num_cats_val = len(os.listdir(validation_cats_dir))
num_dogs_val = len(os.listdir(validation_dogs_dir))
total_train = num_cats_tr + num_dogs_tr
total_val = num_cats_val + num_dogs_val
batch_size = 100
epochs = 5
IMG_HEIGHT = 150
IMG_WIDTH = 150
train_image_generator = ImageDataGenerator(rescale=1./255,brightness_range=[0.5,1.5]) # Generator for our training data
validation_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our validation data
train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH),
class_mode='binary')
val_data_gen = validation_image_generator.flow_from_directory(batch_size=batch_size,
directory=validation_dir,
shuffle=False,
seed=10,
target_size=(IMG_HEIGHT, IMG_WIDTH),
class_mode='binary')
model = Sequential([
Conv2D(16, 3, padding='same', activation='relu', input_shape=(IMG_HEIGHT, IMG_WIDTH ,3)),
MaxPooling2D(),
Conv2D(32, 3, padding='same', activation='relu'),
MaxPooling2D(),
Conv2D(64, 3, padding='same', activation='relu'),
MaxPooling2D(),
Flatten(),
Dense(512, activation='relu'),
Dense(1)
])
model.compile(optimizer="adam",
loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
metrics=['accuracy'])
history = model.fit_generator(
train_data_gen,
steps_per_epoch=total_train // batch_size,
epochs=epochs,
validation_data=val_data_gen,
validation_steps=total_val // batch_size)
evalu = model.evaluate_generator(val_data_gen, steps=total_val // batch_size)
print(model.metrics_names)
print(evalu)
from sklearn.metrics import mean_squared_error, accuracy_score
#val_data_gen.reset()
preds = model.predict_generator(val_data_gen, steps=total_val // batch_size)
y_pred = tf.where(preds<=0.5,0,1)
labels = val_data_gen.labels
y_true = labels
test_labels = []
for i in range(0,10):
test_labels.extend(np.array(val_data_gen[i][1]))
# confusion_matrix(y_true, y_pred)
print("Accuracy:", accuracy_score(test_labels, y_pred))
Also keep in mind, fit_generator
, evaluate_generator
and predict_generator
FUNCTION IS DEPRECATED. It will be removed in a future version. Instructions for updating: Please use Model.fit, Model.evaluate and Model.predict respectively, which supports generators.
Hope this answers your question. Happy Learning.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With