I am seeing a very strange situation. After training a convolutional network I get about 95% accuracy on the validation data. I save the model. Later I restore the model and run validation on the same validation data set. This time I barely get 10% accuracy. I have read the documentation but nothing seems to help. Is there something I am doing wrong?
def build_model_mnist(image_width, image_height, image_depth):
    model = keras.Sequential()
    model.add(keras.layers.Conv2D(5, (3, 3), activation='relu',
                                  input_shape=(image_width, image_height, image_depth)))
    model.add(keras.layers.MaxPooling2D((2, 2)))
    model.add(keras.layers.Conv2D(10, (3, 3), activation='relu'))
    model.add(keras.layers.MaxPooling2D((2, 2)))
    model.add(keras.layers.Conv2D(10, (3, 3), activation='relu'))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(64, activation='relu'))
    model.add(keras.layers.Dense(10, activation='softmax'))
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
def train_mnist():
    model = build_model_mnist(image_width=train_images.shape[1],
                              image_height=train_images.shape[2],
                              image_depth=train_images.shape[3])
    # Start training
    h = model.fit(train_images, train_labels, batch_size=500, epochs=5)
    model.save("minist")
    # Evaluate the model
    test_loss, test_acc = model.evaluate(test_images, test_labels)
    print("Accuracy:", test_acc)

train_mnist()
The code above reports about 95% accuracy, but the code below reports only about 10%.
def evaluate_mnist():
    # Load the model
    model = keras.models.load_model("minist")
    # Evaluate the model
    test_loss, test_acc = model.evaluate(test_images, test_labels)
    print("Accuracy:", test_acc)

evaluate_mnist()
If I save and restore just the weights, everything works fine. In the code below, I save only the weights; later I recreate the model architecture in code and restore the weights. This approach produces the expected accuracy.
def train_mnist():
    # Create the network model
    model = build_model_mnist(image_width=train_images.shape[1],
                              image_height=train_images.shape[2],
                              image_depth=train_images.shape[3])
    # Start training
    h = model.fit(train_images, train_labels, batch_size=500, epochs=5)
    # Evaluate the model
    test_loss, test_acc = model.evaluate(test_images, test_labels)
    print("Accuracy:", test_acc)
    model.save_weights("minist-weights")

train_mnist()
def evaluate_mnist():
    # Re-create the model architecture
    model = build_model_mnist(image_width=train_images.shape[1],
                              image_height=train_images.shape[2],
                              image_depth=train_images.shape[3])
    model.load_weights("minist-weights")
    # Evaluate the model
    test_loss, test_acc = model.evaluate(test_images, test_labels)
    print("Accuracy:", test_acc)

evaluate_mnist()
I had a similar problem in tf 2.3.0.
This issue stems from the generic "accuracy" metric when it is combined with sparse_categorical_crossentropy: on reloading, the model associates the wrong accuracy metric. The solution is to tell Keras explicitly which metric to use instead of letting it infer one (the inference is buggy), i.e. compile with metrics=['sparse_categorical_accuracy'].
I was initially using metrics=['accuracy'] during training, and discovered that only recompiling the model after reloading it restored the expected performance.
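The recompile-after-reload workaround can be sketched as follows. This is a minimal, hypothetical example: the tiny one-layer model, the random data, and the "tmp_model.keras" path are all illustrative stand-ins, not from the original post.

```python
import numpy as np
from tensorflow import keras

# Tiny stand-in model (illustrative only)
model = keras.Sequential([
    keras.layers.Input(shape=(4,)),
    keras.layers.Dense(10, activation='softmax'),
])
# Name the metric explicitly instead of the generic 'accuracy'
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])

# Random stand-in data: 32 samples, 4 features, 10 classes
x = np.random.rand(32, 4).astype('float32')
y = np.random.randint(0, 10, size=(32,))
model.fit(x, y, epochs=1, verbose=0)
model.save("tmp_model.keras")

# Reload, then recompile with the explicit metric before evaluating
restored = keras.models.load_model("tmp_model.keras")
restored.compile(optimizer='adam',
                 loss='sparse_categorical_crossentropy',
                 metrics=['sparse_categorical_accuracy'])
loss, acc = restored.evaluate(x, y, verbose=0)
```

With the explicit sparse_categorical_accuracy metric, the reloaded model evaluates consistently with the original instead of silently picking up the wrong accuracy function.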