
Keras: model accuracy drops after reaching 99 percent accuracy and loss 0.01

I am using an adapted LeNet model in Keras for binary classification. I have about 250,000 training samples with a 60/40 class ratio. My model is training very well: in the first epoch the accuracy reaches 97 percent with a loss of 0.07, and after 10 epochs the accuracy is over 99 percent with a loss of 0.01. I am using a checkpointer to save my models when they improve.
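For reference, a minimal sketch of that checkpointing setup, assuming it uses the standard keras.callbacks.ModelCheckpoint (the filepath and monitored metric here are illustrative, not from the question):

from keras.callbacks import ModelCheckpoint

# Save the model only when the monitored quantity improves.
checkpointer = ModelCheckpoint(filepath='lenet_best.hdf5',
                               monitor='val_loss',
                               save_best_only=True,
                               verbose=1)

# Later passed to training, e.g. lenet_model.fit(..., callbacks=[checkpointer])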

Around the 11th epoch the accuracy suddenly drops to around 55 percent with a loss of around 6. How could this be possible? Is it because the model cannot become more accurate, so it tries to find better weights but completely fails to do so?

My model is an adaptation of the LeNet model:

from keras import models
from keras.layers import (Activation, BatchNormalization, Convolution2D,
                          Dense, Dropout, Flatten, MaxPooling2D)
from keras.optimizers import Adam

# filt_size, kern_size, maxpool_size, input_shape and n_classes are defined elsewhere.
lenet_model = models.Sequential()

# Block 1: conv -> ReLU -> batch norm -> max pooling
lenet_model.add(Convolution2D(filters=filt_size, kernel_size=(kern_size, kern_size),
                              padding='valid', input_shape=input_shape))
lenet_model.add(Activation('relu'))
lenet_model.add(BatchNormalization())
lenet_model.add(MaxPooling2D(pool_size=(maxpool_size, maxpool_size)))

# Block 2
lenet_model.add(Convolution2D(filters=64, kernel_size=(kern_size, kern_size), padding='valid'))
lenet_model.add(Activation('relu'))
lenet_model.add(BatchNormalization())
lenet_model.add(MaxPooling2D(pool_size=(maxpool_size, maxpool_size)))

# Block 3
lenet_model.add(Convolution2D(filters=128, kernel_size=(kern_size, kern_size), padding='valid'))
lenet_model.add(Activation('relu'))
lenet_model.add(BatchNormalization())
lenet_model.add(MaxPooling2D(pool_size=(maxpool_size, maxpool_size)))

# Classifier head: two dense layers, dropout, softmax over n_classes (= 2)
lenet_model.add(Flatten())
lenet_model.add(Dense(1024, kernel_initializer='uniform'))
lenet_model.add(Activation('relu'))
lenet_model.add(Dense(512, kernel_initializer='uniform'))
lenet_model.add(Activation('relu'))
lenet_model.add(Dropout(0.2))
lenet_model.add(Dense(n_classes, kernel_initializer='uniform'))
lenet_model.add(Activation('softmax'))

lenet_model.compile(loss='binary_crossentropy', optimizer=Adam(), metrics=['accuracy'])
asked Jun 12 '17 by Wilmar van Ommeren


1 Answer

The problem lies in applying a binary_crossentropy loss where categorical_crossentropy should be applied. Another approach is to keep the binary_crossentropy loss but change the output to dim=1 and the activation to sigmoid. The weird behaviour comes from the fact that with binary_crossentropy and a two-unit softmax output, Keras is actually solving a multi-class binary classification (treating each of the two outputs as an independent binary problem), whereas your task is a single binary classification.
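As a minimal sketch of the two fixes (only the final Dense layer and the compile call change; everything else in the question's model stays the same):

# Option 1: keep the two-unit softmax output but switch the loss.
# Labels must then be one-hot encoded, e.g. with keras.utils.to_categorical.
lenet_model.add(Dense(2, kernel_initializer='uniform'))
lenet_model.add(Activation('softmax'))
lenet_model.compile(loss='categorical_crossentropy', optimizer=Adam(),
                    metrics=['accuracy'])

# Option 2: a single sigmoid unit with binary_crossentropy.
# Labels stay as a flat vector of 0s and 1s.
lenet_model.add(Dense(1, kernel_initializer='uniform'))
lenet_model.add(Activation('sigmoid'))
lenet_model.compile(loss='binary_crossentropy', optimizer=Adam(),
                    metrics=['accuracy'])

Either variant makes the loss consistent with the output layer, which is what removes the unstable behaviour.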

answered Sep 29 '22 by Marcin Możejko