I am trying to use Keras for binary classification of an image.
My CNN model is well trained on the training data (giving ~90% training accuracy and ~93% validation accuracy). But during prediction, if I set the batch size to 15000 I get Figure I as the output, and if I set the batch size to 50000 I get Figure II as the output. Can someone please tell me what is wrong? The prediction should not depend on the batch size, right?
Code I am using for prediction:
y = model.predict_classes(patches, batch_size=50000, verbose=1)
y = y.reshape((256, 256))
My model:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD

model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode='same',
                        input_shape=(img_channels, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

# let's train the model using SGD + momentum (how original).
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
During training, the batch size limits how many samples are shown to the network before a weight update is performed. The same notion carries over to prediction: when you call predict on the fitted model, the batch size controls how many samples are pushed through the network at a time.
The default prediction batch size is 32, which can make prediction slow on a large input. You can specify any batch size you like, even one as large as 10,000, e.g. model.predict(X, batch_size=10000). Just remember that the larger the batch size, the more data has to be held in RAM at once, so test what works on your hardware.
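As a quick sanity check (a minimal sketch, using the model and patches from the question), you can compare the probabilities returned for two different batch sizes; for a deterministic, batch-independent forward pass they should match up to floating-point noise:

import numpy as np

probs_small = model.predict(patches, batch_size=15000, verbose=1)
probs_large = model.predict(patches, batch_size=50000, verbose=1)

# If the forward pass does not depend on the batch size, this prints True.
print(np.allclose(probs_small, probs_large, atol=1e-5))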
Our experiments showed that a higher batch size does not usually achieve higher accuracy, and that the learning rate and the optimizer used have a significant impact as well. Lowering the learning rate and decreasing the batch size allow the network to train better, especially when fine-tuning.
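For instance (a sketch only; the learning rate, batch size, epoch count, and the X_train/Y_train/X_val/Y_val names are illustrative, not taken from the question), fine-tuning with a lower learning rate and a smaller batch might look like:

from keras.optimizers import SGD

# Recompile with a lower learning rate for fine-tuning.
sgd_finetune = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd_finetune,
              metrics=['accuracy'])

# Continue training with a smaller batch size.
model.fit(X_train, Y_train,
          batch_size=32,
          nb_epoch=10,
          validation_data=(X_val, Y_val))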
Batch prediction is useful when you want to generate predictions for a set of observations all at once, and then take action on a certain percentage or number of the observations. Typically, you do not have a low latency requirement for such an application.
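For example (a sketch; the 10% cutoff and the assumption that the second softmax column is the positive class are mine, not from the question), you might score every patch in one pass and then act only on the most confident fraction:

import numpy as np

# Score all observations in a single batch-prediction pass.
probs = model.predict(patches, batch_size=50000, verbose=1)[:, 1]

# Keep the indices of the top 10% highest-scoring observations.
cutoff = np.percentile(probs, 90)
selected = np.where(probs >= cutoff)[0]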
Keras is standardizing the input automatically inside the predict function. The statistics needed for standardization are computed on a batch, which is why your outputs might depend on the batch size. You may solve this in a couple of ways; the code for the second solution is given in the update below.
UPDATE: here is the code for the 2nd solution:
import theano

input = model.layers[0].input    # gets the model's input Theano tensor
output = model.layers[-1].output # gets the model's output Theano tensor
model_theano = theano.function([input], output)  # compiles a Theano function (note the list around the inputs)

# Now model_theano is a function which behaves exactly like your classifier
predicted_score = model_theano(example)  # returns the predicted score for an example argument
Now if you want to use this new model_theano, you should standardize the main dataset on your own (e.g. by subtracting the mean and dividing by the standard deviation of every pixel in your images) and apply model_theano to obtain scores for the whole dataset (you could do this in a loop iterating over examples, or with the numpy.apply_along_axis or numpy.apply_over_axes functions). A sketch of this is given below.
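A minimal sketch of that manual pipeline (the standardization statistics and the per-example loop are my assumptions; patches is the array of image patches from the question):

import numpy as np

# Standardize every pixel using statistics computed over the whole dataset.
mean = patches.mean(axis=0)
std = patches.std(axis=0) + 1e-8   # small epsilon to avoid division by zero
patches_std = (patches - mean) / std

# Apply the compiled Theano function one example at a time.
scores = np.array([model_theano(p[np.newaxis, ...])[0] for p in patches_std])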
UPDATE 2: in order to make this solution work, change
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
to:
model.add(Dense(nb_classes, activation="softmax"))