
Prediction is depending on the batch size in Keras

I am trying to use keras for binary classification of an image.

My CNN model is well trained on the training data (~90% training accuracy and ~93% validation accuracy). But at prediction time, if I set batch size=15000 I get the output shown in Figure I, and if I set batch size=50000 I get Figure II. Can someone please tell me what is wrong? The prediction should not depend on the batch size, right?

Code I am using for prediction :

y = model.predict_classes(patches, batch_size=50000, verbose=1)
y = y.reshape((256, 256))

[Figure I and Figure II: predicted 256×256 output maps obtained with batch size 15000 and 50000, respectively]

My model:-

model = Sequential()

model.add(Convolution2D(32, 3, 3, border_mode='same',
                        input_shape=(img_channels, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

# let's train the model using SGD + momentum (how original).
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
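
For reference, a training call consistent with this Keras 1.x model could look like the sketch below; X_train, Y_train and the hyperparameter values are assumptions, not taken from the question:

# Hypothetical training call (Keras 1.x API): X_train has shape
# (n_samples, img_channels, img_rows, img_cols) and Y_train is one-hot
# encoded with nb_classes columns.
model.fit(X_train, Y_train,
          batch_size=128,         # training batch size (assumed)
          nb_epoch=20,            # Keras 1.x name for the epochs argument
          validation_split=0.1,   # hold out 10% of the data for validation
          shuffle=True)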
Asked May 25 '16 by Avijit Dasgupta

People also ask

Does batch size matter for prediction?

The batch size limits the number of samples to be shown to the network before a weight update can be performed. This same limitation is then imposed when making predictions with the fit model. Specifically, the batch size used when fitting your model controls how many predictions you must make at a time.

What is batch size in predict keras?

The default batch size is 32, which can make prediction slow. You can specify any batch size you like; in fact it could be as high as 10,000, e.g. model.predict(X, batch_size=10000). Just remember that the larger the batch size, the more data has to be stored in RAM at once, so try and test what works for your hardware.
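
As a small illustration (model and X here are placeholders, not names from the question), batch_size only controls how many samples are pushed through the network per forward pass:

scores_default = model.predict(X)                    # default batch_size=32
scores_large = model.predict(X, batch_size=10000)    # one large batch; needs more RAM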

Does batch size affect accuracy?

Our results concluded that a higher batch size does not usually achieve high accuracy, and the learning rate and the optimizer used will have a significant impact as well. Lowering the learning rate and decreasing the batch size will allow the network to train better, especially in the case of fine-tuning.

What is predict on batch?

Batch prediction is useful when you want to generate predictions for a set of observations all at once, and then take action on a certain percentage or number of the observations. Typically, you do not have a low latency requirement for such an application.
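
Keras also exposes predict_on_batch, which runs the network exactly once on whatever samples you pass it; a minimal sketch using the patches array from the question (the slice size is arbitrary):

# predict_on_batch does not split its input into smaller batches internally;
# the whole array is processed as a single batch.
batch_scores = model.predict_on_batch(patches[:1000])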


1 Answer

Keras standardizes the input automatically in the predict function. The statistics needed for standardization are computed on a batch - that's why your outputs might depend on the batch size. You may solve this by:

  1. If Keras > 1.0, you could simply define your model in the functional API and apply the trained function to data you have standardized yourself.
  2. If your model is already trained, you could recover it as a Theano function and likewise apply it to self-standardized data.
  3. If your data is not very big, you could also simply set your batch size to the number of examples in your dataset (see the sketch after this list).
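
A minimal sketch of the 3rd option, reusing the prediction call from the question (this assumes the whole patches array fits in memory as a single batch):

# Use the full dataset as one batch so that any per-batch processing
# sees every example at once.
y = model.predict_classes(patches, batch_size=patches.shape[0], verbose=1)
y = y.reshape((256, 256))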

UPDATE: here is the code for the 2nd solution:

import theano

input = model.layers[0].input    # input Theano tensor of the network
output = model.layers[-1].output # output Theano tensor (the class scores)
model_theano = theano.function([input], output)  # inputs must be passed as a list

# Now model_theano is a function which behaves exactly like your classifier

predicted_score = model_theano(example)  # returns the predicted score for an example argument

Now if you want to use this new model_theano, you should standardize the main dataset on your own (e.g. by subtracting the mean and dividing by the standard deviation for every pixel in your image) and then apply model_theano to obtain scores for the whole dataset (you could do this in a loop iterating over examples, or with numpy.apply_along_axis or numpy.apply_over_axes).
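
A minimal sketch of that manual standardization plus a per-example scoring loop; the per-pixel mean/std statistics and the loop are assumptions about how one might do it, not code from the original answer:

import numpy as np

# Standardize every pixel with statistics computed over the whole dataset.
mean = patches.mean(axis=0)
std = patches.std(axis=0) + 1e-7            # epsilon avoids division by zero
patches_std = (patches - mean) / std

# Score each standardized example with the compiled Theano function (batch of 1).
scores = np.array([model_theano(example[np.newaxis, ...])
                   for example in patches_std])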

UPDATE 2: in order to make this solution work, change

model.add(Dense(nb_classes))
model.add(Activation('softmax'))

to:

model.add(Dense(nb_classes, activation = "softmax"))
Answered Sep 18 '22 by Marcin Możejko