What is the right way to preprocess the data in Keras while fine-tuning the pre-trained models in keras.applications for our own data?
Keras provides the following preprocess_input functions:
keras.applications.imagenet_utils.preprocess_input
keras.applications.inception_v3.preprocess_input
keras.applications.xception.preprocess_input
keras.applications.inception_resnet_v2.preprocess_input
Looking inside, it seems that for inception_v3, xception, and inception_resnet_v2, the function calls keras.applications.imagenet_utils.preprocess_input with mode='tf', while for other models it sets mode='caffe'. Each mode performs a different transformation.
In François Chollet's blog post about transfer learning (https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html), the input is normalized to [0, 1] by dividing by 255. Shouldn't the preprocess_input functions in Keras be used instead?
Also, it is not clear whether the input images should be in RGB or BGR. Is there any consistency regarding this, or is it specific to the pre-trained model being used?
Always use the preprocess_input function in the corresponding model-level module. That is, use keras.applications.inception_v3.preprocess_input for InceptionV3 and keras.applications.resnet50.preprocess_input for ResNet50.
The mode argument specifies the preprocessing method that was used when the original model was trained. mode='tf' means that the pre-trained weights were converted from TensorFlow, where the authors trained the model on inputs in the [-1, 1] range. The same reasoning applies to mode='caffe' and mode='torch': each matches the preprocessing of the framework the weights came from.
The input to applications.*.preprocess_input is always RGB. If a model expects BGR input, the channels will be permuted inside preprocess_input.
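To make the three modes and the channel permutation concrete, here is a per-pixel sketch in plain Python. It reflects my reading of what keras.applications.imagenet_utils.preprocess_input does, not the library code itself; the function name preprocess_pixel is hypothetical, and the constants are the usual ImageNet mean and standard deviation values.

```python
# ImageNet statistics (assumed to match the values Keras uses internally).
IMAGENET_MEAN_BGR = (103.939, 116.779, 123.68)   # 0-255 scale, BGR order
IMAGENET_MEAN_RGB = (0.485, 0.456, 0.406)        # 0-1 scale, RGB order
IMAGENET_STD_RGB = (0.229, 0.224, 0.225)         # 0-1 scale, RGB order


def preprocess_pixel(rgb, mode="caffe"):
    """Transform one RGB pixel (values in 0-255) the way each mode would."""
    r, g, b = (float(v) for v in rgb)
    if mode == "tf":
        # Scale to [-1, 1]; channel order stays RGB.
        return tuple(v / 127.5 - 1.0 for v in (r, g, b))
    if mode == "torch":
        # Scale to [0, 1], then normalize per channel; order stays RGB.
        return tuple(
            (v / 255.0 - m) / s
            for v, m, s in zip((r, g, b), IMAGENET_MEAN_RGB, IMAGENET_STD_RGB)
        )
    if mode == "caffe":
        # Permute RGB -> BGR, then subtract the per-channel mean (no scaling).
        return tuple(v - m for v, m in zip((b, g, r), IMAGENET_MEAN_BGR))
    raise ValueError(f"unknown mode: {mode!r}")


pure_red = (255, 0, 0)
for mode in ("tf", "torch", "caffe"):
    print(mode, preprocess_pixel(pure_red, mode))
```

Note that the caller always supplies RGB; only the 'caffe' branch reorders the channels, which mirrors the point above that the permutation happens inside preprocess_input.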
The blog post you mentioned was published before the keras.applications module was introduced, so I wouldn't recommend using it as a reference for transfer learning with keras.applications. It may be better to try the examples in the docs instead.