
How to do transfer learning for MNIST dataset?

I have been trying to use transfer learning for the MNIST dataset using VGG/Inception. But both of these networks accept images of at least 224x224x3 in size. How can I rescale the 28x28x1 MNIST images to 224x224x3 to do transfer learning?

asked Dec 17 '17 by user1159517

People also ask

How long does it take to train on MNIST?

A quick refresher on what the MNIST dataset comprises: it's a dataset of labelled handwritten-digit images in a 28 x 28 format. Long story short, I was able to build a classifier that detected the test set with an accuracy of ~98%. It took about twenty minutes to train on my MacBook.

What can I learn about MNIST dataset?

The MNIST dataset is an acronym that stands for the Modified National Institute of Standards and Technology dataset. It is a dataset of 60,000 small square 28×28 pixel grayscale images of handwritten single digits between 0 and 9.

What is transfer learning in AlexNet?

Transfer learning is commonly used in deep learning applications. You can take a pretrained network and use it as a starting point to learn a new task. Fine-tuning a network with transfer learning is usually much faster and easier than training a network with randomly initialized weights from scratch.


2 Answers

A common way to do what you're asking is to simply resize the images to the resolution required by the CNN's input layer. Since you've tagged your question with keras: Keras has a preprocessing module that lets you load images and optionally specify the target size to scale them to. If you look at the actual source of the method (https://github.com/keras-team/keras/blob/master/keras/preprocessing/image.py#L321), it internally uses Pillow's interpolation methods to rescale the image to the desired resolution.

In addition, because the MNIST digits are originally grayscale, you will need to replicate the single channel into three channels so that the image artificially becomes RGB. This means the red, green and blue channels are all identical copies of the original grayscale image. The load_img method has an additional flag called grayscale, and you can set it to False to load the image as RGB.
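Here is a minimal sketch of both steps (upsampling and channel replication) working directly on the MNIST arrays from keras.datasets, using NumPy and Pillow. The function name and the subset size are my own illustration, not from the answer:

```python
import numpy as np
from PIL import Image
from keras.datasets import mnist

# Load MNIST as 28x28 grayscale uint8 arrays
(x_train, y_train), (x_test, y_test) = mnist.load_data()

def to_vgg_input(img_28x28):
    """Upscale a single 28x28 grayscale digit to 224x224x3.

    Bilinear interpolation avoids the blocky artifacts that
    nearest-neighbour upsampling would introduce.
    """
    img = Image.fromarray(img_28x28)                 # 28x28, mode 'L'
    img = img.resize((224, 224), Image.BILINEAR)     # upsample
    arr = np.asarray(img, dtype=np.float32)          # 224x224
    return np.stack([arr, arr, arr], axis=-1)        # replicate to 224x224x3

# Only a subset here: 60,000 images at 224x224x3 float32 would not fit in memory
x_train_rgb = np.stack([to_vgg_input(im) for im in x_train[:1000]])
print(x_train_rgb.shape)  # (1000, 224, 224, 3)
```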

Once you have these images converted to RGB and rescaled, you can go ahead and perform transfer learning with VGG19. In fact, it has been done before. Consult this link: https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model/ and look at Section 6: Use the pre-trained model for identifying digits.
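As a rough sketch of what that transfer-learning setup looks like in Keras (the size of the classification head is my own choice, not prescribed by the blog post):

```python
from keras.applications import VGG19
from keras.layers import Dense, Flatten
from keras.models import Model

# Pretrained convolutional base, without the ImageNet classifier head
base = VGG19(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze the pretrained weights so only the new head is trained
for layer in base.layers:
    layer.trainable = False

# New classification head for the 10 MNIST digits
x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
out = Dense(10, activation='softmax')(x)

model = Model(inputs=base.input, outputs=out)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# model.fit(x_train_rgb, y_train[:1000], epochs=3, batch_size=32)
```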

I'd like to give you fair warning that taking a 28 x 28 image and resizing it to 224 x 224 will produce severe interpolation artifacts. You would be performing transfer learning on image data that contains noise from the upsampling, but that is what was done in the blog post linked above. I would recommend changing the interpolation to something like bilinear or bicubic; the default is nearest neighbour, which is terrible for upsampling images.
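If you go through Keras's own loader, recent versions of load_img expose an interpolation argument that overrides that default (assuming your Keras version has it). A small sketch, where 'digit.png' is a hypothetical file path:

```python
from keras.preprocessing.image import load_img, img_to_array

# target_size rescales on load; interpolation replaces the default
# nearest-neighbour resampling with bilinear
img = load_img('digit.png', grayscale=False,
               target_size=(224, 224), interpolation='bilinear')
arr = img_to_array(img)   # shape (224, 224, 3)
```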

YMMV, so try resizing the images to the input layer's expected size, replicate them across three channels to make them RGB, and see what happens.

answered Sep 21 '22 by rayryeng

This greatly depends on the model you wish to use. In the case of VGGNet, you have to rescale the input to the expected target size, because the VGG network contains fully connected layers whose shape matches the feature-map dimensions after a fixed number of downsampling steps. Note that convolutional layers can take any image size thanks to parameter sharing.
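To make the constraint concrete: VGG halves the spatial resolution five times, so a 224x224 input leaves a 7x7x512 feature map, and the first fully connected layer is wired for exactly 7*7*512 = 25088 inputs; any other input size breaks that wiring. A quick shape check with Keras (a sketch, weights omitted so nothing is downloaded):

```python
from keras.applications import VGG19

model = VGG19(weights=None, include_top=True)        # classifier head included
print(model.get_layer('block5_pool').output_shape)   # (None, 7, 7, 512)
print(model.get_layer('flatten').output_shape)       # (None, 25088) -> what fc1 expects
```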

However, modern CNNs follow the trend of going fully convolutional, which removes this constraint and allows transfer learning on arbitrary input sizes. If you choose this path, take one of the latest Inception models; in that case an out-of-the-box model should be able to accept even small 28x28x1 images.
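A minimal sketch (my own illustration, not from the answer) of why a fully convolutional network is size-agnostic: with no Flatten/Dense layer tied to a fixed spatial size, global pooling collapses whatever spatial dimensions come in.

```python
from keras.layers import Conv2D, GlobalAveragePooling2D, Dense, Input
from keras.models import Model

# Spatial dimensions left as None: the network accepts any image size
inp = Input(shape=(None, None, 1))
x = Conv2D(32, 3, activation='relu', padding='same')(inp)
x = Conv2D(64, 3, activation='relu', padding='same')(x)
x = GlobalAveragePooling2D()(x)            # collapses H x W regardless of size
out = Dense(10, activation='softmax')(x)

model = Model(inp, out)
model.summary()  # works for 28x28x1 inputs as well as larger images
```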

answered Sep 23 '22 by Maxim