
Fine tuning vs Retraining

I am learning how to use TensorFlow to fine-tune the Inception-v3 model on a custom dataset.

I found two tutorials on this. One is "How to Retrain Inception's Final Layer for New Categories" and the other is "Train your own image classifier with Inception in TensorFlow with Fine tuning".

I ran the first (retraining) tutorial on a virtual machine and it took only 2-3 hours to complete. For the same flowers dataset, I ran the second (fine-tuning) tutorial on a GPU, and training took around a whole day.

What is the difference between retraining and fine tuning?

I was under the impression that both involve taking a pre-trained Inception v3 model, removing the old top layer, and training a new one on the flower photos. But my understanding may be wrong.

Nik asked Mar 09 '23 14:03


1 Answer

In the ML literature, fine-tuning usually refers to the following process:

  1. Start from a trained model (model = feature-extraction layers + classification layers).
  2. Remove the classification layers.
  3. Attach a new classification layer.
  4. Retrain the whole model end-to-end.

This lets you start from a good configuration of the feature-extraction layers' weights and thus reach a good optimum in a short time.

You can think of fine-tuning as starting a new training run with a very good initialization of your weights (although you still have to initialize the new classification layer yourself).
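To make the four fine-tuning steps concrete, here is a minimal, framework-free sketch. It is a toy stand-in, not Inception or TensorFlow code: the model is a plain dictionary of weight matrices, and the names `random_layer`, `prepare_for_fine_tuning`, and the `"features"` / `"classifier"` / `"trainable"` keys are assumptions made up for this illustration.

```python
import random


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: an n_in x n_out weight matrix with small random values."""
    return [[rng.gauss(0.0, 0.01) for _ in range(n_out)] for _ in range(n_in)]


def prepare_for_fine_tuning(pretrained, num_classes, seed=0):
    """Build the starting point for fine-tuning from a trained toy model."""
    rng = random.Random(seed)
    # Steps 1-2: keep the trained feature-extraction weights, drop the old head.
    features = [[row[:] for row in layer] for layer in pretrained["features"]]
    # Step 3: attach a new, randomly initialized classification layer.
    n_features = len(pretrained["classifier"])  # input width of the old head
    classifier = random_layer(n_features, num_classes, rng)
    # Step 4: nothing is frozen, so training updates the whole model end-to-end.
    return {
        "features": features,
        "classifier": classifier,
        "trainable": ["features", "classifier"],
    }


# Toy stand-in for a trained network: one feature layer (4 -> 3) and a 10-class head.
rng = random.Random(42)
pretrained = {
    "features": [random_layer(4, 3, rng)],
    "classifier": random_layer(3, 10, rng),
}
model = prepare_for_fine_tuning(pretrained, num_classes=5)
```

Only the new classification layer starts from random values here; the feature-extraction weights are carried over unchanged, which is the "good initialization" described above.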

When we talk about retraining a model, by contrast, we usually mean the following process:

  1. Keep the model architecture.
  2. Change the last classification layer so that it outputs the number of classes you want to classify.
  3. Train the model end-to-end.

In this case you don't start from a good starting point as above; instead, you start from a random point in the solution space.

This means you have to train the model for longer, because the initial solution is not as good as the one a pretrained model gives you.
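The effect of the starting point on training time can be illustrated with a toy experiment. This is only a sketch under strong assumptions: the "model" is a three-element weight vector, the loss is a simple squared distance to a made-up `target`, and `pretrained_init` is assumed to lie close to the optimum, as weights learned on a related task might. None of these names or values come from the tutorials.

```python
import random


def loss(w, target):
    """Squared distance between the current weights and the optimum."""
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def steps_to_converge(w_init, target, lr=0.1, tol=1e-6, max_steps=10_000):
    """Run plain gradient descent and return the number of steps until loss < tol."""
    w = list(w_init)
    for step in range(max_steps):
        if loss(w, target) < tol:
            return step
        # The gradient of sum((w_i - t_i)^2) with respect to w_i is 2 * (w_i - t_i).
        w = [wi - lr * 2 * (wi - ti) for wi, ti in zip(w, target)]
    return max_steps


target = [2.0, -1.0, 0.5]           # made-up optimum for the new task
pretrained_init = [1.8, -0.9, 0.6]  # assumed: weights already close to the optimum

rng = random.Random(0)
random_init = [rng.uniform(-10.0, 10.0) for _ in target]  # random starting point

steps_pretrained = steps_to_converge(pretrained_init, target)
steps_random = steps_to_converge(random_init, target)
print("steps from pretrained init:", steps_pretrained)
print("steps from random init:", steps_random)
```

With the same optimizer and learning rate, the run that starts near the optimum needs fewer steps to reach the same loss threshold. A real network has a far more complicated loss surface, so treat this as intuition rather than a prediction of actual training times.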

nessuno answered Mar 19 '23 21:03