So I am learning how to use TensorFlow to fine-tune the Inception-v3 model for a custom dataset.
I found two tutorials related to this. One was "How to Retrain Inception's Final Layer for New Categories" and the other was "Train your own image classifier with Inception in TensorFlow with Fine tuning".
I did the first (retraining) tutorial on a virtual machine and it took only 2-3 hours to complete. For the same flowers dataset, I did the second (fine-tuning) tutorial on a GPU and the training took around one whole day.
What is the difference between retraining and fine-tuning?
I was under the impression that both involve taking a pre-trained Inception v3 model, removing the old top layer, and training a new one on the flower photos. But my understanding may be wrong.
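Usually in the ML literature, fine-tuning refers to taking a model that has already been trained, replacing its classification layers with new ones for your classes, and then continuing to train it on the new dataset.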
This lets you start from a good configuration of the feature-extraction layers' weights and thus reach an optimum in a short time.
You can think of fine-tuning as a way to start a new training run with a very good initialization of your weights (although you still have to initialize your new classification layers).
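As a rough illustration of that initialization idea, here is a minimal pure-Python sketch. It is not TensorFlow code: the dict-based "model", the layer sizes, and the helper names `random_layer` and `build_for_fine_tuning` are all hypothetical and exist only to show which weights are reused and which are freshly initialized.

```python
import random

def random_layer(n_in, n_out, rng):
    """Return an n_in x n_out weight matrix filled with small random values."""
    return [[rng.gauss(0.0, 0.01) for _ in range(n_out)] for _ in range(n_in)]

def build_for_fine_tuning(pretrained, num_classes, rng):
    """Copy the pretrained feature weights and attach a new, randomly
    initialized classification layer sized for num_classes."""
    features = [[row[:] for row in layer] for layer in pretrained["features"]]
    feature_dim = len(features[-1][0])  # output width of the last feature layer
    classifier = random_layer(feature_dim, num_classes, rng)
    return {"features": features, "classifier": classifier}

rng = random.Random(0)

# Hypothetical stand-in for a pretrained network:
# two feature layers followed by a 1000-class classifier.
pretrained = {
    "features": [random_layer(8, 16, rng), random_layer(16, 4, rng)],
    "classifier": random_layer(4, 1000, rng),
}

# Fine-tuning setup for a 5-class dataset (e.g. flowers):
# feature weights are reused, only the classifier starts from scratch.
model = build_for_fine_tuning(pretrained, num_classes=5, rng=rng)
print(len(model["classifier"]), len(model["classifier"][0]))
```

Training would then continue from `model`, so only the small new classifier begins at a random point.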
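When we talk about retraining a model, on the other hand, we usually mean keeping the model architecture, changing the classification layer to output your classes, and training the whole model from randomly initialized weights.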
In this case you don't start from a good starting point as above, but instead you start from a random point in the solution space.
This means that you have to train the model for longer, because the initial solution is worse than the one a pretrained model gives you.
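A toy example can make the effect of the starting point concrete. The sketch below is only an analogy, assuming a one-parameter quadratic loss rather than a real network; the function `steps_to_converge` and the two starting values are hypothetical.

```python
def steps_to_converge(w, target=3.0, lr=0.1, tol=1e-3, max_steps=10_000):
    """Count gradient-descent steps on the loss (w - target)**2
    until w is within tol of the target."""
    steps = 0
    while abs(w - target) >= tol and steps < max_steps:
        grad = 2.0 * (w - target)  # derivative of (w - target)**2
        w -= lr * grad
        steps += 1
    return steps

# Hypothetical starting points: one near the optimum (like pretrained
# weights), one far away (like a random initialization).
pretrained_like_start = 2.9
random_like_start = -50.0

print("near start:", steps_to_converge(pretrained_like_start), "steps")
print("far start: ", steps_to_converge(random_like_start), "steps")
```

With the same learning rate, the run that starts far from the optimum needs more steps to reach the same tolerance, which is the same reason training from random weights takes longer.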