I already know how to do it on VGG (fine-tuning the last conv block) and Inception (fine-tuning the top two blocks). I'd like to know which layers are recommended to freeze in order to fine-tune a ResNet model.
Freezing reduces training time because fewer layers need backward passes. Freezing layers too early in training is not advisable, though. If you freeze all the layers but the last 5, you only need to backpropagate the gradient and update the weights of those last 5 layers.
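For concreteness, here is a minimal Keras sketch of that last-5-layers setup, assuming an ImageNet-pretrained ResNet50 backbone and a hypothetical 10-class task:

```python
from tensorflow.keras.applications import ResNet50
from tensorflow.keras import layers, models

# ImageNet-pretrained backbone; pooling="avg" yields a 2048-d feature vector.
base = ResNet50(weights="imagenet", include_top=False, pooling="avg")

# Freeze everything except the last 5 layers: only those 5 receive gradients,
# so backward passes (and training time) shrink accordingly.
for layer in base.layers[:-5]:
    layer.trainable = False

model = models.Sequential([
    base,
    layers.Dense(10, activation="softmax"),  # hypothetical head for 10 classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```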
I don't think there is a state-of-the-art strategy for this, but I can share my thoughts on the topic (stage names follow the standard ResNet convention, conv1 through conv5):
1. A lot of real-world photos: freeze all stages up to stage 4 (leave only the 5th trainable). If you overfit, reduce the number of layers in the 5th stage; if you underfit, unfreeze half of the 4th stage. Remember: the deeper into the network, the more ImageNet-specific the features become. (See the sketch after this list.)
2. A few real-world photos: cut the 5th stage, leave half of the 4th stage trainable, and freeze the rest. If you overfit, keep cutting blocks from stage 4; if you underfit, keep extending it. (Also covered in the sketch below.)
3. A lot of simple photos (e.g. medical images): cut the 4th and 5th stages, leave the 3rd trainable, and freeze the rest. If you overfit, keep cutting; if you underfit, try point 2.
4. A few (fewer than 10K) simple photos: I would advise against using ResNet50 here; in my experience it overfits severely. I usually implement custom topologies similar to ResNet18. If you still want to try it, follow the instructions from point 3.
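Here is a sketch of points 1 and 2 in Keras. It assumes the stage-encoded layer names of recent tf.keras ResNet50 builds ("conv4_block6_out", "conv5_..." etc.; other Keras versions name layers differently), and `num_classes` is a hypothetical placeholder for your task:

```python
import tensorflow as tf

num_classes = 10  # hypothetical

# Point 1: freeze stages 1-4, leave only stage 5 trainable.
base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False)
for layer in base.layers:
    layer.trainable = layer.name.startswith("conv5")

# Point 2: "cut" stage 5 entirely by truncating the graph at the last
# stage-4 block, then keep only the later half of stage 4 trainable.
trunk = tf.keras.Model(base.input, base.get_layer("conv4_block6_out").output)
for layer in trunk.layers:
    # conv4 blocks 4-6 stay trainable; everything earlier is frozen.
    layer.trainable = layer.name.startswith(("conv4_block4",
                                             "conv4_block5",
                                             "conv4_block6"))

# Classification head on top of whichever trunk you pick.
x = tf.keras.layers.GlobalAveragePooling2D()(trunk.output)
out = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
model = tf.keras.Model(trunk.input, out)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```

Selecting stages via layer-name prefixes keeps the pretrained weights intact while letting you slide the freeze/cut boundary up or down as the overfit/underfit advice above suggests.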