Fine-tuning and transfer learning, using YOLO as an example

I have a general question regarding fine-tuning and transfer learning, which came up when I tried to figure out how best to get YOLO to detect my custom object (hands).

I apologize for the long text, which possibly contains some false assumptions. I would be glad if someone had the patience to read it and help me clear up my confusion.

After lots of googling, I learned that many people regard fine-tuning as a sub-class of transfer learning, while others believe that they are two different approaches to training a model. At the same time, people differentiate between re-training only the last classifier layer of a model on a custom dataset and re-training other layers of the model as well (and possibly adding an entirely new classifier instead of retraining?). Both approaches use pre-trained models.

My final confusion lies here: I followed the instructions at https://github.com/thtrieu/darkflow to train tiny YOLO via darkflow, using the command:

    # Initialize yolo-new from tiny-yolo, then train the net on 100% GPU:
    flow --model cfg/yolo-new.cfg --load bin/tiny-yolo.weights --train --gpu 1.0

But what happens here? I suppose I am only retraining the classifier, because the instructions say to change the number of classes in the last layer of the configuration file. But then again, it is also required to change the number of filters in the second-to-last layer, a convolutional layer.
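For reference, here is roughly what those two edits look like in yolo-new.cfg, if I understand the README correctly (assuming the tiny-YOLO v2 layout with num=5 anchor boxes and my single class, hands):

    [convolutional]
    ...
    filters=30        # num * (classes + 5) = 5 * (1 + 5)

    [region]
    ...
    classes=1         # my single custom class: hands
    num=5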

Lastly, the instructions provide an example of an alternative training run:

    # Completely initialize yolo-new and train it with ADAM optimizer
    flow --model cfg/yolo-new.cfg --train --trainer adam

and I don't understand at all how this relates to the different approaches to transfer learning.

asked Mar 12 '19 by kaktus

2 Answers

If you are using AlexeyAB's darknet repo (not darkflow), he suggests doing fine-tuning instead of transfer learning by setting this parameter in the cfg file: stopbackward=1.
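A minimal cfg sketch of where that parameter goes (the other layer settings are placeholders; you add the line to the layer up to which you want the pre-trained weights kept fixed):

    [convolutional]
    ...
    stopbackward=1    # backpropagation stops here, so earlier layers are not updated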

Then run ./darknet partial yourConfigFile.cfg yourWeightsFile.weights outputName.LastLayer# LastLayer#, for example:

    ./darknet partial cfg/yolov3.cfg yolov3.weights yolov3.conv.81 81

It will create yolov3.conv.81 and freeze the lower layers; then you can train using the weights file yolov3.conv.81 instead of the original darknet53.conv.74.
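The subsequent training call would then look roughly like this (data/obj.data and cfg/yolov3.cfg are placeholders for your own data file and config):

    ./darknet detector train data/obj.data cfg/yolov3.cfg yolov3.conv.81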

References: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection and https://groups.google.com/forum/#!topic/darknet/mKkQrjuLPDU

answered Sep 27 '22 by gameon67


I have not worked with YOLO, but looking at your problem I think I can help. Fine-tuning, re-training, and post-tuning are all somewhat ambiguous terms that are often used interchangeably; it all comes down to how much you want to change the pre-trained weights. In the first case you are loading the pre-trained weights with --load, which could mean you adjust them slightly with a low learning rate, or perhaps leave some of them unchanged. In the second case you are not loading any weights, so the network is presumably trained from scratch. So: when you make small (fine) changes, call it fine-tuning; post-tuning would be tuning again after the initial training, perhaps not as fine as fine-tuning; and re-training would then mean training the whole network, or a part of it, again.

There are also separate ways in which you can optionally freeze some layers; a rough sketch of the idea follows.
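Darkflow aside, here is a minimal PyTorch sketch of that idea, just to illustrate freezing a backbone versus fine-tuning everything (the toy model and layer sizes are made up for the example):

    import torch
    import torch.nn as nn

    # A toy "pre-trained" network: a feature backbone plus a classifier head.
    backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                             nn.AdaptiveAvgPool2d(1), nn.Flatten())
    head = nn.Linear(16, 1)  # new head for one custom class (hands)

    # Transfer-learning variant: freeze the backbone, train only the new head.
    for param in backbone.parameters():
        param.requires_grad = False

    model = nn.Sequential(backbone, head)

    # The optimizer only receives the parameters that still require gradients.
    optimizer = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-3)

    # Fine-tuning variant: leave requires_grad=True everywhere and train the
    # whole network with a small learning rate, e.g. lr=1e-5.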

answered Sep 27 '22 by deadcode