<p>I trained a GoogLeNet model from scratch, but it didn't give me promising results.<br> As an alternative, I would like to fine-tune a GoogLeNet model on my dataset. Does anyone know what steps I should follow?</p>

<p>Assuming you are trying to do image classification, these should be the steps for fine-tuning a model:</p>
<h3>1. Classification layer</h3>
<p>The original classification layer <code>"loss3/classifier"</code> outputs predictions for 1000 classes (its <code>num_output</code> is set to 1000). You'll need to replace it with a new layer with the appropriate <code>num_output</code>. To replace the classification layer:</p>
<ol> <li>Change the layer's name (so that when you read the original weights from the caffemodel file there will be no conflict with the weights of this layer).</li> <li>Change <code>num_output</code> to the number of output classes you are trying to predict.</li> <li>Note that you need to change ALL classification layers. Usually there is only one, but GoogLeNet happens to have three: <code>"loss1/classifier"</code>, <code>"loss2/classifier"</code> and <code>"loss3/classifier"</code>.</li> </ol>
<h3>2. Data</h3>
<p>You need to make a new training dataset with the new labels you want to fine-tune to. See, for example, this post on how to make an lmdb dataset.</p>
<h3>3. How extensive a fine-tuning do you want?</h3>
<p>When fine-tuning a model, you can train ALL of the model's weights, or choose to fix some weights (usually the filters of the lower layers, closest to the input) and train only the weights of the top-most layers. This choice is up to you and it usually depends on the amount of training data available (the more examples you have, the more weights you can afford to fine-tune).<br> Each layer that holds trainable parameters has <code>param { lr_mult: XX }</code>. This coefficient determines how susceptible these weights are to SGD updates. Setting <code>param { lr_mult: 0 }</code> means you FIX the weights of this layer and they will not be changed during the training process.<br> Edit your <code>train_val.prototxt</code> accordingly.</p>
<h3>4. Run caffe</h3>
<p>Run <code>caffe train</code>, but supply it with the caffemodel weights as the initial weights:</p> <pre class="prettyprint"><code>~$ $CAFFE_ROOT/build/tools/caffe train -solver /path/to/solver.prototxt -weights /path/to/orig_googlenet_weights.caffemodel </code></pre>
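<p>Step 1 can also be scripted instead of edited by hand. The following is only a rough sketch, assuming each classifier layer's <code>name:</code> field appears before its <code>num_output:</code> in the prototxt. The helper <code>retarget_classifiers</code> and the <code>_ft</code> suffix are hypothetical names chosen for illustration, not part of Caffe. Only the layer name is changed; the <code>top:</code>/<code>bottom:</code> blob names are left alone, so the loss and accuracy layers that read the classifier output should not need editing.</p>

```python
import re

# The three GoogLeNet classifier layers named in the answer above.
CLASSIFIER_LAYERS = ("loss1/classifier", "loss2/classifier", "loss3/classifier")


def retarget_classifiers(prototxt_text, num_classes,
                         layer_names=CLASSIFIER_LAYERS, suffix="_ft"):
    """Rename each listed layer and set its num_output to num_classes.

    Assumption: inside each layer block, `name:` comes before `num_output:`.
    Raises ValueError if a listed layer is not found exactly once.
    """
    text = prototxt_text
    for layer_name in layer_names:
        # Match the layer's name field, then lazily skip ahead to the
        # first num_output value that follows it.
        pattern = re.compile(
            r'(name:\s*")' + re.escape(layer_name) + r'(".*?num_output:\s*)\d+',
            re.DOTALL,
        )
        new_name = layer_name + suffix
        text, count = pattern.subn(
            lambda m: m.group(1) + new_name + m.group(2) + str(num_classes),
            text,
        )
        if count != 1:
            raise ValueError(
                "expected exactly one layer named %r, found %d" % (layer_name, count)
            )
    return text
```

<p>A possible use would be to read <code>train_val.prototxt</code> as text, pass it through this helper with your own class count, and write the result to a new file that the solver points to. Because the renamed layers no longer match any layer name in the caffemodel, their weights start from the fillers defined in the prototxt.</p>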

<p>Fine-tuning is a very useful trick for achieving good accuracy compared with older hand-crafted features. @Shai already posted a good tutorial for fine-tuning GoogLeNet using Caffe, so I just want to give some recommendations and tricks for fine-tuning in general.</p> <p>Most of the time, we face a classification task where the new dataset (e.g. the Oxford 102 flower dataset or Cat&amp;Dog) falls into one of the following four common situations (from CS231n):</p> <ol> <li>The new dataset is small and similar to the original dataset.</li> <li>The new dataset is small but different from the original dataset (the most common case).</li> <li>The new dataset is large and similar to the original dataset.</li> <li>The new dataset is large but different from the original dataset.</li> </ol> <p>In practice, most of the time we do not have enough data to train the network from scratch, but we may have enough to fine-tune a pre-trained model. Whichever of the cases above applies, the only thing we must care about is: do we have enough data to train the CNN?</p> <p>If yes, we can train the CNN from scratch. However, in practice it is still beneficial to initialize the weights from a pre-trained model.</p> <p>If no, we need to check whether the data is very different from the original dataset. If it is very similar, we can just fine-tune the fully connected layers, or train an SVM on the network's features. However, if it is very different from the original dataset, we may need to fine-tune the convolutional layers as well to improve generalization.</p>
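<p>The decision rule above can be written down as a small function. This is only a sketch of the reasoning in this answer; the name <code>choose_strategy</code> and its two boolean arguments are hypothetical, and judging whether you have "enough" data or a "similar" dataset is still up to you.</p>

```python
def choose_strategy(enough_data_for_scratch, similar_to_original):
    """Map the two questions from the answer to a training strategy.

    enough_data_for_scratch: do we have enough data to train the CNN?
    similar_to_original: is the new dataset close to the original one?
    """
    if enough_data_for_scratch:
        # Training from scratch is possible, but pre-trained
        # initialization is still reported to help in practice.
        return "train all layers, initialized from the pre-trained weights"
    if similar_to_original:
        # Little data, similar domain: leave the convolutional filters alone.
        return "fine-tune only the fully connected layers, or train an SVM on the features"
    # Little data, different domain: the convolutional layers must adapt too.
    return "fine-tune the convolutional layers as well"
```

<p>For example, a small flower dataset that looks much like the original training images would fall into the second branch.</p>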

Fine Tuning of GoogLeNet Model

Tags:

machine-learning

deep-learning

computer-vision

caffe

conv-neural-network


221

asked Apr 25 '16 12:04

Ashutosh Singla

2 Answers


199

answered Oct 19 '22 20:10

Shai


answered Oct 19 '22 21:10

RyanLiu
