What is freezing/unfreezing a layer in neural networks?

Tags: machine-learning, neural-network, tensorflow, deep-learning, transfer-learning

I have been playing around with neural networks for quite a while now. While reading about transfer learning, I recently came across the terms "freezing" and "unfreezing" layers before training a neural network, and I am struggling to understand how they are used.

When is one supposed to use freezing/unfreezing?
Which layers should be frozen or unfrozen? For instance, when I import a pre-trained model and train it on my data, is my entire neural net except the output layer frozen?
How do I determine whether I need to unfreeze?
If so, how do I determine which layers to unfreeze and train to improve model performance?

352

asked Jun 06 '20 08:06

Nizam

2 Answers

I would just add to the other answer that this is most commonly used with CNNs, and that the number of layers you want to freeze (not train) is "given" by how similar the task you are solving is to the original one (the task the pretrained network was trained on).

If the tasks are very similar, say you are using a CNN pretrained on ImageNet and you just want to add some other "general" objects that the network should recognize, then you might get away with training just the dense top of the network.

The more dissimilar the tasks are, the more layers of the original network you will need to unfreeze during training.
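To make the "train just the dense top" case concrete, here is a minimal Keras sketch. It rests on several assumptions that are not in the answer above: MobileNetV2 as the base network, a 96x96 RGB input, and 5 target classes are all arbitrary illustrative choices. `weights=None` is used only so the snippet runs without downloading anything; in real transfer learning you would pass `weights="imagenet"`.

```python
import tensorflow as tf

# Convolutional base without its original classifier head.
# weights=None keeps this sketch offline; use weights="imagenet" in practice.
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights=None
)
base.trainable = False  # freeze every layer of the base at once

# New dense top for the new task (5 classes is a made-up number).
inputs = tf.keras.Input(shape=(96, 96, 3))
x = base(inputs, training=False)  # keep BatchNorm layers in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(5, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Only the new Dense layer's kernel and bias are left to train.
print(len(model.trainable_weights))
```

If the new task turns out to be too different for this to work well, the next step would be to set `trainable = True` on some of the later layers of `base` and compile again.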

163

answered Oct 10 '22 15:10

Matus Dubrava

Freezing a layer means that the layer will not be trained, so its weights will not change.
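As a framework-free illustration of what "will not be trained" means, here is a toy sketch of one gradient-descent step. Everything in it is hypothetical: the `sgd_step` function, one scalar "weight" per layer, and the gradient values are made up for the example and are not how any real library stores its parameters.

```python
def sgd_step(weights, grads, trainable, lr=0.5):
    """Return updated weights; entries marked non-trainable are left as-is."""
    return [
        w - lr * g if is_trainable else w
        for w, g, is_trainable in zip(weights, grads, trainable)
    ]


weights = [1.0, 2.0, 3.0]          # one toy scalar weight per layer
grads = [0.5, 0.5, 0.5]            # made-up gradients
trainable = [False, False, True]   # first two layers are frozen

new_weights = sgd_step(weights, grads, trainable)
print(new_weights)  # [1.0, 2.0, 2.75]
```

The frozen entries come out unchanged even though they had non-zero gradients; only the last one moves.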

Why do we need to freeze such layers?

Sometimes we want a sufficiently deep NN but don't have enough time to train it. That's why we use pretrained models that already have useful weights. Good practice is to freeze layers starting from the input end of the network and working toward the output. For example, you can freeze the first 10 layers.

For instance, when I import a pre-trained model and train it on my data, is my entire neural net except the output layer frozen?
- Yes, that may be the case. But you can also leave a few layers just before the last one unfrozen.

How do I freeze and unfreeze layers?
- In Keras, to freeze a layer use: layer.trainable = False
And to unfreeze it: layer.trainable = True
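Here is a small end-to-end sketch of toggling `trainable` on a Keras model. The three-layer `Sequential` model and the layer names are made up for illustration. One detail worth knowing: the Keras transfer-learning guide says to call `compile()` again after changing `trainable`, otherwise the change is not taken into account.

```python
import tensorflow as tf

# A tiny stand-in model; in practice this would be your pretrained network.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu", name="dense_1"),
    tf.keras.layers.Dense(16, activation="relu", name="dense_2"),
    tf.keras.layers.Dense(1, name="output"),
])

# Freeze everything except the output layer.
for layer in model.layers[:-1]:
    layer.trainable = False
model.compile(optimizer="adam", loss="mse")  # recompile after the change
n_frozen_stage = len(model.trainable_weights)

# Later, unfreeze the layer just before the output and recompile again.
model.get_layer("dense_2").trainable = True
model.compile(optimizer="adam", loss="mse")
n_unfrozen_stage = len(model.trainable_weights)

for layer in model.layers:
    print(layer.name, layer.trainable)
print(n_frozen_stage, n_unfrozen_stage)
```

Each `Dense` layer contributes a kernel and a bias, so the number of trainable weight tensors grows as layers are unfrozen.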

If so, how do I determine which layers to unfreeze and train to improve model performance?
- As I said, good practice is to keep the layers nearest the input frozen and unfreeze starting from the output end. You should tune the number of frozen layers yourself. Keep in mind that the more unfrozen layers you have, the slower training is.
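Since the number of frozen layers is something you tune, a small helper can make it easy to try different values. This is only a sketch: `unfreeze_last` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real layers so the snippet runs without any framework. It only relies on each layer having `name` and `trainable` attributes, so with Keras you could pass `model.layers` instead (and compile the model again afterwards).

```python
from types import SimpleNamespace


def unfreeze_last(layers, n_trainable):
    """Freeze all layers, then mark the last `n_trainable` as trainable.

    Returns the names of the layers that are left trainable.
    """
    layers = list(layers)
    cutoff = len(layers) - n_trainable
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff
    return [layer.name for layer in layers if layer.trainable]


# Stand-ins for real layers (hypothetical names).
layers = [SimpleNamespace(name=f"block_{i}", trainable=True) for i in range(5)]
layers.append(SimpleNamespace(name="classifier", trainable=True))

# Try a few settings, starting with only the output layer trainable.
for n in (1, 2, 3):
    print(n, unfreeze_last(layers, n))
```

A reasonable workflow is to start with `n_trainable=1`, check validation performance, and increase it step by step while watching training time.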

answered Oct 10 '22 15:10

Yoskutik
