 

Tensorflow: Finetune pretrained model on new dataset with different number of classes

Tags:

tensorflow

How can I finetune a pretrained model in tensorflow on a new dataset? In Caffe I can simply rename the last layer and set some parameters for random initialization. Is something similar possible in tensorflow?

Say I have a checkpoint file (deeplab_resnet.ckpt) and some code that sets up the computational graph, in which I can modify the last layer so that it has the same number of outputs as the new dataset has classes.

Then I try to start the session like this:

sess = tf.Session(config=config)
init = tf.initialize_all_variables()

sess.run(init)

trainable = tf.trainable_variables()
saver = tf.train.Saver(var_list=trainable, max_to_keep=40)
saver.restore(sess, 'ckpt_path/deeplab_resnet.ckpt')

However, this gives me an error when calling saver.restore, since it expects the exact same graph structure as the one the checkpoint was saved from. How can I load all weights except for the last layer from the 'ckpt_path/deeplab_resnet.ckpt' file? I also tried renaming the classification layer, but no luck there either...

I'm using the tensorflow-deeplab-resnet model

mcExchange asked Jan 19 '17

People also ask

What is the difference between transfer learning and fine tuning?

Transfer learning is when a model developed for one task is reused as the starting point for a second task. Fine-tuning is one approach to transfer learning in which you replace the model's output layer to fit the new task and then retrain some or all of the weights on the new dataset, often only the new output layer at first.
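As a rough illustration of "train only the output layer", here is a minimal sketch in plain Python (the scope-style parameter names and values are illustrative, not a real model): the base weights are left untouched and only parameters under the new head's prefix receive updates.

```python
# Toy parameters keyed by scope-style names (illustrative only)
params = {'base/conv1': 1.0, 'base/conv2': 2.0, 'head/fc': 0.5}

# Pretend gradients from one training step
grads = {name: 0.1 for name in params}

lr = 1.0
for name in params:
    if name.startswith('head/'):  # only the new output layer is trainable
        params[name] -= lr * grads[name]

print(params)  # base/* unchanged; only head/fc is updated
```

In a real framework the same effect is achieved by passing only the head's variables to the optimizer (or freezing the base layers).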


1 Answer

You can specify the names of the variables that you want to restore.

So, you can get a list of all of the variables in the model and filter out the variables of the last layer:

all_vars = tf.all_variables()  # tf.global_variables() in later TF 1.x releases
# 'xxx' stands for the variable scope / name prefix of the replaced last layer
var_to_restore = [v for v in all_vars if not v.name.startswith('xxx')]

saver = tf.train.Saver(var_to_restore)

See the tf.train.Saver documentation for the details.
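The filtering step itself is easy to test in isolation. This sketch uses plain strings shaped like TensorFlow variable names (`scope/name:0`); the prefix `fc_out` is a hypothetical scope for the replaced classification layer, standing in for the `'xxx'` placeholder:

```python
def vars_to_restore(var_names, new_layer_prefix):
    """Keep every checkpoint variable except those under the
    re-initialized last layer's scope (matched by name prefix)."""
    return [n for n in var_names if not n.startswith(new_layer_prefix)]

# Hypothetical names in the style tf.all_variables() reports them
names = ['conv1/weights:0', 'res5c/bn/gamma:0',
         'fc_out/weights:0', 'fc_out/biases:0']

print(vars_to_restore(names, 'fc_out'))
# → ['conv1/weights:0', 'res5c/bn/gamma:0']
```

The surviving list is what you would hand to `tf.train.Saver` as `var_list`; the excluded last-layer variables keep their fresh random initialization.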

Alternatively, you can try to load the whole model and create a new "branch" off the layer before the last, then use that branch's output in the cost function during training.
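As a sketch of the "branch" idea, assuming you can get hold of the penultimate layer's activations: attach a freshly initialized classification head sized for the new number of classes and compute logits from those features. NumPy stands in for the graph op here, and the shapes (2048 features, 21 classes) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the penultimate layer's output: (batch, feature_dim)
features = rng.standard_normal((4, 2048))

# New, randomly initialized head sized for the new dataset's classes
num_new_classes = 21
w = rng.standard_normal((2048, num_new_classes)) * 0.01
b = np.zeros(num_new_classes)

logits = features @ w + b  # these feed the cost function during training
print(logits.shape)  # (4, 21)
```

Because the head is a new variable, it is simply absent from the checkpoint, so the restore-mismatch problem never arises for it.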

Alexey Romanov answered Sep 29 '22