
Convert Tensorflow model to Caffe model

I would like to be able to convert a Tensorflow model to Caffe model.

I searched on google but I was able to find only converters from caffe to tensorflow but not the opposite.

Does anyone have an idea on how to do it?

Thanks, Evi

asked Dec 14 '16 by Evi



2 Answers

I've had the same problem and found a solution. The code can be found here (https://github.com/lFatality/tensorflow2caffe) and I've also documented it in some YouTube videos.


Part 1 covers the creation of the VGG-19 architecture in Caffe and tflearn (a higher-level API for TensorFlow; with some changes to the code, native TensorFlow should also work).


In Part 2 the export of the weights and biases out of the TensorFlow model into a numpy file is described. In tflearn you can get the weights of a layer like this:

# get parameters of a certain layer
conv2d_vars = tflearn.variables.get_layer_variables_by_name(layer_name)
# get weights out of the parameters
weights = model.get_weights(conv2d_vars[0])
# get biases out of the parameters
biases = model.get_weights(conv2d_vars[1])

For a convolutional layer, the layer_name is Conv_2D. Fully-connected layers are called FullyConnected. If you use more than one layer of a certain type, an increasing integer with a preceding underscore is appended (e.g. the 2nd conv layer is called Conv_2D_1). I found these names in the TensorBoard graph. If you name the layers in your architecture definition, these layer_names might change to the names you defined.

In native TensorFlow the export will need different code but the format of the parameters should be the same so subsequent steps should still be applicable.


Part 3 covers the actual conversion. What's critical is the conversion of the weights when you create the caffemodel (the biases can be carried over unchanged). TensorFlow and Caffe use different layouts when saving a filter. While TensorFlow uses [height, width, depth, number of filters] (TensorFlow docs, at the bottom), Caffe uses [number of filters, depth, height, width] (Caffe docs, chapter 'Blob storage and communication'). To convert between the formats you can use the transpose function, for example weights_of_first_conv_layer.transpose((3,2,0,1)). The (3,2,0,1) sequence is obtained by numbering the axes of the TensorFlow layout (the origin) and then listing those axis numbers in the order the Caffe layout (the target) requires them.
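To make the filter-layout conversion above concrete, here is a minimal numpy sketch (the kernel size, depth, and filter count are made up for illustration):

```python
import numpy as np

# Toy conv filter in TensorFlow layout [height, width, depth, num_filters]:
# a 3x3 kernel, depth 2, 4 filters.
tf_filter = np.arange(3 * 3 * 2 * 4).reshape(3, 3, 2, 4)

# Reorder the axes to Caffe layout [num_filters, depth, height, width].
caffe_filter = tf_filter.transpose((3, 2, 0, 1))

print(caffe_filter.shape)  # (4, 2, 3, 3)

# The data itself is untouched: filter 0's depth-0 kernel is the same slice.
assert np.array_equal(caffe_filter[0, 0], tf_filter[:, :, 0, 0])
```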
If you want to connect a tensor output to a fully-connected layer, things get a little tricky. If you use VGG-19 with an input size of 112x112 it looks like this.

fc1_weights = data_file[16][0].reshape((4,4,512,4096))
fc1_weights = fc1_weights.transpose((3,2,0,1))
fc1_weights = fc1_weights.reshape((4096,8192))

What you get from TensorFlow if you export the parameters at the connection between tensor and fully-connected layer is an array with the shape [entries in the tensor, units in the fc-layer] (here: [8192, 4096]). You have to find out what the shape of your output tensor is and then reshape the array so that it fits the TensorFlow format (see above, number of filters being the number of units in the fc-layer). After that you use the transpose-conversion you've used previously and then reshape the array again, but the other way around. While TensorFlow saves fc-layer weights as [number of inputs, number of outputs], Caffe does it the other way around.
If you connect two fc-layers to each other, you don't need the complex process described above, but you will have to account for the different fc-layer format by transposing again (fc_layer_weights.transpose((1,0))).
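The two fc-layer cases above can be sketched with toy numpy arrays (all sizes are invented for illustration, much smaller than the real VGG-19 shapes):

```python
import numpy as np

# Case 1: a 2x2x3 conv output tensor (12 entries) feeding a 5-unit fc layer.
fc1_tf = np.arange(60).reshape(12, 5).astype(float)  # TF: [tensor entries, units]

fc1 = fc1_tf.reshape((2, 2, 3, 5))   # recover [height, width, depth, units]
fc1 = fc1.transpose((3, 2, 0, 1))    # reorder to Caffe conv layout [units, depth, h, w]
fc1_caffe = fc1.reshape((5, 12))     # flatten back: Caffe fc is [outputs, inputs]

# Case 2: a plain fc-to-fc connection, 5 inputs -> 4 outputs.
fc2_tf = np.arange(20).reshape(5, 4).astype(float)   # TF: [inputs, outputs]
fc2_caffe = fc2_tf.transpose((1, 0))                 # Caffe: [outputs, inputs]

print(fc1_caffe.shape, fc2_caffe.shape)  # (5, 12) (4, 5)
```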

You can then set the parameters of the network using

net.params['layer_name_in_prototxt'][0].data[...] = weights
net.params['layer_name_in_prototxt'][1].data[...] = biases

This was a quick overview. If you want all the code, it's in my GitHub repository. I hope it helps. :)


Cheers,
Fatality

answered Sep 19 '22 by Fatality


As suggested in the comment by @Patwie, you have to do it manually by copying the weights layer by layer. For example, to copy the first conv layer's weights from a TensorFlow checkpoint to a caffemodel, you have to do something like the following:

import tensorflow as tf
import caffe

sess = tf.Session()
new_saver = tf.train.import_meta_graph("/path/to/checkpoint.meta")
new_saver.restore(sess, "/path/to/checkpoint")

all_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)

conv1 = all_vars[0]
bias1 = all_vars[1]

conv_w1, bias_1 = sess.run([conv1, bias1])

net = caffe.Net('path/to/conv.prototxt', caffe.TEST)

net.params['conv_1'][0].data[...] = conv_w1
net.params['conv_1'][1].data[...] = bias_1

...

net.save('modelfromtf.caffemodel')

Note 1: This code has NOT been tested. I am not sure if it will work, but I think it should. Also, this is for one conv layer only. In practice, you have to first analyse your TensorFlow checkpoint to check which layer's weights are at which index (print all_vars) and then copy each layer's weights individually.

Note 2: Some automation is possible by iterating over the initial conv layers, since they generally follow a set pattern (conv1->bn1->relu1->conv2->bn2->relu2...).
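That iteration idea could be sketched like this, assuming the checkpoint variables arrive as a flat alternating weights/biases list (the shapes and the conv_N layer names here are invented for illustration, and the Caffe filter transpose from the other answer is applied along the way):

```python
import numpy as np

# Hypothetical flat list of checkpoint arrays in a repeating pattern:
# conv weights, then conv biases, for each layer.
all_vars = [
    np.zeros((3, 3, 3, 64)),   np.zeros(64),    # conv1
    np.zeros((3, 3, 64, 128)), np.zeros(128),   # conv2
]

converted = {}
for i in range(0, len(all_vars), 2):
    layer = 'conv_%d' % (i // 2 + 1)          # name as used in the prototxt
    w, b = all_vars[i], all_vars[i + 1]
    # TF [h, w, depth, n] -> Caffe [n, depth, h, w]; biases copy over unchanged.
    converted[layer] = (w.transpose((3, 2, 0, 1)), b)

print(sorted(converted))  # ['conv_1', 'conv_2']
```

Each entry could then be assigned with net.params[layer][0].data[...] and net.params[layer][1].data[...] as in the code above.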

Note 3: TensorFlow may further split each layer's weights into separate indices. For example, weights and biases are separated for a conv layer, as shown above; likewise gamma, mean, and variance are separated for a batch normalisation layer.

answered Sep 20 '22 by Jayant Agrawal