
Keras: convert pretrained weights between Theano and TensorFlow

I would like to use this pretrained model.

The model uses the Theano layout, but my code depends on the TensorFlow image dimension ordering.

There is a guide on converting weights between the two formats.

But the guide seems broken. In the section on converting from Theano to TensorFlow, the first instruction is to load the Theano weights into the TensorFlow model:

Keras backend should be TensorFlow in this case. First, load the Theano-trained weights into your TensorFlow model:

model.load_weights('my_weights_theano.h5')

This raises an exception, because the weight layouts are incompatible. And if load_weights accepted Theano weights for a TensorFlow model, there would be no need to convert them in the first place.

I took a look at the convert_kernel function to see whether I could do the necessary steps myself.

The code is rather simple - I don't understand why the guide makes use of a tensorflow session. That seems unnecessary.
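For readers without the Keras source at hand, here is a minimal NumPy sketch of the kind of operation being discussed: reversing a kernel along its spatial axes, which needs no TensorFlow session. The helper name `flip_spatial_axes` is hypothetical, and the axis layouts in the comments are assumptions based on the Keras 1.x conventions, not something the guide confirms. The flip is needed because Theano's convolution flips its filters by default, while TensorFlow's `conv2d` computes a cross-correlation.

```python
import numpy as np

def flip_spatial_axes(kernel, dim_ordering):
    """Reverse a conv kernel along its spatial axes (hypothetical helper)."""
    kernel = np.asarray(kernel)
    if kernel.ndim not in (4, 5):
        raise ValueError("expected a 4D or 5D kernel, got ndim=%d" % kernel.ndim)
    flip = slice(None, None, -1)
    keep = slice(None)
    n_spatial = kernel.ndim - 2
    if dim_ordering == "th":
        # Assumed Theano layout: (out_channels, in_channels, *spatial)
        index = (keep, keep) + (flip,) * n_spatial
    elif dim_ordering == "tf":
        # Assumed TensorFlow layout: (*spatial, in_channels, out_channels)
        index = (flip,) * n_spatial + (keep, keep)
    else:
        raise ValueError("dim_ordering must be 'th' or 'tf'")
    return np.copy(kernel[index])

# Toy Theano-style kernel: 2 output channels, 3 input channels, 3x3 window.
kernel_th = np.arange(2 * 3 * 3 * 3, dtype="float32").reshape(2, 3, 3, 3)
flipped = flip_spatial_axes(kernel_th, "th")
```

Flipping only changes values along the spatial axes; it does not reorder the axes themselves, so a separate transpose is still needed to move between the two layouts.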

I copied the code of the pretrained model to create a second model with the TensorFlow layout. This only meant changing the input shape and backend.image_dim_ordering before adding any convolution layers. Then I used this loop:

model is the original model, created from the code I linked at the beginning. model_tensorflow is exactly the same model, but with the TensorFlow layout.

for i in range(len(model.layers)):
    layer_theano=model.layers[i]
    layer_tensorflow=model_tensorflow.layers[i]

    if layer_theano.__class__.__name__ in ['Convolution1D', 'Convolution2D', 'Convolution3D', 'AtrousConvolution2D']:
        weights_theano=layer_theano.get_weights()

        kernel=weights_theano[0]
        bias=weights_theano[1]

        converted_kernel=convert_kernel(kernel, "th")
        converted_kernel=converted_kernel.transpose((3,2,1,0))

        weights_tensorflow=[converted_kernel, bias]

        layer_tensorflow.set_weights(weights_tensorflow)

    else:
        layer_tensorflow.set_weights(layer_theano.get_weights())
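One detail of the loop that can be checked in isolation is the axis permutation. The sketch below assumes the Keras 1.x conventions, where a Theano 2D kernel is stored as (out_channels, in_channels, rows, cols) and a TensorFlow kernel as (rows, cols, in_channels, out_channels); under that assumption, `transpose((3, 2, 1, 0))` also swaps rows and columns, while `transpose((2, 3, 1, 0))` keeps them in order. Whether this explains the differing predictions is not confirmed by anything in this post.

```python
import numpy as np

# Asymmetric spatial size so a rows/cols swap shows up in the shape.
out_ch, in_ch, rows, cols = 4, 3, 5, 7
kernel_th = np.random.rand(out_ch, in_ch, rows, cols).astype("float32")

# Permutation used in the loop above: result is (cols, rows, in, out).
as_in_question = kernel_th.transpose((3, 2, 1, 0))

# Permutation that keeps rows before cols: result is (rows, cols, in, out).
rows_then_cols = kernel_th.transpose((2, 3, 1, 0))

print(as_in_question.shape)   # (7, 5, 3, 4)
print(rows_then_cols.shape)   # (5, 7, 3, 4)

# The two results differ exactly by a swap of the two spatial axes.
same_up_to_swap = np.array_equal(
    as_in_question, rows_then_cols.transpose((1, 0, 2, 3))
)
```

With square kernels both permutations produce the same shape, so set_weights would accept either one without complaint even though every filter would be spatially transposed.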

The original code includes a test case that runs a prediction on an image of a cat. I downloaded a cat image and ran the test case with both models: the original model predicts 285, the converted model predicts 585.

I don't know whether 285 is the correct label for a cat, but even if it isn't, the two models should be broken in the same way, so I would expect the same prediction from both.

What is the correct way of converting weights between models?
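A further possible source of a mismatch, assuming the network flattens a feature map before a fully connected layer (the architecture is not shown in this post): Flatten emits values in a different order for (channels, rows, cols) and (rows, cols, channels) inputs, so copying the first Dense kernel unchanged would pair weights with the wrong inputs. The NumPy sketch below illustrates the row permutation; `reorder_dense_rows_th_to_tf` is a hypothetical helper and all shapes are made up for the example.

```python
import numpy as np

def reorder_dense_rows_th_to_tf(dense_kernel, channels, rows, cols):
    """Hypothetical helper: permute the input rows of a Dense kernel that
    follows Flatten, from (channels, rows, cols) to (rows, cols, channels)."""
    units = dense_kernel.shape[1]
    k = dense_kernel.reshape(channels, rows, cols, units)
    k = k.transpose(1, 2, 0, 3)  # move the channel axis behind rows and cols
    return k.reshape(channels * rows * cols, units)

channels, rows, cols, units = 3, 2, 2, 5

# The same feature map in both layouts.
feature_map_th = np.random.rand(channels, rows, cols)
feature_map_tf = feature_map_th.transpose(1, 2, 0)

# Dense kernel as trained on the channels-first flattening, then reordered.
dense_th = np.random.rand(channels * rows * cols, units)
dense_tf = reorder_dense_rows_th_to_tf(dense_th, channels, rows, cols)

# Flatten followed by Dense (bias omitted) in each layout.
out_th = feature_map_th.reshape(-1).dot(dense_th)
out_tf = feature_map_tf.reshape(-1).dot(dense_tf)
outputs_match = np.allclose(out_th, out_tf)
```

Without the reordering, the two outputs would generally differ even when every convolution kernel has been converted correctly.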

lhk asked Feb 12 '17 21:02


1 Answer

You are right, the code in the guide is broken. For now there is a workaround for this issue, and the solution is described here.

I have tested it myself and it worked for me.

If you feel the answer is useful, please upvote it. Thanks!

Tensorflow Support answered Oct 21 '22 19:10