Following the upgrade to Keras 2.0.9, I have been using the multi_gpu_model
utility but I can't save my models or best weights using
model.save('path')
The error I get is
TypeError: can't pickle module objects
I suspect there is some problem gaining access to the model object. Is there a workaround for this issue?
The SavedModel format is the default when you use model.save(). You can switch to the H5 format by passing save_format='h5' to save(), or by passing a filename that ends in .h5 or .keras to save().
When saving a full model with TensorFlow, it is advisable to use the save() method rather than save_weights(). That said, weights alone can also be written to an H5 file with save_weights(); the location, together with the weights file name, is passed as a parameter to this method.
The model will be saved in the current directory, and saving the same model again under the same name will overwrite the old file.
model.save() stores the whole architecture, the weights and the optimizer state; this command saves everything needed to reconstitute your model.
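As a minimal sketch of those options (assuming the TF 2.x tf.keras API, where the save_format argument exists, and a toy model standing in for yours; this is not the old Keras 2.0.9 API from the question):

import tensorflow as tf

# A toy model; substitute your own architecture.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer='adam', loss='mse')

# Force the H5 format either via the file extension or the argument.
model.save('my_model.h5')                     # .h5 extension selects H5
model.save('my_model_h5', save_format='h5')   # explicit save_format

# save_weights() writes only the weights to the given location/name.
model.save_weights('my_weights.h5')

# load_model() restores architecture, weights and optimizer state.
restored = tf.keras.models.load_model('my_model.h5')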
As an example, consider saving a Keras model in the SavedModel format, which is quite comprehensive: it stores the various components of the model, including its weights, the subgraphs of its call functions, and its architecture.
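A short sketch of that (again assuming the TF 2.x tf.keras API; the directory name is arbitrary):

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer='adam', loss='mse')

# With no .h5 extension, TF 2.x defaults to the SavedModel format: a
# directory containing the architecture, weights, optimizer state and
# the traced call graphs.
model.save('saved_model_dir')

restored = tf.keras.models.load_model('saved_model_dir')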
Here's how it works:
1. Instantiate a MirroredStrategy, optionally configuring which specific devices you want to use (by default the strategy will use all GPUs available).
2. Use the strategy object to open a scope, and within this scope, create all the Keras objects you need that contain variables.
3. Train the model via fit() as usual.
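A minimal sketch of those three steps (assuming TF 2.x; the device names and toy model are illustrative):

import tensorflow as tf

# 1. Uses all visible GPUs by default; a device list can also be passed,
#    e.g. tf.distribute.MirroredStrategy(devices=['/gpu:0', '/gpu:1']).
strategy = tf.distribute.MirroredStrategy()

# 2. Everything that creates variables (model, optimizer, metrics) goes
#    inside the strategy scope.
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    model.compile(optimizer='adam', loss='mse')

# 3. Train and save as usual; with this API, model.save() needs no
#    multi-GPU specific workaround.
# model.fit(x_train, y_train, epochs=2)
# model.save('mirrored_model.h5')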
From the guide to multi-GPU & distributed training for Keras models: there are generally two ways to distribute computation across multiple devices. One is data parallelism, where a single model gets replicated on multiple devices or multiple machines; each replica processes different batches of data, and then the results are merged. (The other is model parallelism, where different parts of a single model run on different devices.)
I did a little digging in the Keras GitHub, and it seems that when the call to fit_generator is made, the model attached to the callback is set to the model making the call to fit_generator. So even if the correct model is set beforehand when creating the callback, it will be overwritten by the multi-GPU one.
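(One hedged sketch of working around that overwriting: a custom callback that keeps its own reference to the single-GPU template model and saves that instead of self.model. The class and argument names here are illustrative, not from Keras itself.)

from keras.callbacks import Callback

class SaveTemplateModel(Callback):
    """Saves the single-GPU template model, ignoring whatever model
    Keras attaches to the callback during fit/fit_generator."""

    def __init__(self, template_model, path):
        super(SaveTemplateModel, self).__init__()
        self.template_model = template_model   # our own reference
        self.path = path

    def on_epoch_end(self, epoch, logs=None):
        # self.model would be the multi-GPU wrapper here; save the
        # template captured at construction time instead.
        self.template_model.save(self.path)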
To be honest, though, the easiest approach is to actually examine the multi-GPU parallel model using
parallel_model.summary()
(The parallel model is simply the model after applying the multi_gpu_model function.) This clearly shows the actual model (in, I think, the penultimate layer; I am not at my computer right now). Then you can use the name of this layer to save the model.
model = parallel_model.get_layer('sequential_1')
Often it's called 'sequential_1', but if you are using a published architecture it may be 'googlenet' or 'alexnet'. You will see the layer's name in the summary.
Then it's simple to just save:
model.save()
Maxim's approach (the patched version below) works, but I think it's overkill.
Remark: you will need to compile both the model and the parallel model.
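Putting this answer together, a minimal sketch (assuming two GPUs, a toy Sequential model explicitly named 'sequential_1', and whatever import path multi_gpu_model has in your Keras version):

from keras.models import Sequential
from keras.layers import Dense
# Depending on the Keras version this may be keras.utils or
# keras.utils.training_utils.
from keras.utils import multi_gpu_model

# Single-GPU "template" model; the explicit name makes get_layer() easy.
model = Sequential(name='sequential_1')
model.add(Dense(1, input_shape=(4,)))

parallel_model = multi_gpu_model(model, gpus=2)

# Compile both models, as noted above.
model.compile(optimizer='adam', loss='mse')
parallel_model.compile(optimizer='adam', loss='mse')

# ... train with parallel_model.fit(...) ...

parallel_model.summary()                          # shows the wrapped layer's name
inner_model = parallel_model.get_layer('sequential_1')
inner_model.save('template_model.h5')             # save the single-GPU model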
Here's a patched version that doesn't fail while saving:
from keras.layers import Lambda, concatenate
from keras import Model
import tensorflow as tf
def multi_gpu_model(model, gpus):
    if isinstance(gpus, (list, tuple)):
        num_gpus = len(gpus)
        target_gpu_ids = gpus
    else:
        num_gpus = gpus
        target_gpu_ids = range(num_gpus)

    def get_slice(data, i, parts):
        shape = tf.shape(data)
        batch_size = shape[:1]
        input_shape = shape[1:]
        step = batch_size // parts
        if i == num_gpus - 1:
            size = batch_size - step * i
        else:
            size = step
        size = tf.concat([size, input_shape], axis=0)
        stride = tf.concat([step, input_shape * 0], axis=0)
        start = stride * i
        return tf.slice(data, start, size)

    all_outputs = []
    for i in range(len(model.outputs)):
        all_outputs.append([])

    # Place a copy of the model on each GPU,
    # each getting a slice of the inputs.
    for i, gpu_id in enumerate(target_gpu_ids):
        with tf.device('/gpu:%d' % gpu_id):
            with tf.name_scope('replica_%d' % gpu_id):
                inputs = []
                # Retrieve a slice of the input.
                for x in model.inputs:
                    input_shape = tuple(x.get_shape().as_list())[1:]
                    slice_i = Lambda(get_slice,
                                     output_shape=input_shape,
                                     arguments={'i': i,
                                                'parts': num_gpus})(x)
                    inputs.append(slice_i)

                # Apply model on slice
                # (creating a model replica on the target device).
                outputs = model(inputs)
                if not isinstance(outputs, list):
                    outputs = [outputs]

                # Save the outputs for merging back together later.
                for o in range(len(outputs)):
                    all_outputs[o].append(outputs[o])

    # Merge outputs on CPU.
    with tf.device('/cpu:0'):
        merged = []
        for name, outputs in zip(model.output_names, all_outputs):
            merged.append(concatenate(outputs,
                                      axis=0, name=name))
        return Model(model.inputs, merged)
You can use this multi_gpu_model function until the bug is fixed in Keras. Also, when loading the model, it's important to provide the tensorflow module object:
model = load_model('multi_gpu_model.h5', {'tf': tf})
The problem is with the import tensorflow line in the middle of multi_gpu_model:
def multi_gpu_model(model, gpus):
    ...
    import tensorflow as tf
    ...
This creates a closure for the get_slice lambda function, which includes the number of GPUs (that's OK) and the tensorflow module (not OK). Saving the model tries to serialize all layers, including the ones that call get_slice, and it fails exactly because tf is in the closure.
The solution is to move the import out of multi_gpu_model, so that tf becomes a global object, though it is still needed for get_slice to work. This fixes the saving problem, but when loading one has to provide tf explicitly.
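As a tiny illustration of why that closure breaks saving (this is not the exact Keras serialization path, just the root cause behind the "can't pickle module objects" error):

import pickle

def make_fn():
    import tensorflow as tf           # local import: tf becomes a free variable
    def get_slice(data):
        return tf.identity(data)      # references tf -> captured in the closure
    return get_slice

fn = make_fn()
captured = fn.__closure__[0].cell_contents    # the tensorflow module itself
try:
    pickle.dumps(captured)
except TypeError as err:
    print(err)                        # e.g. "cannot pickle 'module' object"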