Unable to save model with tensorflow 2.0.0 beta1

Tags: neural-network, tensorflow, model, keras

I have tried all the options described in the documentation, but none of them let me save my model in TensorFlow 2.0.0 beta1. I also tried upgrading to the (also unstable) TF 2.0 RC, but that broke even the code I had working in beta, so for now I have rolled back to beta.

A minimal reproduction script is included below.

What I have tried:

```
model.save("mymodel.h5") 
```

```
NotImplementedError: Saving the model to HDF5 format requires the model to be a Functional model or a Sequential model. It does not work for subclassed models, because such models are defined via the body of a Python method, which isn't safely serializable. Consider saving to the Tensorflow SavedModel format (by setting save_format="tf") or using save_weights.
```
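The HDF5 error above names `save_weights` as a fallback. A minimal sketch of that weights-only route follows, assuming the forward pass is exposed through `call` so the model can be invoked directly. `TinyAE`, `sample`, and the `ae.weights.h5` file name are illustrative and not taken from the question's code. Only the variables are stored, so restoring requires the class definition.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


class TinyAE(tf.keras.Model):
    """Hypothetical stand-in for the subclassed model in the question."""

    def __init__(self):
        super().__init__()
        self.network = tf.keras.Sequential([
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(50),
            tf.keras.layers.Dense(28 * 28),
            tf.keras.layers.Reshape((28, 28, 1)),
        ])

    def call(self, inputs):
        return self.network(inputs)


# A fixed random batch, used only to create the variables and compare outputs.
sample = np.random.RandomState(0).rand(4, 28, 28, 1).astype("float32")

model = TinyAE()
expected = np.asarray(model(sample))  # the first call creates the variables

# Store the weights only: no architecture and no optimizer state.
weights_path = os.path.join(tempfile.mkdtemp(), "ae.weights.h5")
model.save_weights(weights_path)

# Restoring: rebuild the model from its class, create its variables, then load.
restored = TinyAE()
restored(sample)
restored.load_weights(weights_path)
actual = np.asarray(restored(sample))
```

The `.weights.h5` suffix is a deliberate choice: as far as I know, newer Keras releases expect it for weight files, while older ones only look at the `.h5` ending.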

```
model.save("mymodel", save_format='tf')
```

```
ValueError: Model <__main__.CVAE object at 0x7f1cac2e7c50> cannot be saved because the input shapes have not been set. Usually, input shapes are automatically determined from calling .fit() or .predict(). To manually set the shapes, call model._set_inputs(inputs).
```

```
model._set_inputs(input_sample)
model.save("mymodel", save_format='tf')
```

```
AssertionError: tf.saved_model.save is not supported inside a traced @tf.function. Move the call to the outer eagerly-executed context.
```

This is where I am stuck, because the message gives me no usable hint: I am NOT calling save() from inside a @tf.function. I am already calling it from the outermost scope possible. In fact, the minimal reproduction script below contains no @tf.function at all, and I still get the same error.

So I really have no idea how to save my model. I have tried every option, and each one throws an error without a useful hint.

The minimal reproduction example below works fine if you set save_model=False and it reproduces the error when save_model=True.

Using a subclassed model may seem unnecessary in this simplified auto-encoder example, but my original VAE code adds many custom functions to the class, which is why I need it.

Code:

```
import tensorflow as tf

save_model = True

learning_rate = 1e-4
BATCH_SIZE = 100
TEST_BATCH_SIZE = 10
color_channels = 1
imsize = 28

(train_images, _), (test_images, _) = tf.keras.datasets.mnist.load_data()

# Use small subsets to keep the reproduction fast.
train_images = train_images[:5000]
test_images = test_images[:1000]
train_images = train_images.reshape(-1, imsize, imsize, 1).astype('float32')
test_images = test_images.reshape(-1, imsize, imsize, 1).astype('float32')
train_images /= 255.
test_images /= 255.
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).batch(BATCH_SIZE)
test_dataset = tf.data.Dataset.from_tensor_slices(test_images).batch(TEST_BATCH_SIZE)

class AE(tf.keras.Model):
    def __init__(self):
        super(AE, self).__init__()
        self.network = tf.keras.Sequential([
            tf.keras.layers.InputLayer(input_shape=(imsize, imsize, color_channels)),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(50),
            tf.keras.layers.Dense(imsize**2 * color_channels),
            tf.keras.layers.Reshape(target_shape=(imsize, imsize, color_channels)),
        ])

    def decode(self, inputs):
        # Forward pass through the network.
        logits = self.network(inputs)
        return logits

optimizer = tf.keras.optimizers.Adam(learning_rate)
model = AE()

def compute_loss(data):
    logits = model.decode(data)
    loss = tf.reduce_mean(tf.losses.mean_squared_error(logits, data))
    return loss

def train_step(data):
    with tf.GradientTape() as tape:
        loss = compute_loss(data)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss, 0

def test_step(data):
    loss = compute_loss(data)
    return loss

input_shape_set = False
epoch = 0
epochs = 20
for epoch in range(epochs):
    for train_x in train_dataset:
        train_step(train_x)
    if epoch % 1 == 0:
        # Average the loss over all test batches.
        loss = 0.0
        num_batches = 0
        for test_x in test_dataset:
            loss += test_step(test_x)
            num_batches += 1
        loss /= num_batches
        print("Epoch: {}, Loss: {}".format(epoch, loss))

        if save_model:
            print("Saving model...")
            if not input_shape_set:
                # Why set the input shape manually, and why here:
                # 1. If I do not set the input shape manually:
                #    ValueError: Model <__main__.CVAE object at 0x7f1cac2e7c50>
                #    cannot be saved because the input shapes have not been set.
                #    Usually, input shapes are automatically determined from
                #    calling .fit() or .predict(). To manually set the shapes,
                #    call model._set_inputs(inputs).
                # 2. If I set the input shape manually BEFORE the first actual
                #    train step, I get:
                #    RuntimeError: Attempting to capture an EagerTensor without
                #    building a function.
                model._set_inputs(next(iter(train_dataset)))
                input_shape_set = True
            # Why choose the tf format: model.save('MNIST/Models/model.h5')
            # raises NotImplementedError: Saving the model to HDF5 format
            # requires the model to be a Functional model or a Sequential model.
            # It does not work for subclassed models, because such models are
            # defined via the body of a Python method, which isn't safely
            # serializable. Consider saving to the Tensorflow SavedModel format
            # (by setting save_format="tf") or using save_weights.
            model.save('MNIST/Models/model', save_format='tf')
```

asked Aug 30 '19 01:08

Kristof

1 Answer

I have tried the same minimal reproduction example in tensorflow-gpu 2.0.0-rc0 and the error was more revealing than what the beta version gave me. The error in RC says:

```
NotImplementedError: When subclassing the Model class, you should implement a call method.
```

This led me to read through https://www.tensorflow.org/beta/guide/keras/custom_layers_and_models where I found examples of how to subclass a model in TF2 in a way that allows saving. Replacing my 'decode' method with 'call' in the example above resolved the error, and the model was saved (this will be more complicated in my actual code, where the class defines several methods). The fix worked in both beta and RC. Strangely, the training (or the saving) also became much faster in RC.
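For a class with several methods, one option is to move the forward pass into `call` and keep the old method name as a thin alias. The sketch below is only an illustration under assumptions: `export_saved_model` is a hypothetical helper, and its `save_format="tf"` argument reflects the TF 2.0-era `tf.keras` API used in this thread, which newer Keras releases may not accept. The helper calls `predict` first because the ValueError quoted in the question says input shapes are normally set that way.

```python
import numpy as np
import tensorflow as tf

IMSIZE = 28
COLOR_CHANNELS = 1


class AE(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.network = tf.keras.Sequential([
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(50),
            tf.keras.layers.Dense(IMSIZE ** 2 * COLOR_CHANNELS),
            tf.keras.layers.Reshape((IMSIZE, IMSIZE, COLOR_CHANNELS)),
        ])

    def call(self, inputs):
        # The forward pass lives in call(), which Keras expects subclasses to define.
        return self.network(inputs)

    def decode(self, inputs):
        # Old method name kept as a thin alias so existing callers keep working.
        return self(inputs)


def export_saved_model(model, sample_batch, export_dir):
    """Hypothetical helper; assumes the TF 2.0-era tf.keras saving API."""
    model.predict(sample_batch)  # lets Keras record the input shapes
    model.save(export_dir, save_format="tf")


model = AE()
sample = np.random.RandomState(0).rand(2, IMSIZE, IMSIZE, COLOR_CHANNELS).astype("float32")
via_call = np.asarray(model(sample))
via_decode = np.asarray(model.decode(sample))
```

With this layout, `compute_loss` from the question could keep calling `model.decode(data)`, and the export would be something like `export_saved_model(model, sample, 'MNIST/Models/model')` from the outermost scope of the script.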

answered Nov 15 '22 09:11

Kristof

