Say that I have trained a Tensorflow Estimator:
estimator = tf.contrib.learn.Estimator(
    model_fn=model_fn,
    model_dir=MODEL_DIR,
    config=some_config)
And I fit it to some training data:
estimator.fit(input_fn=input_fn_train, steps=None)
The idea is that the model is fit and saved to MODEL_DIR. This folder contains a checkpoint and several .meta and .index files.
This works perfectly. Now I want to make some predictions using my functions:
estimator = tf.contrib.learn.Estimator(
    model_fn=model_fn,
    model_dir=MODEL_DIR,
    config=some_config)
predictions = estimator.predict(input_fn=input_fn_test)
My solution works, but it has one big disadvantage: you need to know model_fn, which is my model defined in Python. If I change the model, for example by adding a dense layer in my Python code, it no longer matches the weights saved in MODEL_DIR and restoring fails with an error:
NotFoundError (see above for traceback): Key xxxx/dense/kernel not found in checkpoint
How do I cope with this? How can I load my model / estimator such that I can make predictions on some new data? How can I load model_fn or the estimator from MODEL_DIR?
Using the save_weights() method saves only the weights of the layers contained in the model. When saving a model with TensorFlow it is advisable to use the save() method, which writes a complete H5 model, rather than save_weights(); that said, weights alone can also be written to an H5 file with save_weights().
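A minimal Keras sketch of that difference, assuming tf.keras is available; the Sequential model and file names below are only placeholders, not the question's model:

import tensorflow as tf

# A simple stand-in model just to show the two saving methods.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# save_weights() writes only the layer weights; you still need the Python
# code that builds the architecture before you can load them again.
model.save_weights('weights.h5')

# save() writes the architecture, weights and optimizer state together,
# so the model can be reloaded without the original model-building code.
model.save('full_model.h5')
restored = tf.keras.models.load_model('full_model.h5')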
Restoring a model's state from a checkpoint only works if the model and checkpoint are compatible. For example, suppose you trained a DNNClassifier Estimator containing two hidden layers, each having 10 nodes:
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=[10, 10],
    n_classes=3,
    model_dir='models/iris')

classifier.train(
    input_fn=lambda: train_input_fn(train_x, train_y, batch_size=100),
    steps=200)
After training (and, therefore, after creating checkpoints in models/iris), imagine that you changed the number of neurons in each hidden layer from 10 to 20 and then attempted to retrain the model:
classifier2 = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=[20, 20],  # Change the number of neurons in the model.
    n_classes=3,
    model_dir='models/iris')

classifier2.train(
    input_fn=lambda: train_input_fn(train_x, train_y, batch_size=100),
    steps=200)
Since the state in the checkpoint is incompatible with the model described in classifier2, retraining fails with the following error:
...
InvalidArgumentError (see above for traceback): tensor_name =
dnn/hiddenlayer_1/bias/t_0/Adagrad; shape in shape_and_slice spec [10]
does not match the shape stored in checkpoint: [20]
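Conversely, re-creating the estimator with the original architecture and the same model_dir restores the checkpointed weights automatically, so you can predict without retraining. A minimal sketch, assuming the same my_feature_columns and an eval_input_fn / test_x comparable to the training helpers above:

# Rebuild the estimator exactly as it was trained ([10, 10] hidden units);
# because model_dir still points at 'models/iris', the latest checkpoint
# is restored automatically when predict() builds the graph.
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=[10, 10],  # must match the checkpointed model
    n_classes=3,
    model_dir='models/iris')

predictions = classifier.predict(
    input_fn=lambda: eval_input_fn(test_x, labels=None, batch_size=100))
for pred in predictions:
    print(pred['class_ids'], pred['probabilities'])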
To run experiments in which you train and compare slightly different versions of a model, save a copy of the code that created each model_dir, possibly by creating a separate git branch for each version. This separation will keep your checkpoints recoverable.
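A lightweight complement to branching is to store the hyperparameters next to the checkpoints so the matching estimator can always be rebuilt. This is only a possible convention, not part of the Estimator API; the params.json file name and the params dict are illustrative:

import json
import os

import tensorflow as tf

MODEL_DIR = 'models/iris'
params = {'hidden_units': [10, 10], 'n_classes': 3}

# Write the architecture description next to the checkpoints...
os.makedirs(MODEL_DIR, exist_ok=True)
with open(os.path.join(MODEL_DIR, 'params.json'), 'w') as f:
    json.dump(params, f)

# ...and read it back whenever the estimator has to be re-created, so the
# layer sizes always match what the checkpoint was trained with.
with open(os.path.join(MODEL_DIR, 'params.json')) as f:
    params = json.load(f)

classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,  # assumed defined as in the example above
    hidden_units=params['hidden_units'],
    n_classes=params['n_classes'],
    model_dir=MODEL_DIR)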
This is copied from the TensorFlow checkpoints documentation: https://www.tensorflow.org/get_started/checkpoints. I hope that helps.