Load saved checkpoint and predict not producing same results as in training

I'm training a model based on sample code I found on the Internet. The test accuracy is at 92%, and the checkpoints are saved in a directory. In parallel (the training has been running for 3 days now), I want to write my prediction code so I can learn more instead of just waiting.

This is my third day of deep learning so I probably don't know what I'm doing. Here's how I'm trying to predict:

1. Instantiate the model using the same code as in training
2. Load the last checkpoint
3. Try to predict

The code works but the results are nowhere near 90%.

Here's how I create the model:

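# Imports assumed from the Keras 1.x API that this code targets (note the init= argument)
from keras.models import Sequential
from keras.layers import Activation, Dense, Dropout, RepeatVector, TimeDistributed, recurrent

INPUT_LAYERS = 2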
OUTPUT_LAYERS = 2
AMOUNT_OF_DROPOUT = 0.3
HIDDEN_SIZE = 700
INITIALIZATION = "he_normal"  # Gaussian initialization scaled by fan_in (He et al., 2015)
CHARS = list("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ .")

def generate_model(output_len, chars=None):
    """Generate the model"""
    print('Build model...')
    chars = chars or CHARS
    model = Sequential()
    # "Encode" the input sequence using an RNN, producing an output of HIDDEN_SIZE
    # note: in a situation where your input sequences have a variable length,
    # use input_shape=(None, nb_feature).
    for layer_number in range(INPUT_LAYERS):
        model.add(recurrent.LSTM(HIDDEN_SIZE, input_shape=(None, len(chars)), init=INITIALIZATION,
                         return_sequences=layer_number + 1 < INPUT_LAYERS))
        model.add(Dropout(AMOUNT_OF_DROPOUT))
    # For the decoder's input, we repeat the encoded input for each time step
    model.add(RepeatVector(output_len))
    # The decoder RNN could be multiple layers stacked or a single layer
    for _ in range(OUTPUT_LAYERS):
        model.add(recurrent.LSTM(HIDDEN_SIZE, return_sequences=True, init=INITIALIZATION))
        model.add(Dropout(AMOUNT_OF_DROPOUT))

    # For each step of the output sequence, decide which character should be chosen
    model.add(TimeDistributed(Dense(len(chars), init=INITIALIZATION)))
    model.add(Activation('softmax'))

    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

In a separate file predict.py I import this method to create my model and try to predict:

...import code
model = generate_model(len(question), dataset['chars'])
model.load_weights('models/weights.204-0.20.hdf5')

def decode(pred):
    return character_table.decode(pred, calc_argmax=False)


x = np.zeros((1, len(question), len(dataset['chars'])))
for t, char in enumerate(question):
    x[0, t, character_table.char_indices[char]] = 1.

preds = model.predict_classes([x], verbose=0)[0]

print("======================================")
print(decode(preds))

I don't know what the problem is. I have about 90 checkpoints in my directory and I'm loading the last one based on accuracy. All of them were saved by a ModelCheckpoint callback:

checkpoint = ModelCheckpoint(MODEL_CHECKPOINT_DIRECTORYNAME + '/' + MODEL_CHECKPOINT_FILENAME,
                         save_best_only=True)

I'm stuck. What am I doing wrong?

asked Sep 06 '17 12:09

Romeo Mihalcea

1 Answer

In the repo you provided, the training and validation sentences are inverted before being fed into the model (as commonly done in seq2seq learning).

dataset = DataSet(DATASET_FILENAME)

The default value of inverted is True, so the questions are reversed when the dataset is built:

class DataSet(object):
    def __init__(self, dataset_filename, test_set_fraction=0.1, inverted=True):
        self.inverted = inverted

    ...

        question = question[::-1] if self.inverted else question
        questions.append(question)

Try inverting the question in the same way at prediction time. Specifically, write each character to the mirrored time step:

x = np.zeros((1, len(question), len(dataset['chars'])))
for t, char in enumerate(question):
    x[0, len(question) - t - 1, character_table.char_indices[char]] = 1.
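
As a minimal sketch, the inversion can also be wrapped in a small helper so that prediction always goes through the same preprocessing as training. The names encode_question, CHARS and CHAR_INDICES below are hypothetical stand-ins for dataset['chars'] and character_table.char_indices, and only NumPy is assumed:

```python
import numpy as np

# Hypothetical stand-ins for dataset['chars'] and character_table.char_indices.
CHARS = list("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ .")
CHAR_INDICES = {char: index for index, char in enumerate(CHARS)}


def encode_question(question, char_indices, inverted=True):
    """One-hot encode a question, reversing it first when inverted is True."""
    if inverted:
        question = question[::-1]
    x = np.zeros((1, len(question), len(char_indices)))
    for t, char in enumerate(question):
        x[0, t, char_indices[char]] = 1.
    return x


question = "hello world"
x = encode_question(question, CHAR_INDICES)

# The first time step now holds the last character of the original question.
first_char = CHARS[int(x[0, 0].argmax())]
print(first_char)  # prints "d"
```

Reversing the string before encoding gives the same array as writing each character to index len(question) - t - 1, so either form should work as long as it matches what the training code did.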

answered Oct 02 '22 23:10

Yu-Yang

Tags: python, tensorflow, deep-learning, keras
