Essentially, I am training an LSTM model with Keras, but when I save it the file is up to 100 MB. My goal is to deploy the model to a web server and serve it as an API, but the server cannot run it because the model is too big. After analyzing the parameters of my model, I found that it has 20,000,000 parameters, of which 15,000,000 are untrained because they belong to the pre-trained word embeddings. Is there any way to shrink the model by removing those 15,000,000 parameters while still preserving its performance?
Here is my code for the model:
def LSTModel(input_shape, word_to_vec_map, word_to_index):
    sentence_indices = Input(input_shape, dtype="int32")
    embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)
    embeddings = embedding_layer(sentence_indices)

    X = LSTM(256, return_sequences=True)(embeddings)
    X = Dropout(0.5)(X)
    X = LSTM(256, return_sequences=False)(X)
    X = Dropout(0.5)(X)
    X = Dense(NUM_OF_LABELS)(X)
    X = Activation("softmax")(X)

    model = Model(inputs=sentence_indices, outputs=X)
    return model
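(For reference, one way to confirm that the embedding layer accounts for most of the parameters is to check the per-layer counts. The sizes below are hypothetical, e.g. a 50,000-word vocabulary with 300-dimensional vectors already gives 15,000,000 embedding weights; MAX_LEN is a placeholder for the sequence length:)

# Hypothetical sizes -- adjust to your own vocabulary and embedding dimension.
vocab_size = 50000      # number of words in word_to_index
embedding_dim = 300     # dimensionality of the pre-trained vectors
print(vocab_size * embedding_dim)   # 15,000,000 frozen embedding weights

model = LSTModel((MAX_LEN,), word_to_vec_map, word_to_index)
model.summary()                     # per-layer parameter counts
for layer in model.layers:
    print(layer.name, layer.count_params(), layer.trainable)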
LSTM default return value: the output is a 2D array of real numbers, whose second dimension is the dimensionality of the output space, i.e. the units parameter of the Keras LSTM layer.
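As a minimal sketch (with hypothetical batch, timestep, and feature sizes), you can see the two output shapes directly:

import numpy as np
from keras.layers import Input, LSTM
from keras.models import Model

inp = Input((10, 32))                         # 10 timesteps, 32 features
last_only = LSTM(256)(inp)                    # return_sequences=False (default)
full_seq = LSTM(256, return_sequences=True)(inp)

m = Model(inp, [last_only, full_seq])
out1, out2 = m.predict(np.zeros((4, 10, 32)))
print(out1.shape)   # (4, 256)      -- 2D: (batch, units)
print(out2.shape)   # (4, 10, 256)  -- 3D: (batch, timesteps, units)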
A vanilla LSTM network has three layers: an input layer, a single hidden LSTM layer, and a standard feedforward output layer.
For convolutional models, a good way to drastically lower the parameter count is to add subsample=(2, 2) to the convolutional layers above the Flatten layer (careful: it lowers the resolution of the images/data). If subsample is not accepted by your Keras version, the argument is called strides=(2, 2).
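For illustration only (this applies to convolutional models, not to the LSTM above), a rough sketch of the Keras 2 syntax with hypothetical layer sizes:

from keras.layers import Conv2D, Flatten, Dense
from keras.models import Sequential

# Strided convolutions halve the spatial resolution at each layer,
# which shrinks the Flatten output and the Dense layer that follows it.
model = Sequential([
    Conv2D(32, (3, 3), strides=(2, 2), activation="relu", input_shape=(64, 64, 3)),
    Conv2D(64, (3, 3), strides=(2, 2), activation="relu"),
    Flatten(),
    Dense(10, activation="softmax"),
])
model.summary()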
Define the layers you want to save outside the function and give them names. Then create two functions, foo() and bar(). foo() will contain the original pipeline, including the embedding layer; bar() will contain only the part of the pipeline AFTER the embedding layer. Instead of the embedding layer, you define a new Input() layer in bar() with the dimensions of your embeddings:
lstm1 = LSTM(256, return_sequences=True, name='lstm1')
lstm2 = LSTM(256, return_sequences=False, name='lstm2')
dense = Dense(NUM_OF_LABELS, name='Susie Dense')

def foo(input_shape, word_to_vec_map, word_to_index):
    sentence_indices = Input(input_shape, dtype="int32")
    embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)
    embeddings = embedding_layer(sentence_indices)

    X = lstm1(embeddings)
    X = Dropout(0.5)(X)
    X = lstm2(X)
    X = Dropout(0.5)(X)
    X = dense(X)
    X = Activation("softmax")(X)

    return Model(inputs=sentence_indices, outputs=X)
def bar(embedding_shape):
    embeddings = Input(embedding_shape, dtype="float32")

    X = lstm1(embeddings)
    X = Dropout(0.5)(X)
    X = lstm2(X)
    X = Dropout(0.5)(X)
    X = dense(X)
    X = Activation("softmax")(X)

    return Model(inputs=embeddings, outputs=X)
foo_model = foo(...)
bar_model = bar(...)
foo_model.fit(...)
bar_model.save_weights(...)
Now you train the original foo() model and then save the weights of the reduced bar() model. Because the two models share the same layer objects, bar_model already holds the trained LSTM and Dense weights, just without the embedding layer. When loading these weights into a model whose architecture differs (for example back into foo_model), don't forget the by_name=True parameter:

foo_model.load_weights('bar_model.h5', by_name=True)
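For completeness, here is a hedged sketch of what serving the reduced model could look like: the embedding lookup happens outside the model, using the same word_to_vec_map as during training, and only bar()'s weights are shipped to the server. MAX_LEN, EMBEDDING_DIM, and embed_sentence are hypothetical names for illustration:

import numpy as np

# Assumed constants: MAX_LEN timesteps and EMBEDDING_DIM-dimensional vectors.
serving_model = bar((MAX_LEN, EMBEDDING_DIM))
serving_model.load_weights('bar_model.h5', by_name=True)

def embed_sentence(words):
    # Look up each word's pre-trained vector outside the model,
    # padding/truncating to MAX_LEN timesteps.
    vectors = np.zeros((MAX_LEN, EMBEDDING_DIM))
    for i, w in enumerate(words[:MAX_LEN]):
        if w in word_to_vec_map:
            vectors[i] = word_to_vec_map[w]
    return vectors

batch = np.expand_dims(embed_sentence("this movie was great".split()), axis=0)
probs = serving_model.predict(batch)   # shape (1, NUM_OF_LABELS)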