I'm working on a reinforcement learning model implemented with Keras and TensorFlow. I have to make frequent calls to model.predict() on single inputs.
While testing inference on a simple pretrained model, I noticed that Keras' model.predict is WAY slower than just using NumPy on the stored weights. Why is it so slow, and how can I speed it up? Using pure NumPy is not viable for complex models.
import timeit
import numpy as np
from tensorflow.keras.models import Sequential  # public API path instead of the private tensorflow.python package
from tensorflow.keras.layers import Dense
w = np.array([[-1., 1., 0., 0.], [0., 0., -1., 1.]]).T
b = np.array([ 15., -15., -21., 21.])
model = Sequential()
model.add(Dense(4, input_dim=2, activation='linear'))
model.layers[0].set_weights([w.T, b])
model.compile(loss='mse', optimizer='adam')
state = np.array([-23.5, 17.8])
def predict_very_slow():
    return model.predict(state[np.newaxis])[0]

def predict_slow():
    ws = model.layers[0].get_weights()
    return np.matmul(ws[0].T, state) + ws[1]

def predict_fast():
    return np.matmul(w, state) + b

print(
    timeit.timeit(predict_very_slow, number=10000),
    timeit.timeit(predict_slow, number=10000),
    timeit.timeit(predict_fast, number=10000)
)
# 5.168972805004538 1.6963867129435828 0.021918574168087623
# 5.461319456664639 1.5491559107269515 0.021502970783442876
These timings show that Keras carries noticeable per-call overhead compared with working on the raw weights. Keras is popular for its convenience, but it is written mostly in Python, and on a single small input that overhead dominates the actual computation.
Because Keras puts an additional layer of abstraction between the user and TensorFlow, every call through it costs extra time. PyTorch is often reported to be faster for this kind of workload, though that depends on the model and the setup.
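One way to sidestep that abstraction overhead for a small dense model, in the spirit of the question's predict_fast, is to read the weights once and reuse them for every call. The following is only a minimal sketch under assumptions: the layers are plain Dense layers with linear activation, and `forward` and `cached_layers` are hypothetical names, not part of Keras.

```python
import numpy as np

def forward(layers, x):
    """Apply a stack of linear Dense layers to a single input vector x.

    layers: list of (kernel, bias) pairs, with kernel shaped (inputs, units),
    which is the layout Keras Dense layers use.
    """
    for kernel, bias in layers:
        x = x @ kernel + bias
    return x

# Same weights as in the question. In practice they would be read once,
# e.g. from layer.get_weights(), and then reused across calls.
kernel = np.array([[-1., 1., 0., 0.],
                   [0., 0., -1., 1.]])  # shape (2, 4)
bias = np.array([15., -15., -21., 21.])
cached_layers = [(kernel, bias)]

state = np.array([-23.5, 17.8])
result = forward(cached_layers, state)
print(result)
```

The point of the sketch is that the weights are fetched outside the hot loop; the question's predict_slow pays for get_weights() on every call.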
A little late, but maybe useful for someone:
Replace model.predict(X)
with model.predict(X, batch_size=len(X))
That should do it: by default predict splits X into batches of 32, so this runs the whole input as a single batch.
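As a rough sketch of this suggestion, assuming TensorFlow 2.x with its bundled Keras: the snippet calls predict with batch_size=len(X) and, for comparison, calls the model object directly, which the Keras documentation recommends for small inputs that fit in one batch. The model and weights mirror the question; the variable names are illustrative only.

```python
import numpy as np
from tensorflow import keras

# Tiny linear model carrying the weights from the question.
model = keras.Sequential([keras.Input(shape=(2,)), keras.layers.Dense(4)])
kernel = np.array([[-1., 1., 0., 0.], [0., 0., -1., 1.]], dtype="float32")
bias = np.array([15., -15., -21., 21.], dtype="float32")
model.layers[0].set_weights([kernel, bias])

X = np.array([[-23.5, 17.8]], dtype="float32")  # a batch holding one state

# Suggested call: one batch that covers the whole input.
via_predict = model.predict(X, batch_size=len(X), verbose=0)

# Alternative for small inputs: call the model directly and skip the
# batching machinery that predict() sets up on every call.
via_call = model(X, training=False).numpy()
```

With a single sample there is only one batch either way, so the batch_size change matters mostly when X has more than 32 rows. Timing both variants with timeit on the real model is the reliable way to choose between them.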
Are you running your Keras model (with TensorFlow backend) in a loop? If so, Keras has a memory leak issue identified here: LINK
In this case you have to import the following:
import keras.backend.tensorflow_backend
import tensorflow as tf
from keras.backend import clear_session
Finally, put the following at the end of every iteration of the loop, once your computations are done:
clear_session()
if keras.backend.tensorflow_backend._SESSION:
    tf.reset_default_graph()
    keras.backend.tensorflow_backend._SESSION.close()
    keras.backend.tensorflow_backend._SESSION = None
This should help you free up memory at the end of every iteration and, eventually, make the process faster. I hope this helps.
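The cleanup code in this answer targets standalone Keras on TensorFlow 1.x and reaches into a private _SESSION attribute. As a hedged sketch for TensorFlow 2.x, where tf.reset_default_graph() is no longer available at the top level, the public clear_session() call is usually the equivalent step. The `build_model` helper and the loop body are hypothetical placeholders for whatever the real loop does.

```python
import numpy as np
from tensorflow import keras

def build_model():
    # Hypothetical stand-in for whatever model the loop creates.
    return keras.Sequential([keras.Input(shape=(2,)), keras.layers.Dense(4)])

x = np.array([[-23.5, 17.8]], dtype="float32")
outputs = []

for _ in range(3):
    model = build_model()
    outputs.append(model(x, training=False).numpy())

    # Drop the reference and reset Keras's global state so that graphs and
    # layer bookkeeping from this iteration do not pile up.
    del model
    keras.backend.clear_session()
```

This only matters when models are created inside the loop. If a single model is built once and reused, there is nothing to clear between iterations.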