I implemented a classification program using Keras. I have a large set of images and I would like to run a prediction on each image in a for loop.
However, every time a new image is processed the swap memory grows. I tried to delete all the variables inside the predict function (and I am sure the problem is inside that function), but the memory still increases.
for img in images:
    predict(img, model, categ_par, gl_par)
and the corresponding function:
import cv2
import numpy as np
import pandas as pd
from keras.preprocessing.image import load_img, img_to_array
from keras import applications
from keras.models import Sequential
from keras.layers import Flatten, Dense, Dropout

def predict(image_path, model, categ_par, gl_par):
    print("[INFO] loading and preprocessing image...")
    orig = cv2.imread(image_path)

    image = load_img(image_path, target_size=(gl_par.img_width, gl_par.img_height))
    image = img_to_array(image)

    # important! otherwise the predictions will be '0'
    image = image / 255
    image = np.expand_dims(image, axis=0)

    # build the pre-trained network selected in categ_par.method
    if categ_par.method == 'VGG16':
        model = applications.VGG16(include_top=False, weights='imagenet')
    if categ_par.method == 'InceptionV3':
        model = applications.InceptionV3(include_top=False, weights='imagenet')

    # get the bottleneck prediction from the pre-trained model
    bottleneck_prediction = model.predict(image)

    # build the top model
    model = Sequential()
    model.add(Flatten(input_shape=bottleneck_prediction.shape[1:]))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(categ_par.n_class, activation='softmax'))
    model.load_weights(categ_par.top_model_weights_path)

    # use the bottleneck prediction on the top model to get the final classification
    class_predicted = model.predict_classes(bottleneck_prediction)
    probability_predicted = model.predict_proba(bottleneck_prediction)

    classe = pd.DataFrame(list(zip(categ_par.class_indices.keys(),
                                   list(probability_predicted[0])))).\
        rename(columns={0: 'type', 1: 'prob'}).reset_index(drop=True)

    del model
    del bottleneck_prediction
    del image
    del orig
    del class_predicted
    del probability_predicted

    return classe.set_index(['type']).T
If you are using the TensorFlow backend, you are building a new model for every img in the for loop. TensorFlow just keeps appending graph onto graph, which means memory just rises. This is a well-known occurrence that must be dealt with during hyperparameter optimization, when you build many models in a row, but it applies here as well. Add this import:
from keras import backend as K
and put this at the end of predict():
K.clear_session()
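A minimal, self-contained sketch of the pattern, using a tiny stand-in model instead of the VGG16/top-model pair so it runs quickly (predict_once and its toy layer sizes are illustrative, not from the question):

import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

def predict_once(x):
    # Rebuild a model per call, as the question does; without the
    # clear_session() below, every call would append more nodes to
    # the same TensorFlow graph and memory would keep growing.
    model = Sequential()
    model.add(Dense(2, activation='softmax', input_shape=(4,)))
    probs = model.predict(x)
    K.clear_session()  # drop the whole graph before the next call
    return probs

for _ in range(10):
    predict_once(np.random.rand(1, 4))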
Or you can just build the models once and pass them as inputs to the predict function, so you are not constructing new ones on each call, as in the sketch below.
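A sketch of that second approach under the question's assumptions (categ_par, gl_par and images as defined above, and top-model weights trained for this Flatten shape); the predict signature is adjusted to take both models, and no model is constructed inside the loop:

import numpy as np
from keras.preprocessing.image import load_img, img_to_array
from keras import applications
from keras.models import Sequential
from keras.layers import Flatten, Dense, Dropout

# Build both models once, up front.
base_model = applications.VGG16(include_top=False, weights='imagenet',
                                input_shape=(gl_par.img_width, gl_par.img_height, 3))
top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(categ_par.n_class, activation='softmax'))
top_model.load_weights(categ_par.top_model_weights_path)

def predict(image_path, base_model, top_model, gl_par):
    # No model construction here: the graph stays the same size.
    image = load_img(image_path, target_size=(gl_par.img_width, gl_par.img_height))
    image = np.expand_dims(img_to_array(image) / 255, axis=0)
    bottleneck = base_model.predict(image)
    return top_model.predict(bottleneck)

for img in images:
    predict(img, base_model, top_model, gl_par)

Because the graph no longer grows, there is no need to call K.clear_session() inside the loop in this version.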