Resnet50 produces different prediction when image loading and resizing is done with OpenCV

Question

I want to use the Keras ResNet50 model with OpenCV for reading and resizing the input image. I'm using the same preprocessing code as Keras (with OpenCV I need to convert to RGB, since that is the format expected by preprocess_input()). I get different predictions with OpenCV and with Keras image loading, and I don't understand why they are not the same.

Here is my code:

import numpy as np
import json
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
import cv2

model = ResNet50(weights='imagenet')

img_path = '/home/me/squirle.jpg'

# Keras prediction
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
print('Predicted Keras:', decode_predictions(preds, top=3)[0])

# OpenCV prediction
imgcv = cv2.imread(img_path)
dim = (224, 224)
imgcv_resized = cv2.resize(imgcv, dim, interpolation=cv2.INTER_LINEAR)
x = cv2.cvtColor(imgcv_resized , cv2.COLOR_BGR2RGB)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
print('Predicted OpenCV:', decode_predictions(preds, top=3)[0])

Predicted Keras: [('n02490219', 'marmoset', 0.28250763), ('n02356798', 'fox_squirrel', 0.25657368), ('n02494079', 'squirrel_monkey', 0.19992349)]
Predicted OpenCV: [('n02356798', 'fox_squirrel', 0.5161952), ('n02490219', 'marmoset', 0.21953616), ('n02494079', 'squirrel_monkey', 0.1160824)]

How can I use OpenCV imread() and resize() to get the same prediction as Keras image loading?

Timbus Calin · Accepted Answer

# Keras prediction
img = image.load_img(img_path, target_size=(224, 224))

# OpenCV prediction
imgcv = cv2.imread(img_path)
dim = (224, 224)
imgcv_resized = cv2.resize(imgcv, dim, interpolation=cv2.INTER_LINEAR)

If you look attentively, the interpolation you specify in the case of cv2 is cv2.INTER_LINEAR (bilinear interpolation); however, by default, image.load_img() uses an INTER_NEAREST interpolation method.
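
To see why the interpolation method alone can change the network input, here is a minimal NumPy sketch that downsamples a tiny 4x4 grayscale image with both methods. The functions resize_nearest and resize_bilinear are hypothetical illustrations, not the code that Pillow or OpenCV actually run, and the pixel-centre coordinate mapping is an assumption; real libraries may map coordinates slightly differently.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize: each output pixel copies one source pixel."""
    in_h, in_w = img.shape
    # Map each output pixel centre to a source index (assumed centre alignment).
    rows = np.minimum(((np.arange(out_h) + 0.5) * in_h / out_h).astype(int), in_h - 1)
    cols = np.minimum(((np.arange(out_w) + 0.5) * in_w / out_w).astype(int), in_w - 1)
    return img[rows][:, cols]

def resize_bilinear(img, out_h, out_w):
    """Bilinear resize: each output pixel blends its four nearest source pixels."""
    in_h, in_w = img.shape
    # Fractional source coordinates of the output pixel centres, clipped to the image.
    y = np.clip((np.arange(out_h) + 0.5) * in_h / out_h - 0.5, 0, in_h - 1)
    x = np.clip((np.arange(out_w) + 0.5) * in_w / out_w - 0.5, 0, in_w - 1)
    y0 = np.floor(y).astype(int)
    x0 = np.floor(x).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (y - y0)[:, None]  # vertical blend weights
    wx = (x - x0)[None, :]  # horizontal blend weights
    img = img.astype(np.float32)
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

src = np.array([[0, 10, 20, 30],
                [40, 50, 60, 70],
                [80, 90, 100, 110],
                [120, 130, 140, 150]], dtype=np.uint8)

nearest = resize_nearest(src, 2, 2)
bilinear = resize_bilinear(src, 2, 2)
print(nearest)
print(bilinear)
```

In the question's code, the simplest option is probably to make both paths use the same method, for example image.load_img(img_path, target_size=(224, 224), interpolation='bilinear') or cv2.resize(imgcv, dim, interpolation=cv2.INTER_NEAREST). Even then the two libraries are not guaranteed to agree pixel for pixel, so small differences may remain.
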

The dtype differs as well. In img_to_array(img), the dtype argument defaults to None, in which case the global setting tf.keras.backend.floatx() is used (unless you changed it, it defaults to "float32").

Therefore, img_to_array(img) gives you an image made of float32 values, while cv2.imread(img_path) returns a NumPy array of uint8 values.
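
As a general illustration of why the dtype matters, the sketch below uses made-up pixel values and a made-up offset of 104 (both assumptions, not values taken from preprocess_input). It shows that arithmetic kept in uint8 wraps around, while casting to float32 first keeps the values exact. Whether preprocess_input performs this cast itself may depend on the Keras version, so an explicit cast is a cheap safeguard.

```python
import numpy as np

# Stand-in for three pixel values as cv2.imread would return them (uint8).
pixels_uint8 = np.array([233, 12, 0], dtype=np.uint8)
# Illustrative offset, similar in spirit to a per-channel mean subtraction.
offset_uint8 = np.full_like(pixels_uint8, 104)

# Arithmetic that stays in uint8 wraps around modulo 256 instead of going negative.
wrapped = pixels_uint8 - offset_uint8

# Casting to float32 first keeps every value exact and allows negative results.
pixels_float = pixels_uint8.astype(np.float32)
centred = pixels_float - offset_uint8.astype(np.float32)

print(wrapped.dtype, wrapped)
print(centred.dtype, centred)
```
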

Ensure you convert to RGB from BGR, as OpenCV loads directly into BGR format. You can use image = image[:,:,::-1] or image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB); otherwise you will have the R and B channels reversed resulting in an incorrect comparison.
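
The channel swap and the float cast can be bundled into one small function. This is only a sketch: bgr_uint8_to_rgb_batch is a hypothetical helper name, and the synthetic 2x2 image stands in for the output of cv2.resize.

```python
import numpy as np

def bgr_uint8_to_rgb_batch(bgr):
    """Hypothetical helper: BGR uint8 image (H, W, 3) -> RGB float32 batch (1, H, W, 3)."""
    if bgr.ndim != 3 or bgr.shape[2] != 3:
        raise ValueError("expected an array of shape (H, W, 3)")
    rgb = bgr[:, :, ::-1]              # reverse the channel axis: BGR -> RGB
    rgb = rgb.astype(np.float32)       # match the dtype produced by img_to_array
    return np.expand_dims(rgb, axis=0) # add the batch dimension

# Synthetic image that is pure blue in BGR order (channel 0 is blue).
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255

batch = bgr_uint8_to_rgb_batch(bgr)
print(batch.shape, batch.dtype, batch[0, 0, 0])
```

The result can then be passed to preprocess_input() in the same way as the Keras-loaded array.
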

Since the preprocessing that you apply is the same in both cases, the only differences are the ones I mentioned above; applying those changes should ensure reproducibility.

There is one more observation I would like to make: since cv2 loads pixel values as integers (uint8) rather than floats, the correct way to compare the two inputs is to cast the first array (Keras) to uint8. Casting the cv2 array to float32 instead cannot recover information that is already lost. For example, with cv2 you load to uint8, and by casting you get 233.0 instead of 233. However, the initial pixel value may have been 233.3, and the fractional part was lost in the first conversion.
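
A quick way to check whether the two loading paths really feed the same pixels to the model is to compare the arrays directly, before preprocess_input() is applied. The sketch below uses small made-up arrays in place of real images, and compare_inputs is a hypothetical helper name.

```python
import numpy as np

def compare_inputs(keras_array, cv_array):
    """Hypothetical helper: cast the float array to uint8 and compare element-wise.

    Returns the number of differing pixels and the largest absolute difference.
    """
    keras_uint8 = keras_array.astype(np.uint8)
    differing = int(np.count_nonzero(keras_uint8 != cv_array))
    # Widen to int16 so the subtraction cannot wrap around.
    diff = np.abs(keras_uint8.astype(np.int16) - cv_array.astype(np.int16))
    return differing, int(diff.max())

# Made-up stand-ins for the Keras (float32) and OpenCV (uint8) arrays.
keras_like = np.array([[233.0, 12.0], [0.0, 255.0]], dtype=np.float32)
cv_like = np.array([[233, 12], [0, 254]], dtype=np.uint8)

differing, max_diff = compare_inputs(keras_like, cv_like)
print(differing, max_diff)

# A float-to-uint8 cast drops the fractional part, as described above.
truncated = np.array([233.3], dtype=np.float32).astype(np.uint8)
print(truncated)
```

If the count of differing pixels is not zero, the remaining mismatch comes from loading or resizing rather than from the model.
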

Tags: python, opencv, machine-learning, tensorflow, keras

Joe
