I'm trying to understand the difference between these two ways of loading images from bytes using PIL vs OpenCV.
def bytes_to_ndarray(bytes):
    bytes_io = bytearray(bytes)
    img = Image.open(BytesIO(bytes_io))
    return np.array(img)
and
img = cv2.imdecode(bytes, cv2.IMREAD_ANYCOLOR)
The problem is that they seem to give different answers for an image that was created using OpenCV. If image is an ndarray, then for
_, bytes = cv2.imencode('.jpg', image)
these two ways will give different outputs, for example for skimage.data.astronaut()
PIL will give:
Whereas OpenCV will return correct image:
OpenCV's cv2.imdecode() function reads encoded image data from a memory buffer and decodes it into an image array. It is typically used when an image arrives as bytes, for example after downloading it from the internet, so it does not have to be written to disk first.
In short: It's just the usual RGB vs. BGR ordering thing - but the combination of how you use OpenCV's imencode and imdecode here with this specific image makes everything totally complicated. ;-)
skimage.data.astronaut() returns an ndarray with RGB ordering, as RGB ordering is the standard in skimage. In contrast, OpenCV internally uses BGR ordering. So, if we used cv2.imread on a saved PNG of this image, we would get an ndarray with BGR ordering. Also, OpenCV always assumes BGR ordered ndarrays for all its operations.
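To make the two orderings concrete, here is a minimal NumPy-only sketch. The tiny 1x2 array is a made-up example (it is not the astronaut image); it just shows that RGB and BGR arrays hold the same data with the channel axis reversed.

```python
import numpy as np

# Hypothetical 1x2 image in RGB order: one pure red pixel, one pure blue pixel.
rgb = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last (channel) axis turns RGB into BGR, and BGR back into RGB.
bgr = rgb[..., ::-1]

print(bgr[0, 0].tolist())  # [0, 0, 255]
print(bgr[0, 1].tolist())  # [255, 0, 0]
```

A library that expects RGB would show the first pixel of bgr as blue, which is exactly the kind of color swap seen in the question.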
Now, you use cv2.imencode to generate the byte stream. As mentioned, OpenCV assumes the ndarray fed to that function has BGR ordering. That's important, because the encoder swaps the first and third channels so that the generated byte stream gets the RGB ordering that image files use (cv2.imencode mimics cv2.imwrite, and OpenCV correctly writes RGB images). Here, however, the input already was RGB ordered, so after that swap the created byte stream actually has a false BGR ordering.
For the decoding, Pillow as well as OpenCV assume an RGB ordered byte stream. Pillow takes the channels as they are, so the ndarray created by the "Pillow way" actually has BGR ordering (which is NOT the Pillow standard). OpenCV swaps the channels once more while decoding, so the ndarray created by OpenCV's imdecode has RGB ordering (which is NOT the OpenCV standard).
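Finally, Matplotlib's (or pyplot's) imshow assumes RGB ordered ndarrays for visualization. So, the following will happen: The ndarray from skimage.data.astronaut() should be displayed correctly (RGB ordered), and so should the PNG loaded with Pillow (RGB ordered). The PNG loaded with OpenCV should be displayed with swapped colors (BGR ordered). The byte stream decoded with Pillow should be displayed with swapped colors (BGR ordered), whereas the byte stream decoded with OpenCV should be displayed correctly (RGB ordered). Let's see: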
import cv2
from io import BytesIO
from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
import skimage
def bytes_to_ndarray(bytes):
    bytes_io = bytearray(bytes)
    img = Image.open(BytesIO(bytes_io))
    return np.array(img)
# skimage returns a ndarray with RGB ordering
img_sk = skimage.data.astronaut()
# Opening a saved PNG file of this image using Pillow returns a ndarray with RGB ordering
img_pil = Image.open('astronaut.png')
# Opening a saved PNG file of this image using OpenCV returns a ndarray with BGR ordering
img_cv = cv2.imread('astronaut.png', cv2.IMREAD_COLOR)
# OpenCV uses BGR ordering, thus OpenCV's encoding treats img_sk[:, :, 0] as blue channel,
# although it's the actual red channel (and the other way around for img_sk[:, :, 2]).
# That means, the encoded byte stream now has BGR ordering!!
_, bytes = cv2.imencode('.png', img_sk)
# OpenCV uses BGR ordering, but OpenCV's decoding assumes an RGB ordered byte stream, so
# the blue and red channels are swapped again here, such that img_byte_cv again is an
# ndarray with RGB ordering!!
img_byte_cv = cv2.imdecode(bytes, cv2.IMREAD_ANYCOLOR)
# Pillow uses RGB ordering, and also assumes an RGB ordered byte stream, but the actual byte
# stream is BGR ordered, such that img_byte_pil actually is an ndarray with BGR ordering
img_byte_pil = bytes_to_ndarray(bytes)
# Matplotlib pyplot imshow uses RGB ordering for visualization!!
plt.figure(figsize=(8, 12))
plt.subplot(3, 2, 1), plt.imshow(img_pil), plt.ylabel('PNG loaded with Pillow')
plt.subplot(3, 2, 2), plt.imshow(img_cv), plt.ylabel('PNG loaded with OpenCV')
plt.subplot(3, 2, 3), plt.imshow(img_sk), plt.ylabel('Loaded with skimage')
plt.subplot(3, 2, 5), plt.imshow(img_byte_pil), plt.ylabel('Decoded with Pillow')
plt.subplot(3, 2, 6), plt.imshow(img_byte_cv), plt.ylabel('Decoded with OpenCV')
plt.show()
Et voilà:
Here's a PNG copy of the image to reproduce the code:
Bottom line: When using OpenCV's imencode, make sure that the passed ndarray has BGR ordering!
Hope that helps!