I want to use Pillow for some simple handwritten image recognition. It will run in real time, so I will need to call my function 5-10 times a second. After loading the image I only access 1 in 20^2 pixels, so I really don't need the whole image. I need to reduce the image loading time.
I've never used a Python image library and would appreciate any suggestions.
from PIL import Image
import time
start = time.time()
im = Image.open('ir/IMG-1949.JPG')
width, height = im.size
px = im.load()
print("loading: ", time.time() - start)
desired loading time: <50ms, actual loading time: ~150ms
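OpenCV is written in C and C++, whereas PIL is written in Python and C, so from this information alone OpenCV might be expected to be faster. Implementation language by itself does not settle the question, though, so it is worth timing both on your own images.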
To load the image, we simply import the Image module from Pillow and call Image.open(), passing the image filename. The package is installed as Pillow but imported as PIL, which keeps it backward compatible with the older Python Imaging Library (PIL).
First, there was PIL (Python Imaging Library), and then its development was abandoned. Pillow then forked PIL as a drop-in replacement. Pillow-SIMD is in turn a fork of Pillow which, according to its own benchmarks, is significantly faster than ImageMagick, OpenCV, IPP and other fast image processing libraries (on identical hardware/platform).
According to the same benchmarks, Pillow-SIMD with AVX2 is always 16 to 40 times faster than ImageMagick and outperforms Skia, the high-speed graphics library used in Chromium.
Updated Answer
Since I wrote this answer, John Cupitt (author of pyvips) has come up with some improvements and corrections and fairer code and timings and has kindly shared them here. Please look at his improved version, alongside or even in preference to my code below.
Original Answer
The JPEG library has a "shrink-on-load" feature which allows a lot of I/O and decompression to be avoided. You can take advantage of this with PIL/Pillow using the Image.draft() function, so instead of reading the full 4032x3024 pixels like this:
from PIL import Image
im = Image.open('image.jpg')
px = im.load()
which takes 297ms on my Mac, you can do the following and read 1008x756 pixels, i.e. 1/4 the width and 1/4 the height:
im = Image.open('image.jpg')
im.draft('RGB',(1008,756))
px = im.load()
and that takes only 75ms, i.e. it is 4x faster.
Just for kicks, I tried comparing various techniques as follows:
#!/usr/bin/env python3

import numpy as np
import pyvips
import cv2
from PIL import Image

def usingPIL(f):
    # Full-size decode with Pillow
    im = Image.open(f)
    return np.asarray(im)

def usingOpenCV(f):
    # Full-size decode with OpenCV
    arr = cv2.imread(f, cv2.IMREAD_UNCHANGED)
    return arr

def usingVIPS(f):
    # Full-size decode with pyvips (assumes a 3-band, 8-bit image)
    image = pyvips.Image.new_from_file(f, access="sequential")
    mem_img = image.write_to_memory()
    imgnp = np.frombuffer(mem_img, dtype=np.uint8).reshape(image.height, image.width, 3)
    return imgnp

def usingPILandShrink(f):
    # Pillow with JPEG shrink-on-load via draft()
    im = Image.open(f)
    im.draft('RGB', (1008, 756))
    return np.asarray(im)

def usingVIPSandShrink(f):
    # pyvips with JPEG shrink-on-load (1/4 width and height)
    image = pyvips.Image.new_from_file(f, access="sequential", shrink=4)
    mem_img = image.write_to_memory()
    imgnp = np.frombuffer(mem_img, dtype=np.uint8).reshape(image.height, image.width, 3)
    return imgnp
And loaded that into IPython and tested like this:
%timeit usingPIL('image.jpg')
315 ms ± 8.76 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit usingOpenCV('image.jpg')
102 ms ± 1.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit usingVIPS('image.jpg')
69.1 ms ± 31.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit usingPILandShrink('image.jpg')
77.2 ms ± 994 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit usingVIPSandShrink('image.jpg')
42.9 ms ± 332 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
It seems like pyvips is the clear winner here!
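If you are not working in IPython, a comparison like the one above can be approximated with the standard timeit module. This is only a sketch: time_loader is a hypothetical helper, and the stand-in loader in the demo does no image work at all, so swap in one of the functions above and a real filename to get meaningful numbers.

```python
import timeit


def time_loader(loader, path, repeat=5, number=3):
    """Return the best observed time per call, in seconds.

    `time_loader` is a hypothetical helper for this sketch; `loader` is any
    function that takes a path, such as the loading functions shown earlier.
    """
    runs = timeit.repeat(lambda: loader(path), repeat=repeat, number=number)
    # Each entry in `runs` is the total for `number` calls; the minimum is
    # the run least disturbed by other activity on the machine.
    return min(runs) / number


if __name__ == "__main__":
    # Stand-in loader so the snippet runs anywhere; it does not read a file.
    def fake_loader(path):
        return sum(range(10_000))

    best = time_loader(fake_loader, "image.jpg")
    print(f"best of 5: {best * 1000:.3f} ms per call")
```

Repeated loads of the same file are usually served from the operating system's file cache, so the first read in a real-time pipeline may be slower than these figures suggest.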
Keywords: Python, PIL, Pillow, image, image processing, JPEG, shrink-on-load, shrink on load, draft mode, read performance, speedup.