I'm doing some data analysis of images. For some of the analysis I would like to convert the image's pixels from RGB, in which they are originally stored, into HSV.
At the moment I'm using this code:
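# Imports assumed from the names used (Parallel/delayed are joblib's API)
from itertools import product
import multiprocessing
import numpy as np
from joblib import Parallel, delayed

def generate_hsv(im):
    coords = product(range(im.shape[0]), range(im.shape[1]))
    num_cores = multiprocessing.cpu_count()
    m = Parallel(n_jobs=num_cores)(delayed(process_pixels)(im[i]) for i in coords)
    return np.array(m).reshape(im.shape)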
Where process_pixels is just a wrapper for my conversion function:
def process_pixels(pixel):
    return rgb_to_hsv(pixel[0], pixel[1], pixel[2])
The thing is it runs sluggishly.
Is there a more efficient way to do this? Or a better way to parallelize?
As Warren Weckesser said, the conversion function is problematic. I ended up using matplotlib:
matplotlib.colors.rgb_to_hsv(arr)
It now runs a million times faster.
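A minimal sketch of that call, assuming an 8-bit RGB array (the name `arr` and the random test data are illustrative). matplotlib documents `rgb_to_hsv` as taking values in the range [0, 1], so the sketch scales the array first:

```python
import numpy as np
import matplotlib.colors as mcolors

# Illustrative 8-bit RGB image; any (H, W, 3) uint8 array would do.
rng = np.random.default_rng(0)
arr = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# Scale to floats in [0, 1] before converting, as matplotlib expects.
hsv = mcolors.rgb_to_hsv(arr / 255.0)

print(hsv.shape)  # same shape as the input: (4, 5, 3)
```

The whole array is converted in one call, so no per-pixel Python loop or process pool is involved.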
The colorsys module's implementation works on one pixel at a time and expects its input as (R,G,B). That implementation is listed below:
def rgb_to_hsv(r, g, b):
    maxc = max(r, g, b)
    minc = min(r, g, b)
    v = maxc
    if minc == maxc:
        return 0.0, 0.0, v
    s = (maxc-minc) / maxc
    rc = (maxc-r) / (maxc-minc)
    gc = (maxc-g) / (maxc-minc)
    bc = (maxc-b) / (maxc-minc)
    if r == maxc:
        h = bc-gc
    elif g == maxc:
        h = 2.0+rc-bc
    else:
        h = 4.0+gc-rc
    h = (h/6.0) % 1.0
    return h, s, v
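To make the branches concrete, here is a small sketch that calls the standard-library function on a few hand-picked pixels (the sample colors are illustrative, with values as floats in [0, 1]):

```python
import colorsys

# Each pixel exercises a different branch of rgb_to_hsv.
samples = {
    "red":   (1.0, 0.0, 0.0),  # r is the max -> h = bc - gc
    "green": (0.0, 1.0, 0.0),  # g is the max -> h = 2 + rc - bc
    "blue":  (0.0, 0.0, 1.0),  # b is the max -> h = 4 + gc - rc
    "gray":  (0.5, 0.5, 0.5),  # minc == maxc -> early return, h = s = 0
}

results = {name: colorsys.rgb_to_hsv(*rgb) for name, rgb in samples.items()}
for name, (h, s, v) in results.items():
    print(f"{name:>5}: h={h:.4f} s={s:.4f} v={v:.4f}")
```

Red, green and blue land at hues 0, 1/3 and 2/3 respectively, and the gray pixel returns early with zero hue and saturation.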
I'm assuming that the image being read is in (B,G,R) format, as produced by OpenCV's cv2.imread. Let's vectorize the function above so that it works on all pixels at once. The usual tool for this in NumPy is broadcasting. With it, a vectorized implementation of rgb_to_hsv would look something like this (notice how the corresponding parts of the loopy code carry over):
def rgb_to_hsv_vectorized(img): # img with BGR format
    maxc = img.max(-1)
    minc = img.min(-1)

    out = np.zeros(img.shape)
    out[:,:,2] = maxc                  # v
    out[:,:,1] = (maxc-minc) / maxc    # s

    # rc, gc, bc for every pixel at once (stored in B, G, R order)
    divs = (maxc[...,None] - img)/ ((maxc-minc)[...,None])
    cond1 = divs[...,0] - divs[...,1]         # r is max: bc-gc
    cond2 = 2.0 + divs[...,2] - divs[...,0]   # g is max: 2.0+rc-bc
    h = 4.0 + divs[...,1] - divs[...,2]       # otherwise: 4.0+gc-rc
    h[img[...,2]==maxc] = cond1[img[...,2]==maxc]
    h[img[...,1]==maxc] = cond2[img[...,1]==maxc]
    out[:,:,0] = (h/6.0) % 1.0

    out[minc == maxc,:2] = 0   # gray pixels: h = s = 0
    return out
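The broadcasting in the `divs` line can be seen in isolation. In this sketch (the tiny 1x2 BGR image is made up for illustration), `maxc` and `maxc - minc` have shape (H, W); indexing with `[..., None]` appends a length-1 axis so they broadcast against the (H, W, 3) image, giving the `bc`, `gc`, `rc` terms for every pixel at once:

```python
import numpy as np

# A made-up 1x2 image in BGR order, as floats to keep the arithmetic obvious.
img = np.array([[[10.0, 20.0, 50.0],
                 [40.0, 80.0, 0.0]]])

maxc = img.max(-1)   # shape (1, 2): per-pixel maximum
minc = img.min(-1)   # shape (1, 2): per-pixel minimum

# [..., None] turns shape (1, 2) into (1, 2, 1), which broadcasts
# against img's shape (1, 2, 3).
divs = (maxc[..., None] - img) / (maxc - minc)[..., None]

print(maxc.shape, maxc[..., None].shape, divs.shape)  # (1, 2) (1, 2, 1) (1, 2, 3)
print(divs)
```

For the first pixel the maximum is 50 and the range is 40, so the three terms are (50-10)/40, (50-20)/40 and (50-50)/40.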
Runtime test
Let's time it on an RGB image of size (256,256) filled with random integers. Note that np.random.randint(0,255,...) excludes the upper bound, so the values used below lie in [0,254].
Here's a typical way to use colorsys's rgb_to_hsv on every pixel of an image:
def rgb_to_hsv_loopy(img):
    out_loopy = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out_loopy[i,j] = colorsys.rgb_to_hsv(img[i,j,2],img[i,j,1],img[i,j,0])
    return out_loopy
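As alternatives, there are also matplotlib's and OpenCV's color conversion functions, but they produce different results here. One reason is that they follow different conventions: matplotlib.colors.rgb_to_hsv expects RGB channel order with values in [0,1], and OpenCV's 8-bit HSV output stores hue in [0,179] and saturation in [0,255]. For the sake of timings, let's include them anyway.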
In [69]: img = np.random.randint(0,255,(256,256,3)).astype('uint8')
In [70]: %timeit rgb_to_hsv_loopy(img)
1 loops, best of 3: 987 ms per loop
In [71]: %timeit matplotlib.colors.rgb_to_hsv(img)
10 loops, best of 3: 22.7 ms per loop
In [72]: %timeit cv2.cvtColor(img,cv2.COLOR_BGR2HSV)
1000 loops, best of 3: 1.23 ms per loop
In [73]: %timeit rgb_to_hsv_vectorized(img)
100 loops, best of 3: 13.4 ms per loop
In [74]: np.allclose(rgb_to_hsv_vectorized(img),rgb_to_hsv_loopy(img))
Out[74]: True # Making sure vectorized version replicates intended behavior
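The check in In [74] can be reproduced as a standalone script. This is a sketch rather than the exact session above: it restates both functions, uses a smaller random image, forces a few gray and tied pixels to exercise the edge cases, and wraps the arithmetic in `np.errstate` (an addition not in the original) to silence the divide-by-zero warnings that gray pixels trigger.

```python
import colorsys
import numpy as np

def rgb_to_hsv_vectorized(img):  # img in BGR order
    maxc = img.max(-1)
    minc = img.min(-1)
    out = np.zeros(img.shape)
    out[..., 2] = maxc
    # Gray pixels (minc == maxc) divide by zero here; they are reset below.
    with np.errstate(divide="ignore", invalid="ignore"):
        out[..., 1] = (maxc - minc) / maxc
        divs = (maxc[..., None] - img) / (maxc - minc)[..., None]
        h = 4.0 + divs[..., 1] - divs[..., 2]
        r_is_max = img[..., 2] == maxc
        g_is_max = img[..., 1] == maxc
        h[r_is_max] = (divs[..., 0] - divs[..., 1])[r_is_max]
        h[g_is_max] = (2.0 + divs[..., 2] - divs[..., 0])[g_is_max]
        out[..., 0] = (h / 6.0) % 1.0
    out[minc == maxc, :2] = 0
    return out

def rgb_to_hsv_loopy(img):  # reference: colorsys on one pixel at a time
    out = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            b, g, r = img[i, j].tolist()
            out[i, j] = colorsys.rgb_to_hsv(r, g, b)
    return out

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
img[0, 0] = (7, 7, 7)      # gray pixel
img[0, 1] = (0, 0, 0)      # black pixel
img[0, 2] = (5, 200, 200)  # tie between G and R for the maximum

vec = rgb_to_hsv_vectorized(img)
ref = rgb_to_hsv_loopy(img)
print(np.allclose(vec, ref))
```

As in the session above, hue and saturation come out in [0, 1] while the value channel keeps the input's 0-255 scale, because `v` is simply the per-pixel maximum.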