I want to replace the RGB values in a numpy array with single integer codes. My code works, but it's too slow because I iterate over every element. Can I speed this up? I am new to numpy.
import numpy as np
from skimage import io

# dictionary of color codes for my rgb values
_color_codes = {
    (255, 200, 100): 1,
    (223, 219, 212): 2,
    ...
}

# get the corresponding color code for the rgb vector supplied
def replace_rgb_val(rgb_v):
    rgb_triple = (rgb_v[0], rgb_v[1], rgb_v[2])
    if rgb_triple in _color_codes:
        return _color_codes[rgb_triple]
    else:
        return -1

# function to replace, this is where I iterate
def img_array_to_single_val(arr):
    return np.array([[replace_rgb_val(arr[i][j]) for j in range(arr.shape[1])] for i in range(arr.shape[0])])

# my images are square so the shape of the array is (n,n,3)
# I want to change the arrays to (n,n,1)
img_arr = io.imread(filename)
# this takes from ~5-10 seconds, too slow!
result = img_array_to_single_val(img_arr)
Do the replacement the other way round: loop over the color codes instead of the pixels. For each RGB triple, find the matching pixels and set the corresponding index in a new array:
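import numpy

def img_array_to_single_val(arr, color_codes):
    # start with -1 everywhere, meaning "no matching color"
    result = numpy.full(arr.shape[:2], -1, dtype=int)
    for rgb, idx in color_codes.items():
        # boolean mask of all pixels whose three channels equal rgb
        result[(arr == rgb).all(2)] = idx
    return result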
Let's take the color-index assignment apart. First, arr==rgb compares each pixel's RGB values with the tuple rgb, giving an n x n x 3 boolean array. A pixel is a match only if all three color components are equal, so .all(2) reduces the last axis, leading to an n x n boolean array that is True for every pixel matching rgb. The last step is to use this mask to set the index of the corresponding pixels.
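To make the three steps concrete, here is a minimal sketch on a tiny 2 x 2 image. The pixel values and the code 1 are made up for illustration, reusing the first color from the question's dictionary:

```python
import numpy as np

# Hypothetical 2 x 2 RGB image, invented for this illustration
arr = np.array([[[255, 200, 100], [0, 0, 0]],
                [[255, 200, 0], [255, 200, 100]]], dtype=np.uint8)
rgb = (255, 200, 100)

# Step 1: channel-wise comparison, shape (2, 2, 3)
per_channel = (arr == rgb)

# Step 2: a pixel matches only if all three channels match, shape (2, 2)
mask = per_channel.all(2)

# Step 3: use the mask to write the color code into the result
result = np.full(arr.shape[:2], -1, dtype=int)
result[mask] = 1

print(result)
```

Note that the pixel (255, 200, 0) agrees in two channels only, so it stays at -1.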
It might be even faster to first pack each RGB triple into a single int32 and then do the index translation:
def img_array_to_single_val(image, color_codes):
    # pack (r, g, b) into one integer per pixel: r*65536 + g*256 + b
    image = image.dot(numpy.array([65536, 256, 1], dtype='int32'))
    result = numpy.full(image.shape, -1, dtype=int)
    for rgb, idx in color_codes.items():
        rgb = rgb[0] * 65536 + rgb[1] * 256 + rgb[2]
        result[image == rgb] = idx
    return result
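As a quick sanity check of the packing step, the following sketch shows that the dot product with [65536, 256, 1] is a reversible encoding of 8-bit channels. The sample pixels are the two colors from the question; the explicit astype cast is my own defensive addition, not part of the answer above:

```python
import numpy as np

# One row of two pixels, taken from the question's color dictionary
pixels = np.array([[[255, 200, 100], [223, 219, 212]]], dtype=np.uint8)
weights = np.array([65536, 256, 1], dtype=np.int32)

# Cast first so the arithmetic is clearly done in int32, not uint8
packed = pixels.astype(np.int32).dot(weights)   # shape (1, 2)

# Unpack again with shifts and masks to confirm nothing was lost
r = packed >> 16
g = (packed >> 8) & 255
b = packed & 255
unpacked = np.stack([r, g, b], axis=-1)

print(packed)
print(np.array_equal(unpacked, pixels))
```

Because each channel fits in 8 bits, two different colors can never map to the same packed value, which is what makes the single comparison image == rgb safe.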
For really large images, or for many images, you should first build a direct color mapping once, indexed by the packed color value:
# lookup table with one entry per 24-bit color (about 64 MiB as int32)
color_map = numpy.full(256*256*256, -1, dtype='int32')
for rgb, idx in color_codes.items():
    rgb = rgb[0] * 65536 + rgb[1] * 256 + rgb[2]
    color_map[rgb] = idx

def img_array_to_single_val(image, color_map):
    # pack each pixel into one integer, then look all of them up at once
    image = image.dot(numpy.array([65536, 256, 1], dtype='int32'))
    return color_map[image]
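Put together, the lookup-table variant could be used roughly like this. This is only a sketch: the 2 x 2 image is invented, color_codes contains just the two entries visible in the question, and the astype cast is an extra precaution of mine:

```python
import numpy as np

# Only the two entries shown in the question; the real dict is longer
color_codes = {(255, 200, 100): 1, (223, 219, 212): 2}
weights = np.array([65536, 256, 1], dtype=np.int32)

# Build the table once; unknown colors stay at -1
color_map = np.full(256 ** 3, -1, dtype=np.int32)
for rgb, idx in color_codes.items():
    color_map[rgb[0] * 65536 + rgb[1] * 256 + rgb[2]] = idx

def img_array_to_single_val(image, color_map):
    # Pack each pixel into one integer and index the table with it
    packed = image.astype(np.int32).dot(weights)
    return color_map[packed]

# Hypothetical 2 x 2 image for illustration
image = np.array([[[255, 200, 100], [223, 219, 212]],
                  [[0, 0, 0], [255, 200, 100]]], dtype=np.uint8)
result = img_array_to_single_val(image, color_map)
print(result)
```

The table costs memory up front, but after that each image needs only one dot product and one fancy-indexing lookup, with no Python loop over colors.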
Two fully vectorized solutions could be suggested here.
Approach #1: Using NumPy's powerful broadcasting capability -
# Extract color codes and their IDs from input dict
# (list() is needed on Python 3, where keys()/values() return views)
colors = np.array(list(_color_codes.keys()))
color_ids = np.array(list(_color_codes.values()))
# Initialize output array
result = np.empty((img_arr.shape[0],img_arr.shape[1]),dtype=int)
result[:] = -1
# Finally get the matches and accordingly set result locations
# to their respective color IDs
# (R indexes the color, C and D are the pixel row and column)
R,C,D = np.where((img_arr == colors[:,None,None,:]).all(3))
result[C,D] = color_ids[R]
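To see which shapes the broadcasting step produces, here is a small self-contained sketch. The 2 x 2 image is invented, and the dictionary holds only the two colors shown in the question:

```python
import numpy as np

color_codes = {(255, 200, 100): 1, (223, 219, 212): 2}
colors = np.array(list(color_codes.keys()))        # shape (2, 3)
color_ids = np.array(list(color_codes.values()))   # shape (2,)

# Hypothetical 2 x 2 image for illustration
img = np.array([[[255, 200, 100], [223, 219, 212]],
                [[0, 0, 0], [255, 200, 100]]], dtype=np.uint8)

# (2, 2, 3) against (2, 1, 1, 3) broadcasts to (2, 2, 2, 3);
# .all(3) leaves one boolean per (color, row, column)
matches = (img == colors[:, None, None, :]).all(3)

R, C, D = np.where(matches)
result = np.full(img.shape[:2], -1, dtype=int)
result[C, D] = color_ids[R]
print(matches.shape)
print(result)
```

One thing to keep in mind: the intermediate comparison holds k x n x n x 3 booleans for k colors, so with many colors and large images this approach can use a lot of memory.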
Approach #2: Using cdist from scipy.spatial.distance, one can replace the final steps from approach #1, like so -
from scipy.spatial.distance import cdist
R,C = np.where(cdist(img_arr.reshape(-1,3),colors)==0)
result.ravel()[R] = color_ids[C]