Replacing RGB values in numpy array by integer is extremely slow

Question

I want to replace the RGB values of a numpy array with single-integer representations. My code works, but it's too slow because I iterate over every element. Can I speed this up? I am new to numpy.

import numpy as np
from skimage import io

# dictionary of color codes for my rgb values
_color_codes = {
    (255, 200, 100): 1,
    (223, 219, 212): 2,
    ...
}

# get the corresponding color code for the rgb vector supplied
def replace_rgb_val(rgb_v):
    rgb_triple = (rgb_v[0], rgb_v[1], rgb_v[2])
    if rgb_triple in _color_codes:
        return _color_codes[rgb_triple]
    else:
        return -1

# function to replace, this is where I iterate
def img_array_to_single_val(arr):
    return np.array([[replace_rgb_val(arr[i][j]) for j in range(arr.shape[1])] for i in range(arr.shape[0])])


# my images are square so the shape of the array is (n,n,3)
# I want to change the arrays to (n,n,1)
img_arr = io.imread(filename)
# this takes from ~5-10 seconds, too slow!
result = img_array_to_single_val(img_arr)

Daniel · Accepted Answer

Replace the color values the other way round. Look for each RGB-triple, and set the corresponding index in a new array:

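import numpy

def img_array_to_single_val(arr, color_codes):
    # start with -1 everywhere, i.e. "no known color"
    result = numpy.full(arr.shape[:2], -1, dtype=int)
    for rgb, idx in color_codes.items():
        # boolean mask of the pixels whose three channels all equal rgb
        result[(arr==rgb).all(2)] = idx
    return result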

Let's take the color-index assignment apart. First, arr==rgb compares each pixel's RGB values with the tuple rgb, giving an n x n x 3 boolean array. A pixel matches only if all three color components are equal, so .all(2) reduces the last axis, giving an n x n boolean array that is True for every pixel matching rgb. The last step uses this mask to set the index for the corresponding pixels.
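To make the shapes concrete, here is a minimal sketch of the masking step on a tiny image. The 2x2 pixel values are invented for illustration; only the color (255, 200, 100) and its code 1 come from the question.

```python
import numpy as np

# Hypothetical 2x2 image, made up for this illustration.
arr = np.array([[[255, 200, 100], [0, 0, 0]],
                [[255, 200, 0], [255, 200, 100]]], dtype=np.uint8)
rgb = (255, 200, 100)

per_channel = arr == rgb     # shape (2, 2, 3): one boolean per color channel
mask = per_channel.all(2)    # shape (2, 2): True only where all three channels match

result = np.full(arr.shape[:2], -1, dtype=int)
result[mask] = 1             # write the color code at the matching pixels
print(result)
```

Note that the pixel (255, 200, 0) matches two of the three channels but is still rejected, which is exactly what .all(2) is there for.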

It might be even faster to first pack each RGB triple into a single int32 and then do the index translation:

def img_array_to_single_val(image, color_codes):
    # pack each pixel into one integer: r * 65536 + g * 256 + b
    image = image.dot(numpy.array([65536, 256, 1], dtype='int32'))
    result = numpy.full(image.shape, -1, dtype=int)
    for rgb, idx in color_codes.items():
        # pack the color key the same way as the pixels
        packed = rgb[0] * 65536 + rgb[1] * 256 + rgb[2]
        result[image==packed] = idx
    return result

For very large images, or for many images, build a direct color lookup table once and reuse it. The table has 256**3 int32 entries, about 64 MiB:

# one entry per possible 24-bit color, -1 meaning "no known color"
color_map = numpy.full(256*256*256, -1, dtype='int32')
for rgb, idx in color_codes.items():
    packed = rgb[0] * 65536 + rgb[1] * 256 + rgb[2]
    color_map[packed] = idx

def img_array_to_single_val(image, color_map):
    image = image.dot(numpy.array([65536, 256, 1], dtype='int32'))
    return color_map[image]
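
As a rough sketch of how the lookup table behaves end to end, the snippet below builds it for the two colors shown in the question and applies it to a made-up 2x2 image. The image values and the name weights are invented for illustration, and the explicit astype(np.int32) is a defensive choice on my part rather than something the answer requires.

```python
import numpy as np

# The two entries visible in the question's dictionary.
color_codes = {(255, 200, 100): 1, (223, 219, 212): 2}

# Weights that pack (r, g, b) into a single 24-bit integer.
weights = np.array([65536, 256, 1], dtype=np.int32)

# Lookup table with one slot per possible color; -1 marks unknown colors.
color_map = np.full(256 ** 3, -1, dtype=np.int32)
for rgb, idx in color_codes.items():
    color_map[np.dot(rgb, weights)] = idx

# Hypothetical 2x2 image, made up for this illustration.
image = np.array([[[255, 200, 100], [223, 219, 212]],
                  [[0, 0, 0], [255, 200, 100]]], dtype=np.uint8)

packed = image.astype(np.int32).dot(weights)  # shape (2, 2), one integer per pixel
result = color_map[packed]                    # fancy indexing does the whole translation
print(result)
```

The table is built once, so the per-image cost is one dot product plus one indexing operation, independent of how many colors the dictionary holds.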

Divakar · Answer

Two fully vectorized solutions could be suggested here.

Approach #1: Using NumPy's powerful broadcasting capability -

# Extract color codes and their IDs from input dict
# list() is needed on Python 3, where keys() and values() return views
colors = np.array(list(_color_codes.keys()))
color_ids = np.array(list(_color_codes.values()))

# Initialize output array
result = np.empty((img_arr.shape[0],img_arr.shape[1]),dtype=int)
result[:] = -1

# Finally get the matches and accordingly set result locations
# to their respective color IDs
R,C,D = np.where((img_arr == colors[:,None,None,:]).all(3))
result[C,D] = color_ids[R]
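
The shapes involved in the broadcasting step can be traced on a small example. This is only a sketch: the 2x3 image is invented for illustration, and the dictionary holds just the two colors shown in the question. One thing it makes visible is that the intermediate comparison has shape (k, rows, cols, 3) for k colors, so memory use grows with the number of colors.

```python
import numpy as np

color_codes = {(255, 200, 100): 1, (223, 219, 212): 2}
colors = np.array(list(color_codes.keys()))        # shape (k, 3), here k = 2
color_ids = np.array(list(color_codes.values()))   # shape (k,)

# Hypothetical 2x3 image, made up for this illustration.
img = np.array([[[255, 200, 100], [0, 0, 0], [223, 219, 212]],
                [[223, 219, 212], [255, 200, 100], [1, 2, 3]]], dtype=np.uint8)

# (2, 3, 3) compared against (k, 1, 1, 3) broadcasts to (k, 2, 3, 3)
compared = img == colors[:, None, None, :]
matches = compared.all(3)                          # shape (k, 2, 3)

# R is the color index, C and D are the row and column of each match
R, C, D = np.where(matches)
result = np.full(img.shape[:2], -1, dtype=int)
result[C, D] = color_ids[R]
print(result)
```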

Approach #2: Using cdist from scipy.spatial.distance one can replace the final steps from approach #1, like so -

from scipy.spatial.distance import cdist

R,C = np.where(cdist(img_arr.reshape(-1,3),colors)==0)
result.ravel()[R] = color_ids[C]

Tags: python, arrays, image-processing, numpy, rgb

Gustavo Puma
