Replacing RGB values in numpy array by integer is extremely slow

I want to replace the RGB values in a numpy array with single-integer representations. My code works, but it is too slow because I iterate over every element. Can I speed this up? I am new to numpy.

import numpy as np
from skimage import io

# dictionary of color codes for my rgb values
_color_codes = {
    (255, 200, 100): 1,
    (223, 219, 212): 2,
    ...
}

# get the corresponding color code for the rgb vector supplied
def replace_rgb_val(rgb_v):
    rgb_triple = (rgb_v[0], rgb_v[1], rgb_v[2])
    if rgb_triple in _color_codes:
        return _color_codes[rgb_triple]
    else:
        return -1

# function to replace, this is where I iterate
def img_array_to_single_val(arr):
    return np.array([[replace_rgb_val(arr[i][j]) for j in range(arr.shape[1])] for i in range(arr.shape[0])])


# my images are square so the shape of the array is (n,n,3)
# I want to change the arrays to (n,n,1)
img_arr = io.imread(filename)
# this takes from ~5-10 seconds, too slow!
result = img_array_to_single_val(img_arr)
Gustavo Puma asked Dec 08 '22 01:12


2 Answers

Do the replacement the other way round: loop over the color codes instead of the pixels. For each RGB triple, find all matching pixels and set the corresponding index in a new array:

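import numpy

def img_array_to_single_val(arr, color_codes):
    # start with -1 everywhere, meaning "no known color"
    result = numpy.full(arr.shape[:2], -1, dtype=int)
    for rgb, idx in color_codes.items():
        # boolean mask of pixels whose three channels all equal rgb
        result[(arr==rgb).all(2)] = idx
    return result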

Let's take the color-index assignment apart. First, arr==rgb compares each pixel's RGB values with the triple rgb, giving an n x n x 3 boolean array. A pixel matches only if all three color components are equal, so .all(2) reduces the last axis, leading to an n x n boolean array that is True for every pixel matching rgb. The last step uses this mask to set the index of the corresponding pixels.
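
To make the mask step concrete, here is a minimal sketch on a tiny array. The 2 x 2 image and the code value 1 are made up for illustration; only the arr==rgb and .all(2) steps come from the function above.

```python
import numpy as np

# Hypothetical 2x2 image, chosen only for illustration
arr = np.array([[[255, 200, 100], [0, 0, 0]],
                [[255, 200, 0], [255, 200, 100]]], dtype=np.uint8)
rgb = (255, 200, 100)

per_channel = (arr == rgb)   # shape (2, 2, 3): one boolean per color channel
mask = per_channel.all(2)    # shape (2, 2): True only where all three channels match

result = np.full(arr.shape[:2], -1, dtype=int)
result[mask] = 1             # write the color code into the matching pixels

print(mask)
print(result)
```

Note that the pixel (255, 200, 0) matches in two channels but not the third, so it stays at -1.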

It might be even faster to first pack each RGB triple into a single int32 value and then do the index translation:

def img_array_to_single_val(image, color_codes):
    # pack each (r, g, b) pixel into one integer: r*65536 + g*256 + b
    image = image.dot(numpy.array([65536, 256, 1], dtype='int32'))
    result = numpy.ndarray(shape=image.shape, dtype=int)
    result[:,:] = -1
    for rgb, idx in color_codes.items():
        rgb = rgb[0] * 65536 + rgb[1] * 256 + rgb[2]
        result[image==rgb] = idx
    return result
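
As a rough illustration of the packing step, the sketch below packs a made-up 1 x 2 image and unpacks it again with bit shifts. The unpacking is not part of the answer; it is only there to show that the packed integer identifies the triple uniquely.

```python
import numpy as np

weights = np.array([65536, 256, 1], dtype=np.int32)

# Hypothetical 1x2 image for illustration
image = np.array([[[255, 200, 100], [223, 219, 212]]], dtype=np.uint8)

# dot() multiplies along the last axis and sums: r*65536 + g*256 + b
packed = image.dot(weights)          # shape (1, 2)

# Reverse the packing to confirm no information was lost
r = packed >> 16
g = (packed >> 8) & 255
b = packed & 255
unpacked = np.stack([r, g, b], axis=-1)

print(packed)
print(np.array_equal(unpacked, image))
```

This assumes a three-channel image. An image with an alpha channel has shape (n, n, 4), so the dot product with a length-3 vector would fail until the extra channel is sliced off.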

For very large images, or for many images, first build a direct color lookup table once and then reuse it:

color_map = numpy.ndarray(shape=(256*256*256), dtype='int32')
color_map[:] = -1
for rgb, idx in color_codes.items():
    rgb = rgb[0] * 65536 + rgb[1] * 256 + rgb[2]
    color_map[rgb] = idx

def img_array_to_single_val(image, color_map):
    image = image.dot(numpy.array([65536, 256, 1], dtype='int32'))
    return color_map[image]
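
A self-contained sketch of the lookup-table variant follows, checked against a plain dictionary lookup. The two color codes and the 2 x 2 image are assumptions for illustration, since the original dictionary is elided. The table has 256**3 entries, which is about 64 MiB as int32.

```python
import numpy as np

# Hypothetical color codes for illustration (the real dict is elided in the question)
color_codes = {(255, 200, 100): 1, (223, 219, 212): 2}
weights = np.array([65536, 256, 1], dtype=np.int32)

# Build the table once: one slot per possible 24-bit color, -1 for unknown colors
color_map = np.full(256 ** 3, -1, dtype=np.int32)
for (r, g, b), idx in color_codes.items():
    color_map[r * 65536 + g * 256 + b] = idx

def img_array_to_single_val(image, color_map):
    # pack every pixel into one integer, then index the table with the whole array
    return color_map[image.dot(weights)]

image = np.array([[[255, 200, 100], [1, 2, 3]],
                  [[223, 219, 212], [255, 200, 100]]], dtype=np.uint8)
result = img_array_to_single_val(image, color_map)

# Reference result from the slow per-pixel dictionary lookup
expected = np.array([[color_codes.get(tuple(int(c) for c in px), -1) for px in row]
                     for row in image])
print(np.array_equal(result, expected))
```

The per-image cost is one dot product and one fancy-indexing pass, independent of how many colors are in the dictionary.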
Daniel answered Feb 04 '23 00:02


Two fully vectorized solutions could be suggested here.

Approach #1: Using NumPy's powerful broadcasting capability -

# Extract color codes and their IDs from input dict
# (list() is needed on Python 3, where keys() and values() return views)
colors = np.array(list(_color_codes.keys()))
color_ids = np.array(list(_color_codes.values()))

# Initialize output array
result = np.empty((img_arr.shape[0],img_arr.shape[1]),dtype=int)
result[:] = -1

# Finally get the matches and accordingly set result locations
# to their respective color IDs
R,C,D = np.where((img_arr == colors[:,None,None,:]).all(3))
result[C,D] = color_ids[R]
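
The shapes involved in the broadcasting approach can be traced on a small example. This is only a sketch with a made-up two-entry dictionary and a 2 x 2 image. Note that the intermediate comparison holds k x n x n x 3 booleans for k colors, so memory use grows with the size of the dictionary.

```python
import numpy as np

# Hypothetical color codes and image for illustration
_color_codes = {(255, 200, 100): 1, (223, 219, 212): 2}
img_arr = np.array([[[255, 200, 100], [0, 0, 0]],
                    [[223, 219, 212], [255, 200, 100]]], dtype=np.uint8)

colors = np.array(list(_color_codes.keys()))        # shape (k, 3)
color_ids = np.array(list(_color_codes.values()))   # shape (k,)

# (n, n, 3) compared with (k, 1, 1, 3) broadcasts to (k, n, n, 3);
# all(3) collapses the channel axis, leaving one n x n mask per color
matches = (img_arr == colors[:, None, None, :]).all(3)

# R is the color index of each match, C and D are its row and column
R, C, D = np.where(matches)
result = np.full(img_arr.shape[:2], -1, dtype=int)
result[C, D] = color_ids[R]

print(matches.shape)
print(result)
```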

Approach #2: Using cdist from scipy.spatial.distance one can replace the final steps from approach #1, like so -

from scipy.spatial.distance import cdist

R,C = np.where(cdist(img_arr.reshape(-1,3),colors)==0)
result.ravel()[R] = color_ids[C]
Divakar answered Feb 04 '23 00:02