Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to speed up color-clustering in openCV?

for a project I want to implement a color-clustering algorithm, which replace similar colors with the average color of a cluster.

For now, I use the kmeans-algorithm to cluster the whole image . But this take's a long time. Has someone an idea how to use kmeans to cluster a color-histogram , so I can perform this algorithm?

like image 226
501 - not implemented Avatar asked Nov 28 '12 20:11

501 - not implemented


People also ask

How can I improve my clustering performance?

Graph-based clustering performance can easily be improved by applying ICA blind source separation during the graph Laplacian embedding step. Applying unsupervised feature learning to input data using either RICA or SFT, improves clustering performance.

Why K means clustering is fast?

k-means has recently been recognized as one of the best algorithms for clustering unsupervised data. Since k-means depends mainly on distance calculation between all data points and the centers, the time cost will be high when the size of the dataset is large (for example more than 500millions of points).

Why do we use K means clustering for color quantization?

Performs a pixel-wise Vector Quantization (VQ) of an image of the summer palace (China), reducing the number of colors required to show the image from 96,615 unique colors to 64, while preserving the overall appearance quality.


1 Answers

Downsample the image first, then run k-means.

If you resize the image to 1/2th in both x and y, it shouldn't affect colors much, but k-means should take at most 1/4th of the time. If you resample to 1/10 of the width and height, k-means should run 100 times faster.

https://en.wikipedia.org/wiki/Color_quantization

By downsampling the image, you have less "pixels" to process during clustering. But in the end, it should produce roughly the same color scheme.

Small summary of k-means:

  1. It maps each object (=pixel) to the nearest cluster center (= palette entry)
  2. It recomputes each palette entry to best represent the assigned points (= pixels)
  3. Repeat until nothing changes anymore.

So the real output is not an image or image regions. It's the palette.

You can then map an arbitrary image (including the full resolution version) to this color palette by simply replacing each pixel with the closest color!

Complexity and performance:

The complexity of k-means is O(n*k*i), where n is the number of pixels you have, k the desired number of output colors and i the number of iterations needed until convergence.

n: by downsampling, you can easily reduce n, the largest factor. In many situations, you can reduce this quite significantly before you see a degradation in performance.

k: this is your desired number of output colors. Whether you can reduce this or not depends on your actual use case.

i: various factors can have an effect on convergence (including both other factors!), but the strongest probably is having good starting values. So if you have a very fast but low quality method to choose the palette, run it first, then use k-means to refine this palette. Maybe OpenCV already includes an appropriate heuristic for this though!

You can see, the easiest approach is to reduce n. You can reduce n significantly, produce an optimized palette for the thumbnail, then rerun k-means on the full image refinining this palette. As - hopefully - this will reduce the number of iterations significantly, this can sometimes perform very well.

like image 78
Has QUIT--Anony-Mousse Avatar answered Oct 08 '22 20:10

Has QUIT--Anony-Mousse