I have a Python image processing function that tries to get the dominant color of an image. I make use of a function I found here: https://github.com/tarikd/python-kmeans-dominant-colors/blob/master/utils.py
It works, but unfortunately I don't quite understand what it does. I also learned that np.histogram is rather slow and that I should use cv2.calcHist instead, since it's 40x faster according to this: https://docs.opencv.org/trunk/d1/db7/tutorial_py_histogram_begins.html
I'd like to understand how I have to update the code to use cv2.calcHist, or better, which values I have to pass to it.
My function
def centroid_histogram(clt):
    # grab the number of different clusters and create a histogram
    # based on the number of pixels assigned to each cluster
    num_labels = np.arange(0, len(np.unique(clt.labels_)) + 1)
    (hist, _) = np.histogram(clt.labels_, bins=num_labels)
    # normalize the histogram, such that it sums to one
    hist = hist.astype("float")
    hist /= hist.sum()
    # return the histogram
    return hist
The pprint of clt is this, not sure if this helps:
KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
       n_clusters=1, n_init=10, n_jobs=1, precompute_distances='auto',
       random_state=None, tol=0.0001, verbose=0)
My code can be found here: https://github.com/primus852/python-movie-barcode
I am very much a beginner, so any help is highly appreciated.
As per request:
rgb(22,28,37)
0.021515369415283203s
Two approaches, using np.unique and np.bincount, can be suggested to get the most dominant color. The linked page also mentions np.bincount as a faster alternative, so that could be the way to go.
Approach #1
def unique_count_app(a):
    colors, count = np.unique(a.reshape(-1,a.shape[-1]), axis=0, return_counts=True)
    return colors[count.argmax()]
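To see what Approach #1 does step by step, here is a minimal sketch on a hand-made 2 x 2 image. The array `img` and its values are made up for illustration; they are not from the question.

```python
import numpy as np

# Hypothetical 2 x 2 RGB image: three red pixels and one blue pixel
img = np.array([[[255, 0, 0], [0, 0, 255]],
                [[255, 0, 0], [255, 0, 0]]], dtype=np.uint8)

# Flatten to one row per pixel: shape (4, 3)
flat = img.reshape(-1, img.shape[-1])

# Unique rows (sorted) and how often each row occurs
colors, counts = np.unique(flat, axis=0, return_counts=True)

# The row with the highest count is the dominant color
dominant = colors[counts.argmax()]
print(colors.tolist(), counts.tolist(), dominant.tolist())
```

Since np.unique has to sort all pixel rows, this is the slower of the two approaches on large images.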
Approach #2
def bincount_app(a):
    a2D = a.reshape(-1,a.shape[-1])
    col_range = (256, 256, 256) # generically : a2D.max(0)+1
    a1D = np.ravel_multi_index(a2D.T, col_range)
    return np.unravel_index(np.bincount(a1D).argmax(), col_range)
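The trick in Approach #2 is that every (R, G, B) triple is encoded as a single integer, so that np.bincount can count the colors without sorting. A small sketch of that encoding follows; the `pixels` array is a made-up example, and the manual formula is only there to show what np.ravel_multi_index computes for a (256, 256, 256) range.

```python
import numpy as np

# Hypothetical pixel list, one row per pixel
pixels = np.array([[4, 7, 2],
                   [0, 0, 1],
                   [4, 7, 2]])
col_range = (256, 256, 256)

# One integer code per pixel
codes = np.ravel_multi_index(pixels.T, col_range)

# The same codes written out by hand: r*256*256 + g*256 + b
manual = pixels[:, 0] * 256 * 256 + pixels[:, 1] * 256 + pixels[:, 2]

# Count the codes, take the most frequent one and decode it back to (r, g, b)
most_common = np.bincount(codes).argmax()
dominant = tuple(int(c) for c in np.unravel_index(most_common, col_range))
print(dominant)
```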
Verification and timings on a 1000 x 1000 color image with values in a dense range [0,9) for reproducible results -
In [28]: np.random.seed(0)
    ...: a = np.random.randint(0,9,(1000,1000,3))
    ...:
    ...: print(unique_count_app(a))
    ...: print(bincount_app(a))
[4 7 2]
(4, 7, 2)

In [29]: %timeit unique_count_app(a)
1 loop, best of 3: 820 ms per loop

In [30]: %timeit bincount_app(a)
100 loops, best of 3: 11.7 ms per loop
Further boost
A further boost is possible for large data by leveraging multiple cores with the numexpr module -
import numexpr as ne

def bincount_numexpr_app(a):
    a2D = a.reshape(-1,a.shape[-1])
    col_range = (256, 256, 256) # generically : a2D.max(0)+1
    # flat index for a (s0, s1, s2) range is a0*s1*s2 + a1*s2 + a2
    eval_params = {'a0':a2D[:,0],'a1':a2D[:,1],'a2':a2D[:,2],
                   's1':col_range[1],'s2':col_range[2]}
    a1D = ne.evaluate('a0*s1*s2+a1*s2+a2',eval_params)
    return np.unravel_index(np.bincount(a1D).argmax(), col_range)
Timings -
In [90]: np.random.seed(0)
    ...: a = np.random.randint(0,9,(1000,1000,3))

In [91]: %timeit unique_count_app(a)
    ...: %timeit bincount_app(a)
    ...: %timeit bincount_numexpr_app(a)
1 loop, best of 3: 843 ms per loop
100 loops, best of 3: 12 ms per loop
100 loops, best of 3: 8.94 ms per loop
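Coming back to the function in the question: the same np.bincount idea can be applied to the cluster labels directly. What follows is only a minimal sketch, and it swaps np.histogram for np.bincount rather than cv2.calcHist. It assumes the labels are non-negative integers in [0, n_clusters), which is what KMeans stores in labels_. The function name centroid_histogram_bincount and the sample labels are made up for illustration.

```python
import numpy as np

def centroid_histogram_bincount(labels, n_clusters):
    # count how many pixels were assigned to each cluster label;
    # minlength keeps a slot for clusters that received no pixels
    counts = np.bincount(labels, minlength=n_clusters)
    # normalize so that the histogram sums to one
    return counts.astype("float") / counts.sum()

# Hypothetical labels, standing in for clt.labels_ with 3 clusters
labels = np.array([0, 2, 2, 1, 2, 0])
hist = centroid_histogram_bincount(labels, 3)

# Cross-check against the np.histogram version from the question
ref, _ = np.histogram(labels, bins=np.arange(0, 4))
ref = ref.astype("float") / ref.sum()
print(hist, np.allclose(hist, ref))
```

With a fitted model this would be called as centroid_histogram_bincount(clt.labels_, clt.n_clusters).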