I am considering to use OpenCV's Kmeans implementation since it says to be faster...
Now I am using package cv2 and function kmeans,
I can not understand the parameters' description in their reference:
Python: cv2.kmeans(data, K, criteria, attempts, flags[, bestLabels[, centers]]) → retval, bestLabels, centers
samples – Floating-point matrix of input samples, one row per sample.
clusterCount – Number of clusters to split the set by.
labels – Input/output integer array that stores the cluster indices for every sample.
criteria – The algorithm termination criteria, that is, the maximum number of iterations and/or the desired accuracy. The accuracy is specified as criteria.epsilon. As soon as each of the cluster centers moves by less than criteria.epsilon on some iteration, the algorithm stops.
attempts – Flag to specify the number of times the algorithm is executed using different initial labelings. The algorithm returns the labels that yield the best compactness (see the last function parameter).
flags –
Flag that can take the following values:
KMEANS_RANDOM_CENTERS Select random initial centers in each attempt.
KMEANS_PP_CENTERS Use kmeans++ center initialization by Arthur and Vassilvitskii [Arthur2007].
KMEANS_USE_INITIAL_LABELS During the first (and possibly the only) attempt, use the user-supplied labels instead of computing them from the initial centers. For the second and further attempts, use the random or semi-random centers. Use one of KMEANS_*_CENTERS flag to specify the exact method.
centers – Output matrix of the cluster centers, one row per each cluster center.
what is the argument flags[, bestLabels[, centers]])
mean? and what about his one: → retval, bestLabels, centers
?
Here's my code:
import cv, cv2
import scipy.io
import numpy
# read data from .mat file
mat = scipy.io.loadmat('...')
keys = mat.keys()
values = mat.viewvalues()
data_1 = mat[keys[0]]
nRows = data_1.shape[1]
nCols = data_1.shape[0]
samples = cv.CreateMat(nRows, nCols, cv.CV_32FC1)
labels = cv.CreateMat(nRows, 1, cv.CV_32SC1)
centers = cv.CreateMat(nRows, 100, cv.CV_32FC1)
#centers = numpy.
for i in range(0, nCols):
for j in range(0, nRows):
samples[j, i] = data_1[i, j]
cv2.kmeans(data_1.transpose,
100,
criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_MAX_ITER, 0.1, 10),
attempts=cv2.KMEANS_PP_CENTERS,
flags=cv2.KMEANS_PP_CENTERS,
)
And I encounter such error:
flags=cv2.KMEANS_PP_CENTERS,
TypeError: <unknown> is not a numpy array
How should I understand the parameter list and the usage of cv2.kmeans? Thanks
the documentation on this function is almost impossible to find. I wrote the following Python code in a bit of a hurry, but it works on my machine. It generates two multi-variate Gaussian Distributions with different means and then classifies them using cv2.kmeans(). You may refer to this blog post to get some idea of the parameters.
Handle imports:
import cv
import cv2
import numpy as np
import numpy.random as r
Generate some random points and shape them appropriately:
samples = cv.CreateMat(50, 2, cv.CV_32FC1)
random_points = r.multivariate_normal((100,100), np.array([[150,400],[150,150]]), size=(25))
random_points_2 = r.multivariate_normal((300,300), np.array([[150,400],[150,150]]), size=(25))
samples_list = np.append(random_points, random_points_2).reshape(50,2)
random_points_list = np.array(samples_list, np.float32)
samples = cv.fromarray(random_points_list)
Plot the points before and after classification:
blank_image = np.zeros((400,400,3))
blank_image_classified = np.zeros((400,400,3))
for point in random_points_list:
cv2.circle(blank_image, (int(point[0]),int(point[1])), 1, (0,255,0),-1)
temp, classified_points, means = cv2.kmeans(data=np.asarray(samples), K=2, bestLabels=None,
criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_MAX_ITER, 1, 10), attempts=1,
flags=cv2.KMEANS_RANDOM_CENTERS) #Let OpenCV choose random centers for the clusters
for point, allocation in zip(random_points_list, classified_points):
if allocation == 0:
color = (255,0,0)
elif allocation == 1:
color = (0,0,255)
cv2.circle(blank_image_classified, (int(point[0]),int(point[1])), 1, color,-1)
cv2.imshow("Points", blank_image)
cv2.imshow("Points Classified", blank_image_classified)
cv2.waitKey()
Here you can see the original points:
Here are the points after they have been classified:
I hope that this answer may help you, it is not a complete guide to k-means, but it will at least show you how to pass the parameters to OpenCV.
The problem here is your data_1.transpose
is not a numpy array.
OpenCV 2.3.1 and higher python bindings do not take anything except numpy array
as image/array parameters. so, data_1.transpose
has to be a numpy array.
Generally, all the points in OpenCV are of type numpy.ndarray
eg.
array([[[100., 433.]],
[[157., 377.]],
.
.
[[147., 247.]], dtype=float32)
where each element of array is
array([[100., 433.]], dtype=float32)
and the element of that array is
array([100., 433.], dtype=float32)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With