 

What is the difference between K-means clustering and vector quantization?


They seem to be very similar.

I'm dealing with Hidden Markov Models and I need to extract symbols from feature vectors.

In order to extract symbols, should I use vector quantization or k-means clustering?

asked Jan 26 '12 by garak


People also ask

What is meant by vector quantization?

Vector quantization, also called "block quantization" or "pattern matching quantization" is often used in lossy data compression. It works by encoding values from a multidimensional vector space into a finite set of values from a discrete subspace of lower dimension.
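For instance, here is a minimal sketch of that encoding step, assuming NumPy and a hand-picked 2-D codebook (both illustrative, not from the original answer):

```python
import numpy as np

# A fixed, hand-picked codebook of 4 codewords in 2-D space.
codebook = np.array([[0.0, 0.0],
                     [1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])

def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codeword (Euclidean)."""
    # Pairwise squared distances, shape (n_vectors, n_codewords).
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

samples = np.array([[0.1, 0.2], [0.9, 0.1], [0.4, 0.8]])
symbols = quantize(samples, codebook)
print(symbols)  # [0 1 2]
```

Each input vector is replaced by the index of its nearest codeword, which is where the lossy compression comes from: a continuous vector becomes one small integer.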

Is K-means algorithm and k-means clustering same?

The k-means clustering algorithm computes centroids and iterates until it finds the optimal centroids. It assumes that the number of clusters is already known. It is also called a flat clustering algorithm. The 'K' in k-means represents the number of clusters the algorithm identifies from the data.

What is the difference between vector quantization and scalar quantization?

Vector quantization is a lossy compression technique used in speech and image coding. In scalar quantization, a scalar value is selected from a finite list of possible values to represent a sample.

What is k-means clustering in simple words?

K-means clustering aims to partition data into k clusters in such a way that data points in the same cluster are similar and data points in different clusters are farther apart. The similarity of two points is determined by the distance between them, and there are many methods to measure that distance.
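A bare-bones sketch of the two alternating steps behind that description (nearest-centroid assignment, then mean update), using only NumPy; the random initialization and stopping rule here are illustrative choices:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd iterations: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: squared distances, shape (n_points, k).
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Update step: mean of each cluster (keep old centroid if empty).
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels
```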


2 Answers

The way I understand it, K-means is one type of vector quantization.
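To make that concrete for the HMM use case in the question: fit k-means on the training feature vectors, treat the cluster centers as the codebook, and use each frame's cluster index as its discrete symbol. A hedged sketch, assuming scikit-learn (the 13-dimensional features and the codebook size of 32 are illustrative, not from the original post):

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in data: 500 frames of 13-D feature vectors.
features = np.random.rand(500, 13)

# Fit k-means; the 32 cluster centers act as the VQ codebook.
kmeans = KMeans(n_clusters=32, n_init=10, random_state=0).fit(features)

# predict() quantizes frames to codeword indices: integer symbols in [0, 32).
symbols = kmeans.predict(features)
print(symbols[:10])
```

The same predict() call quantizes unseen feature vectors at decoding time, so training and test data share one codebook.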

answered Oct 17 '22 by Rob Neuhaus


The K-means algorithm is the specialization of the celebrated "Lloyd I" quantization algorithm to the case of empirical distributions. (cf. Lloyd)

The Lloyd I algorithm is proved to yield a sequence of quantizers with decreasing quadratic distortion. However, except in the special case of one-dimensional log-concave distributions, it does not always converge to a quadratically optimal quantizer. (There are local minima of the quantization error, especially when dealing with empirical distributions, i.e., in the clustering problem.)

A method that (always) converges toward an optimal quantizer is the so-called CLVQ algorithm (Competitive Learning Vector Quantization), which also generalizes to the problem of more general L^p quantization. It is a kind of stochastic gradient method. (cf. Pagès)
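A rough sketch of one such competitive-learning update, rendered in NumPy under my own assumptions (the 1/t step schedule is one classic choice; this is not the exact procedure from Pagès):

```python
import numpy as np

def clvq(X, k, n_steps=10_000, seed=0):
    """CLVQ-style stochastic gradient sketch: at each step, draw one sample,
    find the closest codeword (the 'winner'), and move only that codeword
    toward the sample with a decreasing step size."""
    rng = np.random.default_rng(seed)
    codebook = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for t in range(1, n_steps + 1):
        x = X[rng.integers(len(X))]
        winner = ((codebook - x) ** 2).sum(axis=1).argmin()
        gamma = 1.0 / t  # decreasing step size, a common schedule
        codebook[winner] += gamma * (x - codebook[winner])
    return codebook
```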

There are also approaches based on genetic algorithms (cf. Hamida et al.), and classical optimization procedures for the one-dimensional case that converge faster (Pagès, Printems).

answered Oct 17 '22 by Quantize