Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how does data clustering help in image or pattern recognition

I have been playing around with different data clustering algorithms working on finding clusters between random data points represented an nodes, I keep reading that data clustering is used for image recognition. I am failing to make the connection, how does clustering data help in recognizing an image or in facial recognition. can someone explain this?

like image 995
anon Avatar asked Sep 17 '09 01:09

anon


People also ask

How clustering is used in pattern recognition?

Clustering methods simply try to group similar patterns into clusters whose members are more similar to each other (according to some distance measure) than to members of other clusters. There is no a priori knowledge of patterns that belong to certain groups, or even how many groups are appropriate.

How clustering is used in image processing?

Clustering is a powerful technique that has been reached in image segmentation. The cluster analysis is to partition an image data set into a number of disjoint groups or clusters. The clustering methods such as k means, improved k mean, fuzzy c mean (FCM) and improved fuzzy c mean algorithm (IFCM) have been proposed.

How clustering can be useful for image segmentation?

Subtractive clustering method is data clustering method where it generates the centroid based on the potential value of the data points. So subtractive cluster is used to generate the initial centers and these centers are used in k-means algorithm for the segmentation of image.

What is the purpose of data clustering?

The goal of clustering is to find distinct groups or “clusters” within a data set. Using a machine language algorithm, the tool creates groups where items in a similar group will, in general, have similar characteristics to each other.


1 Answers

It's no surprise that clustering is used for pattern recognition at large, and image recognition in particular: clustering is a reducing process, and images in this megapixel era need boiling down... It is also a process which produces categories and that is of course useful.

However there are many approaches to the use of clustering as a technique for image recognition. One of the reasons for this diversity is that clustering can be applied at different level, for different purposes: from basic pixel level to feature level (feature be a line, a geometric figure...), for classification or for other purposes.

At a very high level, clustering is a statistical tool, it helps discovering the relative importance of various dimensions in defining the belonging of particular item to a particular category.

One [of many] usage[s] of such a tool, is with supervised learning, whereby a set of human-selected items (say images) are fed into the cluster-based logic, along with a label associated with a particular item ("this is an apple", "this is another apple", "this is a lemon"...), the clustering logic then determines how much each dimension of the input matters for helping each group of items (apples, lemons...) fit in a distinct cluster (for example the color may matter relatively little, but the shape, or the presence of dots, or whatever may matter a lot). After this training phase, new images can be fed to the logic and by seeing how close to a particular cluster this image falls, it is "recognized" (as a banana!).

When it comes to image processing one needs to remember that whatever is "fed" to the clustering logic is not necessarily (in fact, rarely) the raw pixels, but various "objects" characterizing various "elements" of the original data (essentially a collection of relatively high dimension vectors, not unlike some that one may have encountered in other other data clustering examples), and produced by previous stages of the process. For example a important element of facial recognition is probably the exact distance between the center of the eyes. In previous stages, the image is processed in a way that figures out where the eyes are (possibly relying on another clustering-based logic). Then the distance between the eyes, along with many other elements are fed to the final clustering logic.

The preceding description is only one example of the use of clustering for image recognition. Indeed, various forms of neural networks have been used, very successfully, in this domain, and it can be argued that in a sense these neural networks are clustering information. One of the reasons for the success of neural nets may lie in their ability to be more respectful of the locality dimension as found in the original input, and also their ability to work in a hierarchical fashion.

A good conclusion to this write up would be a short list of online resources, but I'm pressed for time at the moment... "to be continued" ;-)

Next day edit: (failed attempt to provide an introductory online bibliography on the subject)

My search for literature on the topic of clustering as applied to artificial vision and image processing revealed two distinct... clusters ;-)

  • Books such as Algorithms for image processing and computer vision J Parkey pub Wiley, or Machine Vision : Theory, Algorithms, Practicalities M Seul et. Al Cambridge UP. Such books generally cover the all important techniques associated with noise reduction, Edge detection, Color or intensity conversion, and many other elements of the image processing chain, most of which do not involve clustering or even statistical methods, and they reserve only a chapter or two, or even minor mentions, to clustering, as applied to pattern recognition or to other tasks.
  • Scholarly papers and conference handbooks, which specifically cover clustering techniques applied to artificial vision and such, but in the narrowest and deepest fashion (ex: Variations on the Fukunaga and Narendra algorithm, for applications in character recognition, or Fast methods for selections of Nearest Neighbor candidates in whatever context.)

In short I feel ill equipped to make any specific book or article suggestion.

You may find it informative to browse titles in say Google books, keying in by "Artificial vision" or "Image Recognition" or some or the titles mentioned above. With the preview feature and also the tag cloud (btw another application of clustering) found in the "about this book" link, one can get a good idea of the various books contents and maybe decide to purchase some of them. Unfortunately the reduced readership and the potentially lucrative applications in the field make these books relatively expensive. At the other end of the spectrum, you may download, sometimes for free, research papers discussing advanced topics in the field. These will also show up on regular (web) Google, or at specialized repositories such as CiteSeer.

Good luck with your exploration in that field!

like image 94
mjv Avatar answered Sep 29 '22 11:09

mjv